path: root/dedupe.c
Age | Commit message | Author
2023-03-03 | Refactor for_each_td() to catch inappropriate td ptr reuse | Horshack
I recently introduced a bug caused by reusing a struct thread_data *td after the end of a for_each_td() loop construct. Link:

To prevent others from making this same mistake, this commit refactors for_each_td() so that both the struct thread_data * and the loop index variable are placed inside their own scope for the loop. This will cause any reference to those variables outside the for_each_td() to produce an undeclared identifier error, provided the outer scope doesn't already reuse those same variable names for other code within the routine (which is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types within the scope of a for() loop initializer, creating a scope for both struct thread_data * and the loop index required explicitly declaring a scope with a curly brace. This means for_each_td() includes an opening curly brace to create the scope, which means all uses of for_each_td() must now end with an invocation of a new macro named end_for_each() to emit an ending curly brace to match the scope brace created by for_each_td():

    for_each_td(td) {
        while (td->runstate < TD_EXITED)
            sleep(1);
    } end_for_each();

The alternative is to end every for_each_td() construct with an inline curly brace, which is off-putting since the implementation of an extra opening curly brace is abstracted in for_each_td():

    for_each_td(td) {
        while (td->runstate < TD_EXITED)
            sleep(1);
    }}

Most fio logic only declares "struct thread_data *td" and "int i" for use in for_each_td(), which means those declarations will now cause -Wunused-variable warnings since they're not used outside the scope of the refactored for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status() that accesses the ending value of struct thread_data *td after the end of for_each_td(), now manifesting as a compile error, so working as designed :)

Signed-off-by: Adam Horshack (
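The scoping technique described in this commit can be illustrated with a minimal sketch. This is not fio's actual macro: the toy struct thread_data, the fixed-size threads array, NR_THREADS, the TD_EXITED value, the td_index_ name, and count_running() are all assumptions made up for the example. It only shows how an opening brace inside the macro confines both the pointer and the index to the loop.

```c
#include <stddef.h>

/* Toy stand-in for fio's struct thread_data; only the field used below. */
struct thread_data {
    int runstate;
};

#define TD_EXITED  3  /* hypothetical value, for illustration only */
#define NR_THREADS 4  /* hypothetical fixed thread count */

static struct thread_data threads[NR_THREADS];

/*
 * Opens a new scope, declares the index and the pointer inside it,
 * and then starts the for() loop. The caller supplies the loop body.
 */
#define for_each_td(td)                                 \
{                                                       \
    int td_index_;                                      \
    struct thread_data *(td);                           \
    for (td_index_ = 0, (td) = &threads[0];             \
         td_index_ < NR_THREADS; td_index_++, (td)++)

/* Closes the scope opened by for_each_td(). */
#define end_for_each() }

/* Count the threads that have not yet reached TD_EXITED. */
static int count_running(void)
{
    int running = 0;

    for_each_td(td) {
        if (td->runstate < TD_EXITED)
            running++;
    } end_for_each();

    /* Using td here would not compile: it is undeclared in this scope. */
    return running;
}
```

Because td is declared inside the brace that the macro opens, the function needs no "struct thread_data *td" declaration of its own, which matches the removal of those declarations described above.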
2022-04-27 | Introducing support for generation of dedup buffers | Bar David
across jobs. The dedup buffers are spread evenly between the jobs that enabled the dedupe_global option.

Note that only dedupe_mode=working_set is supported.
Note that compression is supported with global dedup enabled.

Signed-off-by: Bar David <>
2021-11-21 | Mixed dedup and compression | Bar David
Introducing support for dedupe and compression on the same job. When used together, compression is calculated from unique capacity. E.g. when using dedupe_percentage=50 and buffer_compress_percentage=50, the total reduction should be 75%: 50% is deduped, and compression removes 50% of the remaining unique capacity.

Signed-off-by: Bar David <>
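The arithmetic in that example can be written out as a small helper. This is a sketch of the calculation only, not code from fio; the function name total_reduction_pct() is hypothetical, and it assumes both options are given as percentages between 0 and 100.

```c
#include <assert.h>

/*
 * Expected total data reduction, in percent, when dedupe and compression
 * are combined and compression applies only to the unique (non-deduped)
 * capacity. Hypothetical helper, for illustration only.
 */
static double total_reduction_pct(double dedupe_pct, double compress_pct)
{
    assert(dedupe_pct >= 0.0 && dedupe_pct <= 100.0);
    assert(compress_pct >= 0.0 && compress_pct <= 100.0);

    /* Share of the data that survives dedupe. */
    double unique_pct = 100.0 - dedupe_pct;

    /* Dedupe savings plus compression savings on the unique share. */
    return dedupe_pct + unique_pct * compress_pct / 100.0;
}
```

With dedupe_percentage=50 and buffer_compress_percentage=50 this gives 50 + 50 * 50 / 100 = 75, matching the figure in the commit message.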
2021-07-15 | dedupe: allow to generate dedupe buffers from working set | Bar David
This commit introduces a new dedupe generation mode, "working_set". Working set mode simulates a more realistic approach to deduped data, in which deduped buffers are generated from a pre-existing working set whose size is a percentage of the device or file. In other words, a deduped buffer is not usually expected to be close in time to its source buffer, and the source buffers usually make up a small subset of the entire file or device.

Signed-off-by: Bar David <>
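One way to picture the working-set idea is the sketch below. It is an illustration under stated assumptions, not fio's implementation: struct working_set, working_set_init(), next_buffer_seed(), the use of rand(), and the seed layout are all hypothetical. The assumption is that each buffer's content is derived from a seed, so reusing a seed from the working set yields a duplicate of an earlier buffer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical working set: one seed per block that can act as a dedupe source. */
struct working_set {
    uint64_t *seeds;
    size_t    nr_seeds;
};

/* Build a working set covering ws_pct percent of the file's blocks. */
static int working_set_init(struct working_set *ws, uint64_t file_blocks,
                            unsigned ws_pct)
{
    ws->nr_seeds = (size_t)(file_blocks * ws_pct / 100);
    if (ws->nr_seeds == 0)
        ws->nr_seeds = 1;

    ws->seeds = malloc(ws->nr_seeds * sizeof(*ws->seeds));
    if (!ws->seeds)
        return -1;

    /* Placeholder seeds 1..nr_seeds; a real generator would use random seeds. */
    for (size_t i = 0; i < ws->nr_seeds; i++)
        ws->seeds[i] = (uint64_t)i + 1;
    return 0;
}

/*
 * Pick the seed for the next buffer. With probability dedupe_pct percent
 * the seed comes from the working set (a duplicate of some earlier source
 * buffer); otherwise a fresh seed is returned. Fresh seeds have the top
 * bit set so they never collide with the small working-set seeds.
 */
static uint64_t next_buffer_seed(const struct working_set *ws,
                                 unsigned dedupe_pct, uint64_t *unique_counter)
{
    if ((unsigned)(rand() % 100) < dedupe_pct)
        return ws->seeds[(size_t)rand() % ws->nr_seeds];

    return (UINT64_C(1) << 63) | (*unique_counter)++;
}
```

Because the source is chosen anywhere in the working set rather than from the most recently written buffer, duplicates are not tied in time to their source, which is the behaviour the commit message describes.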