- Mixed buffers of dedupe-able and compressible data.
  A major use case in performance benchmarking of storage subsystems.

- Shifted dedupe-able data.
  Allow dedupe buffer generation to shift contents by a random number
  of sectors (filling the gaps with incompressible data). Some storage
  subsystems have updated their deduplication detection algorithms to
  look for shifted data as well. For example, some databases prepend a
  timestamp to written blocks, which leaves the underlying data
  dedupe-able at a different alignment. FIO should be able to simulate
  such a workload.

- Generation of similar (but not identical) data.
  A rising trend in enterprise storage systems.
  Generation of "similar" data means random incompressible buffers
  that differ from each other by a few (a configurable number of) bits.
  The storage subsystem usually identifies the similar buffers using
  locality-sensitive hashing or other methods.
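
The shifted and similar buffer ideas above can be illustrated with a minimal
sketch. This is not FIO code (FIO is written in C) and does not reflect any
existing FIO option. The function names, the 512-byte sector size, and the
choice to flip distinct bit positions are all assumptions made for
illustration only.

```python
import random

SECTOR = 512  # assumed sector size in bytes (hypothetical, for illustration)


def random_bytes(rng: random.Random, n: int) -> bytes:
    """Return n pseudo-random bytes, standing in for incompressible filler."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def shifted_dedupe_buffer(payload: bytes, block_size: int, rng: random.Random):
    """Place a dedupe-able payload at a random sector offset inside a block.

    The gaps before and after the payload are filled with random data.
    Returns the block and the shift (in sectors) that was chosen.
    """
    if len(payload) % SECTOR or block_size % SECTOR:
        raise ValueError("payload and block size must be sector multiples")
    if len(payload) > block_size:
        raise ValueError("payload does not fit in the block")
    max_shift = (block_size - len(payload)) // SECTOR
    shift = rng.randint(0, max_shift)  # inclusive on both ends
    start = shift * SECTOR
    head = random_bytes(rng, start)
    tail = random_bytes(rng, block_size - start - len(payload))
    return head + payload + tail, shift


def similar_buffer(base: bytes, flip_bits: int, rng: random.Random) -> bytes:
    """Return a copy of base that differs from it in exactly flip_bits bits."""
    buf = bytearray(base)
    # sample() picks distinct bit positions, so no flip cancels another.
    for pos in rng.sample(range(len(buf) * 8), flip_bits):
        buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)


if __name__ == "__main__":
    rng = random.Random(1234)  # fixed seed so runs are repeatable
    payload = random_bytes(rng, 2 * SECTOR)
    block, shift = shifted_dedupe_buffer(payload, 8 * SECTOR, rng)
    print("payload placed at sector", shift, "of", len(block) // SECTOR)
    near = similar_buffer(block, 4, rng)
    diff = sum(bin(a ^ b).count("1") for a, b in zip(block, near))
    print("bits that differ:", diff)
```

A real implementation would also need to decide how the shift and the number
of flipped bits are exposed as job options; the sketch leaves that open.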