io_u: speed up small_content_scramble()
This is a hot path for write workloads, since we don't want to send the
same buffers to the device again and again. The idea is to defeat basic
dedupe/compression by slightly modifying the buffer for each write.
small_content_scramble() does this by writing the io_u offset at a
random spot in each 512b chunk of an io buffer, and writing the start
time (sec,nsec) at the end of each 512b chunk.
With this change, we still do those two things, but we pick a random
cacheline within each 512b chunk, and fill the offset at the beginning
of the cacheline, and the time at the end of it. This means that
instead of potentially dirtying 2 cachelines for each 512b chunk in an
IO buffer, we dirty just 1.
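As a rough illustration of the scheme described above (not the actual
patch), the per-chunk work could look like the sketch below. It assumes
a 64-byte cacheline and a cacheline-aligned buffer. The names
scramble_sketch(), struct stamp and next_rand() are hypothetical
stand-ins for illustration only, and the xorshift generator is a
placeholder for whatever source of randomness the real code uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE      512	/* scramble granularity from the description */
#define CACHELINE_SIZE  64	/* assumption: 64-byte cachelines */
#define LINES_PER_CHUNK (CHUNK_SIZE / CACHELINE_SIZE)

/* Hypothetical stand-in for the io_u start time (sec, nsec). */
struct stamp {
	uint32_t sec;
	uint32_t nsec;
};

/* Placeholder xorshift64 generator; *state must be non-zero. */
static uint64_t next_rand(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * For each full 512b chunk, pick one cacheline, store the offset at its
 * start and the time stamp at its end. Only that one cacheline is
 * written, provided buf is cacheline-aligned.
 */
static void scramble_sketch(unsigned char *buf, size_t len, uint64_t offset,
			    struct stamp ts, uint64_t *rng)
{
	size_t i, nr_chunks = len / CHUNK_SIZE;

	for (i = 0; i < nr_chunks; i++) {
		unsigned char *chunk = buf + i * CHUNK_SIZE;
		size_t idx = next_rand(rng) % LINES_PER_CHUNK;
		unsigned char *line = chunk + idx * CACHELINE_SIZE;

		/* memcpy avoids alignment and aliasing problems */
		memcpy(line, &offset, sizeof(offset));
		memcpy(line + CACHELINE_SIZE - sizeof(ts), &ts, sizeof(ts));
	}
}
```

Both stores land inside the same 64-byte line, so the number of
cachelines touched per chunk no longer depends on where the random
spot happens to fall.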
The results should still be random enough that small_content_scramble()
fulfills the promise to defeat basic dedupe and compression, but it is
lighter to run.
Signed-off-by: Jens Axboe <axboe@kernel.dk>