compiler: clang
osx_image: xcode8.3
env: BUILD_ARCH="x86_64"
+ - os: osx
+ compiler: clang
+ osx_image: xcode9.4
+ env: BUILD_ARCH="x86_64"
exclude:
- os: osx
compiler: gcc
.. option:: --aux-path=path
- Use this `path` for fio state generated files.
+ Use the directory specified by `path` for generated state files instead
+ of the current working directory.
Any parameters following the options will be assumed to be job files, unless
they match a job file parameter. Multiple job files can be listed and each job
assigned equally distributed to job clones created by :option:`numjobs` as
long as they are using generated filenames. If specific `filename(s)` are
set fio will use the first listed directory, and thereby matching the
- `filename` semantic which generates a file each clone if not specified, but
- let all clones use the same if set.
+ `filename` semantic (which generates a file for each clone if not
+ specified, but lets all clones use the same file if set).
See the :option:`filename` option for information on how to escape "``:``" and
"``\``" characters within the directory path itself.
+ Note: To control the directory fio will use for internal state files,
+ use :option:`--aux-path`.
+
.. option:: filename=str
Fio normally makes up a `filename` based on the job name, thread number, and
Unlink job files after each iteration or loop. Default: false.
-.. option:: zonesize=int
+.. option:: zonerange=int
- Divide a file into zones of the specified size. See :option:`zoneskip`.
+ Size of a single zone in which I/O occurs. See also :option:`zonesize`
+ and :option:`zoneskip`.
-.. option:: zonerange=int
+.. option:: zonesize=int
- Give size of an I/O zone. See :option:`zoneskip`.
+ Number of bytes to transfer before skipping :option:`zoneskip`
+ bytes. If this parameter is smaller than :option:`zonerange` then only
+ a fraction of each zone with :option:`zonerange` bytes will be
+ accessed. If this parameter is larger than :option:`zonerange` then
+ each zone will be accessed multiple times before skipping to the next zone.
.. option:: zoneskip=int
- Skip the specified number of bytes when :option:`zonesize` data has been
- read. The two zone options can be used to only do I/O on zones of a file.
+ Skip the specified number of bytes when :option:`zonesize` data have
+ been transferred. The three zone options can be used to do strided I/O
+ on a file.
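The interplay of the three options can be sketched with a hypothetical job file (device path and sizes invented for illustration) that reads the first 4M of every 256M zone:

```ini
; Strided read: zones of 256M, of which only the first 4M are read.
; zoneskip = zonerange - zonesize aligns each skip to the next zone.
[strided-read]
filename=/dev/sdX    ; placeholder target
rw=read
bs=64k
zonerange=256m       ; size of a single zone
zonesize=4m          ; bytes transferred within each zone
zoneskip=252m        ; bytes skipped once zonesize has been transferred
```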
I/O type
(RBD) via librbd without the need to use the kernel rbd driver. This
ioengine defines engine specific options.
+ **http**
+ I/O engine supporting GET/PUT requests over HTTP(S) with libcurl to
+ a WebDAV or S3 endpoint. This ioengine defines engine specific options.
+
+ This engine only supports direct IO with an iodepth of 1; concurrency
+ needs to be scaled via numjobs. blocksize defines the size of the
+ objects to be created.
+
+ TRIM is translated to object deletion.
+
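A minimal job for this engine might look as follows (hostname, path, and sizes are invented for illustration; the engine names each object after the file name, offset, and block length):

```ini
; Hypothetical WebDAV smoke test for the http engine.
[http-put]
ioengine=http
rw=write
bs=4k
size=16m
iodepth=1        ; the engine only supports an iodepth of 1 ...
numjobs=4        ; ... so concurrency is scaled with numjobs
http_host=localhost
http_mode=webdav
filename=/dav/fio-test
```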
**gfapi**
Using GlusterFS libgfapi sync interface to direct access to
GlusterFS volumes without having to go through FUSE. This ioengine
mounted with DAX on a persistent memory device through the PMDK
libpmem library.
+ **ime_psync**
+ Synchronous read and write using DDN's Infinite Memory Engine (IME).
+ This engine is very basic and issues calls to IME whenever an IO is
+ queued.
+
+ **ime_psyncv**
+ Synchronous read and write using DDN's Infinite Memory Engine (IME).
+ This engine uses iovecs and will try to stack as many I/Os as possible
+ (if the IOs are "contiguous" and the IO depth is not exceeded)
+ before issuing a call to IME.
+
+ **ime_aio**
+ Asynchronous read and write using DDN's Infinite Memory Engine (IME).
+ This engine will try to stack as many I/Os as possible by creating
+ requests for IME. FIO will then decide when to commit these requests.
+
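As a sketch, a job file exercising one of these engines could look like this (the mount point path is a made-up example; IME must be installed and fio built with ``--with-ime``):

```ini
; Hypothetical synchronous write test against an IME-backed file.
[ime-write]
ioengine=ime_psync
rw=write
bs=1m
size=1g
filename=/ime/fio-testfile   ; placeholder IME mount point path
```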
I/O engine specific parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
transferred to the device. The writefua option is ignored with this
selection.
+.. option:: http_host=str : [http]
+
+ Hostname to connect to. For S3, this could be the bucket hostname.
+ Default is **localhost**
+
+.. option:: http_user=str : [http]
+
+ Username for HTTP authentication.
+
+.. option:: http_pass=str : [http]
+
+ Password for HTTP authentication.
+
+.. option:: https=str : [http]
+
+ Enable HTTPS instead of HTTP. *on* enables HTTPS; *insecure*
+ will enable HTTPS, but disable SSL peer verification (use with
+ caution!). Default is **off**
+
+.. option:: http_mode=str : [http]
+
+ Which HTTP access mode to use: *webdav*, *swift*, or *s3*.
+ Default is **webdav**
+
+.. option:: http_s3_region=str : [http]
+
+ The S3 region/zone string.
+ Default is **us-east-1**
+
+.. option:: http_s3_key=str : [http]
+
+ The S3 secret key.
+
+.. option:: http_s3_keyid=str : [http]
+
+ The S3 key/access id.
+
+.. option:: http_swift_auth_token=str : [http]
+
+ The Swift auth token. See the example configuration file on how
+ to retrieve this.
+
+.. option:: http_verbose=int : [http]
+
+ Enable verbose requests from libcurl. Useful for debugging. 1
+ turns on verbose logging from libcurl, 2 additionally enables
+ HTTP IO tracing. Default is **0**
+
I/O depth
~~~~~~~~~
.. option:: write_iops_log=str
Same as :option:`write_bw_log`, but writes an IOPS file (e.g.
- :file:`name_iops.x.log`) instead. See :option:`write_bw_log` for
- details about the filename format and `Log File Formats`_ for how data
- is structured within the file.
+ :file:`name_iops.x.log`) instead. Because fio defaults to individual
+ I/O logging, the value entry in the IOPS log will be 1 unless windowed
+ logging (see :option:`log_avg_msec`) has been enabled. See
+ :option:`write_bw_log` for details about the filename format and `Log
+ File Formats`_ for how data is structured within the file.
.. option:: log_avg_msec=int
**2**
I/O is a TRIM
-The entry's *block size* is always in bytes. The *offset* is the offset, in bytes,
-from the start of the file, for that particular I/O. The logging of the offset can be
+The entry's *block size* is always in bytes. The *offset* is the position in bytes
+from the start of the file for that particular I/O. The logging of the offset can be
toggled with :option:`log_offset`.
-Fio defaults to logging every individual I/O. When IOPS are logged for individual
-I/Os the *value* entry will always be 1. If windowed logging is enabled through
-:option:`log_avg_msec`, fio logs the average values over the specified period of time.
-If windowed logging is enabled and :option:`log_max_value` is set, then fio logs
-maximum values in that window instead of averages. Since *data direction*, *block
-size* and *offset* are per-I/O values, if windowed logging is enabled they
-aren't applicable and will be 0.
+Fio defaults to logging every individual I/O, but when windowed logging is set
+through :option:`log_avg_msec`, either the average (by default) or the maximum
+(if :option:`log_max_value` is set) *value* seen over the specified period of time
+is recorded. Each *data direction* seen within the window period will aggregate
+its values in a separate row. Further, when using windowed logging the *block
+size* and *offset* entries will always contain 0.
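As an illustration (timestamps and values invented, :option:`log_offset` assumed enabled), a per-I/O IOPS log entry versus a windowed one might look like:

```
# per-I/O logging: time (msec), value, data direction, block size, offset
12, 1, 0, 4096, 65536
# with log_avg_msec=1000: value is the window average; block size and
# offset are 0
1000, 8210, 0, 0, 0
```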
Client/Server
-------------
gettime-thread.c helpers.c json.c idletime.c td_error.c \
profiles/tiobench.c profiles/act.c io_u_queue.c filelock.c \
workqueue.c rate-submit.c optgroup.c helper_thread.c \
- steadystate.c
+ steadystate.c zone-dist.c
ifdef CONFIG_LIBHDFS
HDFSFLAGS= -I $(JAVA_HOME)/include -I $(JAVA_HOME)/include/linux -I $(FIO_LIBHDFS_INCLUDE)
ifdef CONFIG_RBD
SOURCE += engines/rbd.c
endif
+ifdef CONFIG_HTTP
+ SOURCE += engines/http.c
+endif
SOURCE += oslib/asprintf.c
ifndef CONFIG_STRSEP
SOURCE += oslib/strsep.c
ifdef CONFIG_LIBPMEM
SOURCE += engines/libpmem.c
endif
+ifdef CONFIG_IME
+ SOURCE += engines/ime.c
+endif
ifeq ($(CONFIG_TARGET_OS), Linux)
SOURCE += diskutil.c fifo.c blktrace.c cgroup.c trim.c engines/sg.c \
#if defined (__ARM_ARCH_4__) || defined (__ARM_ARCH_4T__) \
|| defined (__ARM_ARCH_5__) || defined (__ARM_ARCH_5T__) || defined (__ARM_ARCH_5E__)\
|| defined (__ARM_ARCH_5TE__) || defined (__ARM_ARCH_5TEJ__) \
- || defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__)
+ || defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
+ || defined(__ARM_ARCH_6KZ__)
#define nop __asm__ __volatile__("mov\tr0,r0\t@ nop\n\t")
#define read_barrier() __asm__ __volatile__ ("" : : : "memory")
#define write_barrier() __asm__ __volatile__ ("" : : : "memory")
#include "rate-submit.h"
#include "helper_thread.h"
#include "pshared.h"
+#include "zone-dist.h"
static struct fio_sem *startup_sem;
static struct flist_head *cgroup_list;
goto err;
}
+ td_zone_gen_index(td);
+
/*
* Do this early, we don't want the compress threads to be limited
* to the same CPUs as the IO workers. So do this before we set
close_ioengine(td);
cgroup_shutdown(td, cgroup_mnt);
verify_free_state(td);
-
- if (td->zone_state_index) {
- int i;
-
- for (i = 0; i < DDIR_RWDIR_CNT; i++)
- free(td->zone_state_index[i]);
- free(td->zone_state_index);
- td->zone_state_index = NULL;
- }
+ td_zone_free_index(td);
if (fio_option_is_set(o, cpumask)) {
ret = fio_cpuset_exit(&o->cpumask);
fi
TMPC="${TMPDIR1}/fio-conf-${RANDOM}-$$-${RANDOM}.c"
+TMPC2="${TMPDIR1}/fio-conf-${RANDOM}-$$-${RANDOM}-2.c"
TMPO="${TMPDIR1}/fio-conf-${RANDOM}-$$-${RANDOM}.o"
TMPE="${TMPDIR1}/fio-conf-${RANDOM}-$$-${RANDOM}.exe"
# NB: do not call "exit" in the trap handler; this is buggy with some shells;
# see <1285349658-3122-1-git-send-email-loic.minier@linaro.org>
-trap "rm -f $TMPC $TMPO $TMPE" EXIT INT QUIT TERM
+trap "rm -f $TMPC $TMPC2 $TMPO $TMPE" EXIT INT QUIT TERM
rm -rf config.log
;;
--disable-rbd) disable_rbd="yes"
;;
+ --disable-http) disable_http="yes"
+ ;;
--disable-gfapi) disable_gfapi="yes"
;;
--enable-libhdfs) libhdfs="yes"
;;
--disable-native) disable_native="yes"
;;
+ --with-ime=*) ime_path="$optarg"
+ ;;
--help)
show_help="yes"
;;
echo "--disable-optimizations Don't enable compiler optimizations"
echo "--enable-cuda Enable GPUDirect RDMA support"
echo "--disable-native Don't build for native host"
+ echo "--with-ime= Install path for DDN's Infinite Memory Engine"
exit $exit_val
fi
fi
print_config "IPv6 helpers" "$ipv6"
+##########################################
+# check for http
+if test "$http" != "yes" ; then
+ http="no"
+fi
+# check for openssl >= 1.1.0, which uses an opaque HMAC_CTX pointer
+cat > $TMPC << EOF
+#include <curl/curl.h>
+#include <openssl/hmac.h>
+
+int main(int argc, char **argv)
+{
+ CURL *curl;
+ HMAC_CTX *ctx;
+
+ curl = curl_easy_init();
+ curl_easy_cleanup(curl);
+
+ ctx = HMAC_CTX_new();
+ HMAC_CTX_reset(ctx);
+ HMAC_CTX_free(ctx);
+ return 0;
+}
+EOF
+# openssl < 1.1.0 uses the HMAC_CTX type directly
+cat > $TMPC2 << EOF
+#include <curl/curl.h>
+#include <openssl/hmac.h>
+
+int main(int argc, char **argv)
+{
+ CURL *curl;
+ HMAC_CTX ctx;
+
+ curl = curl_easy_init();
+ curl_easy_cleanup(curl);
+
+ HMAC_CTX_init(&ctx);
+ HMAC_CTX_cleanup(&ctx);
+ return 0;
+}
+EOF
+if test "$disable_http" != "yes"; then
+ HTTP_LIBS="-lcurl -lssl -lcrypto"
+ if compile_prog "" "$HTTP_LIBS" "curl-new-ssl"; then
+ output_sym "CONFIG_HAVE_OPAQUE_HMAC_CTX"
+ http="yes"
+ LIBS="$HTTP_LIBS $LIBS"
+ elif mv $TMPC2 $TMPC && compile_prog "" "$HTTP_LIBS" "curl-old-ssl"; then
+ http="yes"
+ LIBS="$HTTP_LIBS $LIBS"
+ fi
+fi
+print_config "http engine" "$http"
+
##########################################
# check for rados
if test "$rados" != "yes" ; then
# Report whether libpmem engine is enabled
print_config "PMDK libpmem engine" "$pmem"
+##########################################
+# Check whether we support DDN's IME
+if test "$libime" != "yes" ; then
+ libime="no"
+fi
+cat > $TMPC << EOF
+#include <ime_native.h>
+int main(int argc, char **argv)
+{
+ int rc;
+ ime_native_init();
+ rc = ime_native_finalize();
+ return rc;
+}
+EOF
+if compile_prog "-I${ime_path}/include" "-L${ime_path}/lib -lim_client" "libime"; then
+ libime="yes"
+ CFLAGS="-I${ime_path}/include $CFLAGS"
+ LDFLAGS="-Wl,-rpath ${ime_path}/lib -L${ime_path}/lib $LDFLAGS"
+ LIBS="-lim_client $LIBS"
+fi
+print_config "DDN's Infinite Memory Engine" "$libime"
+
##########################################
# Check if we have lex/yacc available
yacc="no"
if test "$ipv6" = "yes" ; then
output_sym "CONFIG_IPV6"
fi
+if test "$http" = "yes" ; then
+ output_sym "CONFIG_HTTP"
+fi
if test "$rados" = "yes" ; then
output_sym "CONFIG_RADOS"
fi
if test "$pmem" = "yes" ; then
output_sym "CONFIG_LIBPMEM"
fi
+if test "$libime" = "yes" ; then
+ output_sym "CONFIG_IME"
+fi
if test "$arith" = "yes" ; then
output_sym "CONFIG_ARITHMETIC"
if test "$yacc_is_bison" = "yes" ; then
--- /dev/null
+/*
+ * HTTP GET/PUT IO engine
+ *
+ * IO engine to perform HTTP(S) GET/PUT requests via libcurl-easy.
+ *
+ * Copyright (C) 2018 SUSE LLC
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License,
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the Free
+ * Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
+ * Boston, MA 02110-1301, USA.
+ */
+
+#include <pthread.h>
+#include <time.h>
+#include <curl/curl.h>
+#include <openssl/hmac.h>
+#include <openssl/sha.h>
+#include <openssl/md5.h>
+#include "fio.h"
+#include "../optgroup.h"
+
+
+enum {
+ FIO_HTTP_WEBDAV = 0,
+ FIO_HTTP_S3 = 1,
+ FIO_HTTP_SWIFT = 2,
+
+ FIO_HTTPS_OFF = 0,
+ FIO_HTTPS_ON = 1,
+ FIO_HTTPS_INSECURE = 2,
+};
+
+struct http_data {
+ CURL *curl;
+};
+
+struct http_options {
+ void *pad;
+ unsigned int https;
+ char *host;
+ char *user;
+ char *pass;
+ char *s3_key;
+ char *s3_keyid;
+ char *s3_region;
+ char *swift_auth_token;
+ int verbose;
+ unsigned int mode;
+};
+
+struct http_curl_stream {
+ char *buf;
+ size_t pos;
+ size_t max;
+};
+
+static struct fio_option options[] = {
+ {
+ .name = "https",
+ .lname = "https",
+ .type = FIO_OPT_STR,
+ .help = "Enable https",
+ .off1 = offsetof(struct http_options, https),
+ .def = "off",
+ .posval = {
+ { .ival = "off",
+ .oval = FIO_HTTPS_OFF,
+ .help = "No HTTPS",
+ },
+ { .ival = "on",
+ .oval = FIO_HTTPS_ON,
+ .help = "Enable HTTPS",
+ },
+ { .ival = "insecure",
+ .oval = FIO_HTTPS_INSECURE,
+ .help = "Enable HTTPS, disable peer verification",
+ },
+ },
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_host",
+ .lname = "http_host",
+ .type = FIO_OPT_STR_STORE,
+ .help = "Hostname (S3 bucket)",
+ .off1 = offsetof(struct http_options, host),
+ .def = "localhost",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_user",
+ .lname = "http_user",
+ .type = FIO_OPT_STR_STORE,
+ .help = "HTTP user name",
+ .off1 = offsetof(struct http_options, user),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_pass",
+ .lname = "http_pass",
+ .type = FIO_OPT_STR_STORE,
+ .help = "HTTP password",
+ .off1 = offsetof(struct http_options, pass),
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_s3_key",
+ .lname = "S3 secret key",
+ .type = FIO_OPT_STR_STORE,
+ .help = "S3 secret key",
+ .off1 = offsetof(struct http_options, s3_key),
+ .def = "",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_s3_keyid",
+ .lname = "S3 key id",
+ .type = FIO_OPT_STR_STORE,
+ .help = "S3 key id",
+ .off1 = offsetof(struct http_options, s3_keyid),
+ .def = "",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_swift_auth_token",
+ .lname = "Swift auth token",
+ .type = FIO_OPT_STR_STORE,
+ .help = "OpenStack Swift auth token",
+ .off1 = offsetof(struct http_options, swift_auth_token),
+ .def = "",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_s3_region",
+ .lname = "S3 region",
+ .type = FIO_OPT_STR_STORE,
+ .help = "S3 region",
+ .off1 = offsetof(struct http_options, s3_region),
+ .def = "us-east-1",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_mode",
+ .lname = "Request mode to use",
+ .type = FIO_OPT_STR,
+ .help = "Whether to use WebDAV, Swift, or S3",
+ .off1 = offsetof(struct http_options, mode),
+ .def = "webdav",
+ .posval = {
+ { .ival = "webdav",
+ .oval = FIO_HTTP_WEBDAV,
+ .help = "WebDAV server",
+ },
+ { .ival = "s3",
+ .oval = FIO_HTTP_S3,
+ .help = "S3 storage backend",
+ },
+ { .ival = "swift",
+ .oval = FIO_HTTP_SWIFT,
+ .help = "OpenStack Swift storage",
+ },
+ },
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = "http_verbose",
+ .lname = "HTTP verbosity level",
+ .type = FIO_OPT_INT,
+ .help = "increase http engine verbosity",
+ .off1 = offsetof(struct http_options, verbose),
+ .def = "0",
+ .category = FIO_OPT_C_ENGINE,
+ .group = FIO_OPT_G_HTTP,
+ },
+ {
+ .name = NULL,
+ },
+};
+
+static char *_aws_uriencode(const char *uri)
+{
+ size_t bufsize = 1024;
+ char *r = malloc(bufsize);
+ char c;
+ int i, n;
+ const char *hex = "0123456789ABCDEF";
+
+ if (!r) {
+ log_err("malloc failed\n");
+ return NULL;
+ }
+
+ n = 0;
+ for (i = 0; (c = uri[i]); i++) {
+ if (n > bufsize-5) {
+ log_err("encoding the URL failed\n");
+ free(r);
+ return NULL;
+ }
+
+ if ( (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
+ || (c >= '0' && c <= '9') || c == '_' || c == '-'
+ || c == '~' || c == '.' || c == '/')
+ r[n++] = c;
+ else {
+ r[n++] = '%';
+ r[n++] = hex[(c >> 4 ) & 0xF];
+ r[n++] = hex[c & 0xF];
+ }
+ }
+ r[n++] = 0;
+ return r;
+}
+
+static char *_conv_hex(const unsigned char *p, size_t len)
+{
+ char *r;
+ int i,n;
+ const char *hex = "0123456789abcdef";
+ r = malloc(len * 2 + 1);
+ n = 0;
+ for (i = 0; i < len; i++) {
+ r[n++] = hex[(p[i] >> 4 ) & 0xF];
+ r[n++] = hex[p[i] & 0xF];
+ }
+ r[n] = 0;
+
+ return r;
+}
+
+static char *_gen_hex_sha256(const char *p, size_t len)
+{
+ unsigned char hash[SHA256_DIGEST_LENGTH];
+
+ SHA256((unsigned char*)p, len, hash);
+ return _conv_hex(hash, SHA256_DIGEST_LENGTH);
+}
+
+static char *_gen_hex_md5(const char *p, size_t len)
+{
+ unsigned char hash[MD5_DIGEST_LENGTH];
+
+ MD5((unsigned char*)p, len, hash);
+ return _conv_hex(hash, MD5_DIGEST_LENGTH);
+}
+
+static void _hmac(unsigned char *md, void *key, int key_len, char *data) {
+#ifndef CONFIG_HAVE_OPAQUE_HMAC_CTX
+ HMAC_CTX _ctx;
+#endif
+ HMAC_CTX *ctx;
+ unsigned int hmac_len;
+
+#ifdef CONFIG_HAVE_OPAQUE_HMAC_CTX
+ ctx = HMAC_CTX_new();
+#else
+ ctx = &_ctx;
+#endif
+ HMAC_Init_ex(ctx, key, key_len, EVP_sha256(), NULL);
+ HMAC_Update(ctx, (unsigned char*)data, strlen(data));
+ HMAC_Final(ctx, md, &hmac_len);
+#ifdef CONFIG_HAVE_OPAQUE_HMAC_CTX
+ HMAC_CTX_free(ctx);
+#else
+ HMAC_CTX_cleanup(ctx);
+#endif
+}
+
+static int _curl_trace(CURL *handle, curl_infotype type,
+ char *data, size_t size,
+ void *userp)
+{
+ const char *text;
+ (void)handle; /* prevent compiler warning */
+ (void)userp;
+
+ switch (type) {
+ case CURLINFO_TEXT:
+ fprintf(stderr, "== Info: %s", data);
+ /* fall through */
+ default:
+ case CURLINFO_SSL_DATA_OUT:
+ case CURLINFO_SSL_DATA_IN:
+ return 0;
+
+ case CURLINFO_HEADER_OUT:
+ text = "=> Send header";
+ break;
+ case CURLINFO_DATA_OUT:
+ text = "=> Send data";
+ break;
+ case CURLINFO_HEADER_IN:
+ text = "<= Recv header";
+ break;
+ case CURLINFO_DATA_IN:
+ text = "<= Recv data";
+ break;
+ }
+
+ log_info("%s: %s", text, data);
+ return 0;
+}
+
+/* https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html
+ * https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html#signing-request-intro
+ */
+static void _add_aws_auth_header(CURL *curl, struct curl_slist *slist, struct http_options *o,
+ int op, const char *uri, char *buf, size_t len)
+{
+ char date_short[16];
+ char date_iso[32];
+ char method[8];
+ char dkey[128];
+ char creq[512];
+ char sts[256];
+ char s[512];
+ char *uri_encoded = NULL;
+ char *dsha = NULL;
+ char *csha = NULL;
+ char *signature = NULL;
+ const char *service = "s3";
+ const char *aws = "aws4_request";
+ unsigned char md[SHA256_DIGEST_LENGTH];
+
+ time_t t = time(NULL);
+ struct tm *gtm = gmtime(&t);
+
+ strftime (date_short, sizeof(date_short), "%Y%m%d", gtm);
+ strftime (date_iso, sizeof(date_iso), "%Y%m%dT%H%M%SZ", gtm);
+ uri_encoded = _aws_uriencode(uri);
+
+ if (op == DDIR_WRITE) {
+ dsha = _gen_hex_sha256(buf, len);
+ sprintf(method, "PUT");
+ } else {
+ /* DDIR_READ && DDIR_TRIM supply an empty body */
+ if (op == DDIR_READ)
+ sprintf(method, "GET");
+ else
+ sprintf(method, "DELETE");
+ dsha = _gen_hex_sha256("", 0);
+ }
+
+ /* Create the canonical request first */
+ snprintf(creq, sizeof(creq),
+ "%s\n"
+ "%s\n"
+ "\n"
+ "host:%s\n"
+ "x-amz-content-sha256:%s\n"
+ "x-amz-date:%s\n"
+ "\n"
+ "host;x-amz-content-sha256;x-amz-date\n"
+ "%s"
+ , method
+ , uri_encoded, o->host, dsha, date_iso, dsha);
+
+ csha = _gen_hex_sha256(creq, strlen(creq));
+ snprintf(sts, sizeof(sts), "AWS4-HMAC-SHA256\n%s\n%s/%s/%s/%s\n%s",
+ date_iso, date_short, o->s3_region, service, aws, csha);
+
+ snprintf((char *)dkey, sizeof(dkey), "AWS4%s", o->s3_key);
+ _hmac(md, dkey, strlen(dkey), date_short);
+ _hmac(md, md, SHA256_DIGEST_LENGTH, o->s3_region);
+ _hmac(md, md, SHA256_DIGEST_LENGTH, (char*) service);
+ _hmac(md, md, SHA256_DIGEST_LENGTH, (char*) aws);
+ _hmac(md, md, SHA256_DIGEST_LENGTH, sts);
+
+ signature = _conv_hex(md, SHA256_DIGEST_LENGTH);
+
+ /* Suppress automatic Accept: header */
+ slist = curl_slist_append(slist, "Accept:");
+
+ snprintf(s, sizeof(s), "x-amz-content-sha256: %s", dsha);
+ slist = curl_slist_append(slist, s);
+
+ snprintf(s, sizeof(s), "x-amz-date: %s", date_iso);
+ slist = curl_slist_append(slist, s);
+
+ snprintf(s, sizeof(s), "Authorization: AWS4-HMAC-SHA256 Credential=%s/%s/%s/s3/aws4_request,"
+ "SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=%s",
+ o->s3_keyid, date_short, o->s3_region, signature);
+ slist = curl_slist_append(slist, s);
+
+ curl_easy_setopt(curl, CURLOPT_HTTPHEADER, slist);
+
+ free(uri_encoded);
+ free(csha);
+ free(dsha);
+ free(signature);
+}
+
+static void _add_swift_header(CURL *curl, struct curl_slist *slist, struct http_options *o,
+ int op, const char *uri, char *buf, size_t len)
+{
+ char *dsha = NULL;
+ char s[512];
+
+ if (op == DDIR_WRITE) {
+ dsha = _gen_hex_md5(buf, len);
+ }
+ /* Suppress automatic Accept: header */
+ slist = curl_slist_append(slist, "Accept:");
+
+ snprintf(s, sizeof(s), "etag: %s", dsha);
+ slist = curl_slist_append(slist, s);
+
+ snprintf(s, sizeof(s), "x-auth-token: %s", o->swift_auth_token);
+ slist = curl_slist_append(slist, s);
+
+ curl_easy_setopt(curl, CURLOPT_HTTPHEADER, slist);
+
+ free(dsha);
+}
+
+static void fio_http_cleanup(struct thread_data *td)
+{
+ struct http_data *http = td->io_ops_data;
+
+ if (http) {
+ curl_easy_cleanup(http->curl);
+ free(http);
+ }
+}
+
+static size_t _http_read(void *ptr, size_t size, size_t nmemb, void *stream)
+{
+ struct http_curl_stream *state = stream;
+ size_t len = size * nmemb;
+ /* We're retrieving; nothing is supposed to be read locally */
+ if (!stream)
+ return 0;
+ if (len+state->pos > state->max)
+ len = state->max - state->pos;
+ memcpy(ptr, &state->buf[state->pos], len);
+ state->pos += len;
+ return len;
+}
+
+static size_t _http_write(void *ptr, size_t size, size_t nmemb, void *stream)
+{
+ struct http_curl_stream *state = stream;
+ /* We're just discarding the returned body after a PUT */
+ if (!stream)
+ return nmemb;
+ if (size != 1)
+ return CURLE_WRITE_ERROR;
+ if (nmemb + state->pos > state->max)
+ return CURLE_WRITE_ERROR;
+ memcpy(&state->buf[state->pos], ptr, nmemb);
+ state->pos += nmemb;
+ return nmemb;
+}
+
+static int _http_seek(void *stream, curl_off_t offset, int origin)
+{
+ struct http_curl_stream *state = stream;
+ if (offset < state->max && origin == SEEK_SET) {
+ state->pos = offset;
+ return CURL_SEEKFUNC_OK;
+ } else
+ return CURL_SEEKFUNC_FAIL;
+}
+
+static enum fio_q_status fio_http_queue(struct thread_data *td,
+ struct io_u *io_u)
+{
+ struct http_data *http = td->io_ops_data;
+ struct http_options *o = td->eo;
+ struct http_curl_stream _curl_stream;
+ struct curl_slist *slist = NULL;
+ char object[512];
+ char url[1024];
+ long status;
+ CURLcode res;
+ int r = -1;
+
+ fio_ro_check(td, io_u);
+ memset(&_curl_stream, 0, sizeof(_curl_stream));
+ snprintf(object, sizeof(object), "%s_%llu_%llu", td->files[0]->file_name,
+ io_u->offset, io_u->xfer_buflen);
+ if (o->https == FIO_HTTPS_OFF)
+ snprintf(url, sizeof(url), "http://%s%s", o->host, object);
+ else
+ snprintf(url, sizeof(url), "https://%s%s", o->host, object);
+ curl_easy_setopt(http->curl, CURLOPT_URL, url);
+ _curl_stream.buf = io_u->xfer_buf;
+ _curl_stream.max = io_u->xfer_buflen;
+ curl_easy_setopt(http->curl, CURLOPT_SEEKDATA, &_curl_stream);
+ curl_easy_setopt(http->curl, CURLOPT_INFILESIZE_LARGE, (curl_off_t)io_u->xfer_buflen);
+
+ if (o->mode == FIO_HTTP_S3)
+ _add_aws_auth_header(http->curl, slist, o, io_u->ddir, object,
+ io_u->xfer_buf, io_u->xfer_buflen);
+ else if (o->mode == FIO_HTTP_SWIFT)
+ _add_swift_header(http->curl, slist, o, io_u->ddir, object,
+ io_u->xfer_buf, io_u->xfer_buflen);
+
+ if (io_u->ddir == DDIR_WRITE) {
+ curl_easy_setopt(http->curl, CURLOPT_READDATA, &_curl_stream);
+ curl_easy_setopt(http->curl, CURLOPT_WRITEDATA, NULL);
+ curl_easy_setopt(http->curl, CURLOPT_UPLOAD, 1L);
+ res = curl_easy_perform(http->curl);
+ if (res == CURLE_OK) {
+ curl_easy_getinfo(http->curl, CURLINFO_RESPONSE_CODE, &status);
+ if (status == 100 || (status >= 200 && status <= 204))
+ goto out;
+ log_err("DDIR_WRITE failed with HTTP status code %ld\n", status);
+ goto err;
+ }
+ } else if (io_u->ddir == DDIR_READ) {
+ curl_easy_setopt(http->curl, CURLOPT_READDATA, NULL);
+ curl_easy_setopt(http->curl, CURLOPT_WRITEDATA, &_curl_stream);
+ curl_easy_setopt(http->curl, CURLOPT_HTTPGET, 1L);
+ res = curl_easy_perform(http->curl);
+ if (res == CURLE_OK) {
+ curl_easy_getinfo(http->curl, CURLINFO_RESPONSE_CODE, &status);
+ if (status == 200)
+ goto out;
+ else if (status == 404) {
+ /* Object doesn't exist. Pretend we read
+ * zeroes */
+ memset(io_u->xfer_buf, 0, io_u->xfer_buflen);
+ goto out;
+ }
+ log_err("DDIR_READ failed with HTTP status code %ld\n", status);
+ }
+ goto err;
+ } else if (io_u->ddir == DDIR_TRIM) {
+ curl_easy_setopt(http->curl, CURLOPT_HTTPGET, 1L);
+ curl_easy_setopt(http->curl, CURLOPT_CUSTOMREQUEST, "DELETE");
+ curl_easy_setopt(http->curl, CURLOPT_INFILESIZE_LARGE, (curl_off_t)0);
+ curl_easy_setopt(http->curl, CURLOPT_READDATA, NULL);
+ curl_easy_setopt(http->curl, CURLOPT_WRITEDATA, NULL);
+ res = curl_easy_perform(http->curl);
+ if (res == CURLE_OK) {
+ curl_easy_getinfo(http->curl, CURLINFO_RESPONSE_CODE, &status);
+ if (status == 200 || status == 202 || status == 204 || status == 404)
+ goto out;
+ log_err("DDIR_TRIM failed with HTTP status code %ld\n", status);
+ }
+ goto err;
+ }
+
+ log_err("WARNING: Only DDIR_READ/DDIR_WRITE/DDIR_TRIM are supported!\n");
+
+err:
+ io_u->error = r;
+ td_verror(td, io_u->error, "transfer");
+out:
+ curl_slist_free_all(slist);
+ return FIO_Q_COMPLETED;
+}
+
+static struct io_u *fio_http_event(struct thread_data *td, int event)
+{
+ /* sync IO engine - never any outstanding events */
+ return NULL;
+}
+
+static int fio_http_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ /* sync IO engine - never any outstanding events */
+ return 0;
+}
+
+static int fio_http_setup(struct thread_data *td)
+{
+ struct http_data *http = NULL;
+ struct http_options *o = td->eo;
+
+ /* allocate engine specific structure to deal with libcurl. */
+ http = calloc(1, sizeof(*http));
+ if (!http) {
+ log_err("calloc failed.\n");
+ goto cleanup;
+ }
+
+ http->curl = curl_easy_init();
+ if (o->verbose)
+ curl_easy_setopt(http->curl, CURLOPT_VERBOSE, 1L);
+ if (o->verbose > 1)
+ curl_easy_setopt(http->curl, CURLOPT_DEBUGFUNCTION, &_curl_trace);
+ curl_easy_setopt(http->curl, CURLOPT_NOPROGRESS, 1L);
+ curl_easy_setopt(http->curl, CURLOPT_FOLLOWLOCATION, 1L);
+ curl_easy_setopt(http->curl, CURLOPT_PROTOCOLS, CURLPROTO_HTTP|CURLPROTO_HTTPS);
+ if (o->https == FIO_HTTPS_INSECURE) {
+ curl_easy_setopt(http->curl, CURLOPT_SSL_VERIFYPEER, 0L);
+ curl_easy_setopt(http->curl, CURLOPT_SSL_VERIFYHOST, 0L);
+ }
+ curl_easy_setopt(http->curl, CURLOPT_READFUNCTION, _http_read);
+ curl_easy_setopt(http->curl, CURLOPT_WRITEFUNCTION, _http_write);
+ curl_easy_setopt(http->curl, CURLOPT_SEEKFUNCTION, &_http_seek);
+ if (o->user && o->pass) {
+ curl_easy_setopt(http->curl, CURLOPT_USERNAME, o->user);
+ curl_easy_setopt(http->curl, CURLOPT_PASSWORD, o->pass);
+ curl_easy_setopt(http->curl, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
+ }
+
+ td->io_ops_data = http;
+
+ /* Force single process mode. */
+ td->o.use_thread = 1;
+
+ return 0;
+cleanup:
+ fio_http_cleanup(td);
+ return 1;
+}
+
+static int fio_http_open(struct thread_data *td, struct fio_file *f)
+{
+ return 0;
+}
+static int fio_http_invalidate(struct thread_data *td, struct fio_file *f)
+{
+ return 0;
+}
+
+static struct ioengine_ops ioengine = {
+ .name = "http",
+ .version = FIO_IOOPS_VERSION,
+ .flags = FIO_DISKLESSIO,
+ .setup = fio_http_setup,
+ .queue = fio_http_queue,
+ .getevents = fio_http_getevents,
+ .event = fio_http_event,
+ .cleanup = fio_http_cleanup,
+ .open_file = fio_http_open,
+ .invalidate = fio_http_invalidate,
+ .options = options,
+ .option_struct_size = sizeof(struct http_options),
+};
+
+static void fio_init fio_http_register(void)
+{
+ register_ioengine(&ioengine);
+}
+
+static void fio_exit fio_http_unregister(void)
+{
+ unregister_ioengine(&ioengine);
+}
--- /dev/null
+/*
+ * FIO engines for DDN's Infinite Memory Engine.
+ * This file defines 3 engines: ime_psync, ime_psyncv, and ime_aio
+ *
+ * Copyright (C) 2018 DataDirect Networks. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License,
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Some details about the new engines are given below:
+ *
+ *
+ * ime_psync:
+ * Most basic engine that issues calls to ime_native whenever an IO is queued.
+ *
+ * ime_psyncv:
+ * This engine tries to queue the IOs (by creating iovecs) if asked by FIO (via
+ * iodepth_batch). It refuses to queue when the iovecs can't be appended, and
+ * waits for FIO to issue a commit. After a call to commit and get_events, new
+ * IOs can be queued.
+ *
+ * ime_aio:
+ * This engine tries to queue the IOs (by creating iovecs) if asked by FIO (via
+ * iodepth_batch). When the iovecs can't be appended to the current request, a
+ * new request for IME is created. These requests will be issued to IME when
+ * commit is called. Contrary to ime_psyncv, there can be several requests at
+ * once. We don't need to wait for a request to terminate before creating a new
+ * one.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <linux/limits.h>
+#include <ime_native.h>
+
+#include "../fio.h"
+
+
+/**************************************************************
+ * Types and constants definitions
+ *
+ **************************************************************/
+
+/* define constants for async IOs */
+#define FIO_IME_IN_PROGRESS -1
+#define FIO_IME_REQ_ERROR -2
+
+/* This flag is used when some jobs were created using threads. In that
+ case, IME can't be finalized in the engine-specific cleanup function,
+ because other threads might still use IME. Instead, IME is finalized
+ in the destructor (see fio_ime_unregister), only when the flag
+ fio_ime_is_initialized is true (which means at least one thread has
+ initialized IME). */
+static bool fio_ime_is_initialized = false;
+
+struct imesio_req {
+ int fd; /* File descriptor */
+ enum fio_ddir ddir; /* Type of IO (read or write) */
+ off_t offset; /* File offset */
+};
+struct imeaio_req {
+ struct ime_aiocb iocb; /* IME aio request */
+ ssize_t status; /* Status of the IME request */
+ enum fio_ddir ddir; /* Type of IO (read or write) */
+ pthread_cond_t cond_endio; /* Condition var to notify FIO */
+ pthread_mutex_t status_mutex; /* Mutex for cond_endio */
+};
+
+/* This structure will be used for 2 engines: ime_psyncv and ime_aio */
+struct ime_data {
+ union {
+ struct imeaio_req *aioreqs; /* array of aio requests */
+ struct imesio_req *sioreq; /* pointer to the only syncio request */
+ };
+ struct iovec *iovecs; /* array of queued iovecs */
+ struct io_u **io_us; /* array of queued io_u pointers */
+ struct io_u **event_io_us; /* array of events retrieved after get_events */
+ unsigned int queued; /* iovecs/io_us in the queue */
+ unsigned int events; /* number of committed iovecs/io_us */
+
+ /* variables used to implement a "ring" queue */
+ unsigned int depth; /* max entries in the queue */
+ unsigned int head; /* index used to append */
+ unsigned int tail; /* index used to pop */
+ unsigned int cur_commit; /* index of the first uncommitted req */
+
+ /* offset used by the last iovec (used to check if an iovec can be appended) */
+ unsigned long long last_offset;
+
+ /* The variables below are used for aio only */
+ struct imeaio_req *last_req; /* last request awaiting commit */
+};
+
+
+/**************************************************************
+ * Private functions for queueing/unqueueing
+ *
+ **************************************************************/
+
+static void fio_ime_queue_incr (struct ime_data *ime_d)
+{
+ ime_d->head = (ime_d->head + 1) % ime_d->depth;
+ ime_d->queued++;
+}
+
+static void fio_ime_queue_red (struct ime_data *ime_d)
+{
+ ime_d->tail = (ime_d->tail + 1) % ime_d->depth;
+ ime_d->queued--;
+ ime_d->events--;
+}
+
+static void fio_ime_queue_commit (struct ime_data *ime_d, int iovcnt)
+{
+ ime_d->cur_commit = (ime_d->cur_commit + iovcnt) % ime_d->depth;
+ ime_d->events += iovcnt;
+}
+
+static void fio_ime_queue_reset (struct ime_data *ime_d)
+{
+ ime_d->head = 0;
+ ime_d->tail = 0;
+ ime_d->cur_commit = 0;
+ ime_d->queued = 0;
+ ime_d->events = 0;
+}
+
+/**************************************************************
+ * General IME functions
+ * (needed for both sync and async IOs)
+ **************************************************************/
+
+static char *fio_set_ime_filename(char* filename)
+{
+ static __thread char ime_filename[PATH_MAX];
+ int ret;
+
+ ret = snprintf(ime_filename, PATH_MAX, "%s%s", DEFAULT_IME_FILE_PREFIX, filename);
+ if (ret < PATH_MAX)
+ return ime_filename;
+
+ return NULL;
+}
+
+static int fio_ime_get_file_size(struct thread_data *td, struct fio_file *f)
+{
+ struct stat buf;
+ int ret;
+ char *ime_filename;
+
+ dprint(FD_FILE, "get file size %s\n", f->file_name);
+
+ ime_filename = fio_set_ime_filename(f->file_name);
+ if (ime_filename == NULL)
+ return 1;
+ ret = ime_native_stat(ime_filename, &buf);
+ if (ret == -1) {
+ td_verror(td, errno, "fstat");
+ return 1;
+ }
+
+ f->real_file_size = buf.st_size;
+ return 0;
+}
+
+/* This function mimics the generic_file_open function, but issues
+ IME native calls instead of POSIX calls. */
+static int fio_ime_open_file(struct thread_data *td, struct fio_file *f)
+{
+ int flags = 0;
+ int ret;
+ uint64_t desired_fs;
+ char *ime_filename;
+
+ dprint(FD_FILE, "fd open %s\n", f->file_name);
+
+ if (td_trim(td)) {
+ td_verror(td, EINVAL, "IME does not support TRIM operation");
+ return 1;
+ }
+
+ if (td->o.oatomic) {
+ td_verror(td, EINVAL, "IME does not support atomic IO");
+ return 1;
+ }
+ if (td->o.odirect)
+ flags |= O_DIRECT;
+ if (td->o.sync_io)
+ flags |= O_SYNC;
+ if (td->o.create_on_open && td->o.allow_create)
+ flags |= O_CREAT;
+
+ if (td_write(td)) {
+ if (!read_only)
+ flags |= O_RDWR;
+
+ if (td->o.allow_create)
+ flags |= O_CREAT;
+ } else if (td_read(td)) {
+ flags |= O_RDONLY;
+ } else {
+ /* We should never go here. */
+ td_verror(td, EINVAL, "Unsupported open mode");
+ return 1;
+ }
+
+ ime_filename = fio_set_ime_filename(f->file_name);
+ if (ime_filename == NULL)
+ return 1;
+ f->fd = ime_native_open(ime_filename, flags, 0600);
+ if (f->fd == -1) {
+ char buf[FIO_VERROR_SIZE];
+ int __e = errno;
+
+ snprintf(buf, sizeof(buf), "open(%s)", f->file_name);
+ td_verror(td, __e, buf);
+ return 1;
+ }
+
+ /* Now we need to make sure the real file size is large enough for FIO
+ to do its work. This is normally done before the file open function
+ is called, but because FIO would use POSIX calls there, we have to do
+ it ourselves. */
+ ret = fio_ime_get_file_size(td, f);
+ if (ret) {
+ ime_native_close(f->fd);
+ td_verror(td, errno, "ime_get_file_size");
+ return 1;
+ }
+
+ desired_fs = f->io_size + f->file_offset;
+ if (td_write(td)) {
+ dprint(FD_FILE, "Laying out file %s%s\n",
+ DEFAULT_IME_FILE_PREFIX, f->file_name);
+ if (!td->o.create_on_open &&
+ f->real_file_size < desired_fs &&
+ ime_native_ftruncate(f->fd, desired_fs) < 0) {
+ ime_native_close(f->fd);
+ td_verror(td, errno, "ime_native_ftruncate");
+ return 1;
+ }
+ if (f->real_file_size < desired_fs)
+ f->real_file_size = desired_fs;
+ } else if (td_read(td) && f->real_file_size < desired_fs) {
+ ime_native_close(f->fd);
+ log_err("error: can't read %llu bytes from a file that is "
+ "only %llu bytes long\n",
+ (unsigned long long) desired_fs,
+ (unsigned long long) f->real_file_size);
+ return 1;
+ }
+
+ return 0;
+}
+
+static int fio_ime_close_file(struct thread_data fio_unused *td, struct fio_file *f)
+{
+ int ret = 0;
+
+ dprint(FD_FILE, "fd close %s\n", f->file_name);
+
+ if (ime_native_close(f->fd) < 0)
+ ret = errno;
+
+ f->fd = -1;
+ return ret;
+}
+
+static int fio_ime_unlink_file(struct thread_data *td, struct fio_file *f)
+{
+ char *ime_filename = fio_set_ime_filename(f->file_name);
+ int ret;
+
+ if (ime_filename == NULL)
+ return 1;
+
+ ret = unlink(ime_filename);
+ return ret < 0 ? errno : 0;
+}
+
+static struct io_u *fio_ime_event(struct thread_data *td, int event)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+
+ return ime_d->event_io_us[event];
+}
+
+/* This setup function is used in place of get_file_sizes when setting
+ up the files. Instead, we set real_file_size to 0 for each file. This
+ way we can avoid calling ime_native_init before the forks are created. */
+static int fio_ime_setup(struct thread_data *td)
+{
+ struct fio_file *f;
+ unsigned int i;
+
+ for_each_file(td, f, i) {
+ dprint(FD_FILE, "setup: set file size to 0 for %p/%d/%s\n",
+ f, i, f->file_name);
+ f->real_file_size = 0;
+ }
+
+ return 0;
+}
+
+static int fio_ime_engine_init(struct thread_data *td)
+{
+ struct fio_file *f;
+ unsigned int i;
+
+ dprint(FD_IO, "ime engine init\n");
+ if (fio_ime_is_initialized && !td->o.use_thread) {
+ log_err("Warning: something might go wrong. Not all threads/forks were"
+ " created before the FIO jobs were initialized.\n");
+ }
+
+ ime_native_init();
+ fio_ime_is_initialized = true;
+
+ /* We have to temporarily set real_file_size so that
+ FIO can initialize properly. It will be corrected
+ on file open. */
+ for_each_file(td, f, i)
+ f->real_file_size = f->io_size + f->file_offset;
+
+ return 0;
+}
+
+static void fio_ime_engine_finalize(struct thread_data *td)
+{
+ /* Only finalize IME when using forks */
+ if (!td->o.use_thread) {
+ if (ime_native_finalize() < 0)
+ log_err("error in ime_native_finalize\n");
+ fio_ime_is_initialized = false;
+ }
+}
+
+
+/**************************************************************
+ * Private functions for blocking IOs
+ * (without iovecs)
+ **************************************************************/
+
+/* Notice: this function comes from the sync engine */
+/* It is used by the commit function to return a proper code and fill
+ some attributes in the io_u used for the IO. */
+static int fio_ime_psync_end(struct thread_data *td, struct io_u *io_u, ssize_t ret)
+{
+ if (ret != (ssize_t) io_u->xfer_buflen) {
+ if (ret >= 0) {
+ io_u->resid = io_u->xfer_buflen - ret;
+ io_u->error = 0;
+ return FIO_Q_COMPLETED;
+ } else
+ io_u->error = errno;
+ }
+
+ if (io_u->error) {
+ io_u_log_error(td, io_u);
+ td_verror(td, io_u->error, "xfer");
+ }
+
+ return FIO_Q_COMPLETED;
+}
+
+static enum fio_q_status fio_ime_psync_queue(struct thread_data *td,
+ struct io_u *io_u)
+{
+ struct fio_file *f = io_u->file;
+ ssize_t ret;
+
+ fio_ro_check(td, io_u);
+
+ if (io_u->ddir == DDIR_READ)
+ ret = ime_native_pread(f->fd, io_u->xfer_buf, io_u->xfer_buflen, io_u->offset);
+ else if (io_u->ddir == DDIR_WRITE)
+ ret = ime_native_pwrite(f->fd, io_u->xfer_buf, io_u->xfer_buflen, io_u->offset);
+ else if (io_u->ddir == DDIR_SYNC)
+ ret = ime_native_fsync(f->fd);
+ else {
+ ret = io_u->xfer_buflen;
+ io_u->error = EINVAL;
+ }
+
+ return fio_ime_psync_end(td, io_u, ret);
+}
+
+
+/**************************************************************
+ * Private functions for blocking IOs
+ * (with iovecs)
+ **************************************************************/
+
+static bool fio_ime_psyncv_can_queue(struct ime_data *ime_d, struct io_u *io_u)
+{
+ /* We can only queue if:
+ - There are no queued iovecs
+ - Or if there is at least one:
+ - There must be no event waiting for retrieval
+ - The offsets must be contiguous
+ - The ddir and fd must be the same */
+ return (ime_d->queued == 0 || (
+ ime_d->events == 0 &&
+ ime_d->last_offset == io_u->offset &&
+ ime_d->sioreq->ddir == io_u->ddir &&
+ ime_d->sioreq->fd == io_u->file->fd));
+}
+
+/* Before using this function, we should have already
+ ensured that the queue is not full */
+static void fio_ime_psyncv_enqueue(struct ime_data *ime_d, struct io_u *io_u)
+{
+ struct imesio_req *ioreq = ime_d->sioreq;
+ struct iovec *iov = &ime_d->iovecs[ime_d->head];
+
+ iov->iov_base = io_u->xfer_buf;
+ iov->iov_len = io_u->xfer_buflen;
+
+ if (ime_d->queued == 0) {
+ ioreq->offset = io_u->offset;
+ ioreq->ddir = io_u->ddir;
+ ioreq->fd = io_u->file->fd;
+ }
+
+ ime_d->io_us[ime_d->head] = io_u;
+ ime_d->last_offset = io_u->offset + io_u->xfer_buflen;
+ fio_ime_queue_incr(ime_d);
+}
+
+/* Tries to queue an IO. It will fail if the IO can't be appended to the
+ current request or if the current request has been committed but not
+ yet retrieved by get_events. */
+static enum fio_q_status fio_ime_psyncv_queue(struct thread_data *td,
+ struct io_u *io_u)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+
+ fio_ro_check(td, io_u);
+
+ if (ime_d->queued == ime_d->depth)
+ return FIO_Q_BUSY;
+
+ if (io_u->ddir == DDIR_READ || io_u->ddir == DDIR_WRITE) {
+ if (!fio_ime_psyncv_can_queue(ime_d, io_u))
+ return FIO_Q_BUSY;
+
+ dprint(FD_IO, "queue: ddir=%d at %u commit=%u queued=%u events=%u\n",
+ io_u->ddir, ime_d->head, ime_d->cur_commit,
+ ime_d->queued, ime_d->events);
+ fio_ime_psyncv_enqueue(ime_d, io_u);
+ return FIO_Q_QUEUED;
+ } else if (io_u->ddir == DDIR_SYNC) {
+ if (ime_native_fsync(io_u->file->fd) < 0) {
+ io_u->error = errno;
+ td_verror(td, io_u->error, "fsync");
+ }
+ return FIO_Q_COMPLETED;
+ } else {
+ io_u->error = EINVAL;
+ td_verror(td, io_u->error, "wrong ddir");
+ return FIO_Q_COMPLETED;
+ }
+}
+
+/* Notice: this function comes from the sync engine */
+/* It is used by the commit function to return a proper code and fill
+ some attributes in the io_us appended to the current request. */
+static int fio_ime_psyncv_end(struct thread_data *td, ssize_t bytes)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct io_u *io_u;
+ unsigned int i;
+ int err = errno;
+
+ for (i = 0; i < ime_d->queued; i++) {
+ io_u = ime_d->io_us[i];
+
+ if (bytes == -1)
+ io_u->error = err;
+ else {
+ unsigned int this_io;
+
+ this_io = bytes;
+ if (this_io > io_u->xfer_buflen)
+ this_io = io_u->xfer_buflen;
+
+ io_u->resid = io_u->xfer_buflen - this_io;
+ io_u->error = 0;
+ bytes -= this_io;
+ }
+ }
+
+ if (bytes == -1) {
+ td_verror(td, err, "xfer psyncv");
+ return -err;
+ }
+
+ return 0;
+}
+
+/* Commits the current request by calling ime_native (with one or several
+ iovecs). After this commit, the corresponding events (one per iovec)
+ can be retrieved by get_events. */
+static int fio_ime_psyncv_commit(struct thread_data *td)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct imesio_req *ioreq;
+ int ret = 0;
+
+ /* Exit if there are no (new) iovecs to commit
+ or if the previously committed iovecs haven't been retrieved yet */
+ if (!ime_d->queued || ime_d->events)
+ return 0;
+
+ ioreq = ime_d->sioreq;
+ ime_d->events = ime_d->queued;
+ if (ioreq->ddir == DDIR_READ)
+ ret = ime_native_preadv(ioreq->fd, ime_d->iovecs, ime_d->queued, ioreq->offset);
+ else
+ ret = ime_native_pwritev(ioreq->fd, ime_d->iovecs, ime_d->queued, ioreq->offset);
+
+ dprint(FD_IO, "committed %d iovecs\n", ime_d->queued);
+
+ return fio_ime_psyncv_end(td, ret);
+}
+
+static int fio_ime_psyncv_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct io_u *io_u;
+ int events = 0;
+ unsigned int count;
+
+ if (ime_d->events) {
+ for (count = 0; count < ime_d->events; count++) {
+ io_u = ime_d->io_us[count];
+ ime_d->event_io_us[events] = io_u;
+ events++;
+ }
+ fio_ime_queue_reset(ime_d);
+ }
+
+ dprint(FD_IO, "getevents(%u,%u) ret=%d queued=%u events=%u\n",
+ min, max, events, ime_d->queued, ime_d->events);
+ return events;
+}
+
+static int fio_ime_psyncv_init(struct thread_data *td)
+{
+ struct ime_data *ime_d;
+
+ if (fio_ime_engine_init(td) < 0)
+ return 1;
+
+ ime_d = calloc(1, sizeof(*ime_d));
+
+ ime_d->sioreq = malloc(sizeof(struct imesio_req));
+ ime_d->iovecs = malloc(td->o.iodepth * sizeof(struct iovec));
+ ime_d->io_us = malloc(2 * td->o.iodepth * sizeof(struct io_u *));
+ ime_d->event_io_us = ime_d->io_us + td->o.iodepth;
+
+ ime_d->depth = td->o.iodepth;
+
+ td->io_ops_data = ime_d;
+ return 0;
+}
+
+static void fio_ime_psyncv_clean(struct thread_data *td)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+
+ if (ime_d) {
+ free(ime_d->sioreq);
+ free(ime_d->iovecs);
+ free(ime_d->io_us);
+ free(ime_d);
+ td->io_ops_data = NULL;
+ }
+
+ fio_ime_engine_finalize(td);
+}
+
+
+/**************************************************************
+ * Private functions for non-blocking IOs
+ *
+ **************************************************************/
+
+void fio_ime_aio_complete_cb (struct ime_aiocb *aiocb, int err,
+ ssize_t bytes)
+{
+ struct imeaio_req *ioreq = (struct imeaio_req *) aiocb->user_context;
+
+ pthread_mutex_lock(&ioreq->status_mutex);
+ ioreq->status = err == 0 ? bytes : FIO_IME_REQ_ERROR;
+ pthread_mutex_unlock(&ioreq->status_mutex);
+
+ pthread_cond_signal(&ioreq->cond_endio);
+}
+
+static bool fio_ime_aio_can_queue (struct ime_data *ime_d, struct io_u *io_u)
+{
+ /* So far we can queue in any case. */
+ return true;
+}
+static bool fio_ime_aio_can_append (struct ime_data *ime_d, struct io_u *io_u)
+{
+ /* We can only append if:
+ - The iovecs will be contiguous in the array
+ - There is already a queued iovec
+ - The offsets are contiguous
+ - The ddir and fd are the same */
+ return (ime_d->head != 0 &&
+ ime_d->queued - ime_d->events > 0 &&
+ ime_d->last_offset == io_u->offset &&
+ ime_d->last_req->ddir == io_u->ddir &&
+ ime_d->last_req->iocb.fd == io_u->file->fd);
+}
+
+/* Before using this function, we should have already
+ ensured that the queue is not full */
+static void fio_ime_aio_enqueue(struct ime_data *ime_d, struct io_u *io_u)
+{
+ struct imeaio_req *ioreq = &ime_d->aioreqs[ime_d->head];
+ struct ime_aiocb *iocb = &ioreq->iocb;
+ struct iovec *iov = &ime_d->iovecs[ime_d->head];
+
+ iov->iov_base = io_u->xfer_buf;
+ iov->iov_len = io_u->xfer_buflen;
+
+ if (fio_ime_aio_can_append(ime_d, io_u))
+ ime_d->last_req->iocb.iovcnt++;
+ else {
+ ioreq->status = FIO_IME_IN_PROGRESS;
+ ioreq->ddir = io_u->ddir;
+ ime_d->last_req = ioreq;
+
+ iocb->complete_cb = &fio_ime_aio_complete_cb;
+ iocb->fd = io_u->file->fd;
+ iocb->file_offset = io_u->offset;
+ iocb->iov = iov;
+ iocb->iovcnt = 1;
+ iocb->flags = 0;
+ iocb->user_context = (intptr_t) ioreq;
+ }
+
+ ime_d->io_us[ime_d->head] = io_u;
+ ime_d->last_offset = io_u->offset + io_u->xfer_buflen;
+ fio_ime_queue_incr(ime_d);
+}
+
+/* Tries to queue an IO. It will create a new request if the IO can't be
+ appended to the current request. It will fail if the queue can't contain
+ any more io_u/iovec. In this case, commit and then get_events need to be
+ called. */
+static enum fio_q_status fio_ime_aio_queue(struct thread_data *td,
+ struct io_u *io_u)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+
+ fio_ro_check(td, io_u);
+
+ dprint(FD_IO, "queue: ddir=%d at %u commit=%u queued=%u events=%u\n",
+ io_u->ddir, ime_d->head, ime_d->cur_commit,
+ ime_d->queued, ime_d->events);
+
+ if (ime_d->queued == ime_d->depth)
+ return FIO_Q_BUSY;
+
+ if (io_u->ddir == DDIR_READ || io_u->ddir == DDIR_WRITE) {
+ if (!fio_ime_aio_can_queue(ime_d, io_u))
+ return FIO_Q_BUSY;
+
+ fio_ime_aio_enqueue(ime_d, io_u);
+ return FIO_Q_QUEUED;
+ } else if (io_u->ddir == DDIR_SYNC) {
+ if (ime_native_fsync(io_u->file->fd) < 0) {
+ io_u->error = errno;
+ td_verror(td, io_u->error, "fsync");
+ }
+ return FIO_Q_COMPLETED;
+ } else {
+ io_u->error = EINVAL;
+ td_verror(td, io_u->error, "wrong ddir");
+ return FIO_Q_COMPLETED;
+ }
+}
+
+static int fio_ime_aio_commit(struct thread_data *td)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct imeaio_req *ioreq;
+ int ret = 0;
+
+ /* Loop while there are queued iovecs that haven't been committed yet */
+ while (ime_d->queued - ime_d->events) {
+ ioreq = &ime_d->aioreqs[ime_d->cur_commit];
+ if (ioreq->ddir == DDIR_READ)
+ ret = ime_native_aio_read(&ioreq->iocb);
+ else
+ ret = ime_native_aio_write(&ioreq->iocb);
+
+ fio_ime_queue_commit(ime_d, ioreq->iocb.iovcnt);
+
+ /* fio needs a negative error code */
+ if (ret < 0) {
+ ioreq->status = FIO_IME_REQ_ERROR;
+ return -errno;
+ }
+
+ io_u_mark_submit(td, ioreq->iocb.iovcnt);
+ dprint(FD_IO, "committed %d iovecs commit=%u queued=%u events=%u\n",
+ ioreq->iocb.iovcnt, ime_d->cur_commit,
+ ime_d->queued, ime_d->events);
+ }
+
+ return 0;
+}
+
+static int fio_ime_aio_getevents(struct thread_data *td, unsigned int min,
+ unsigned int max, const struct timespec *t)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct imeaio_req *ioreq;
+ struct io_u *io_u;
+ int events = 0;
+ unsigned int count;
+ ssize_t bytes;
+
+ while (ime_d->events) {
+ ioreq = &ime_d->aioreqs[ime_d->tail];
+
+ /* Break if we already got events and appending the next
+ request's events would exceed max */
+ if (events && events + ioreq->iocb.iovcnt > max)
+ break;
+
+ if (ioreq->status != FIO_IME_IN_PROGRESS) {
+
+ bytes = ioreq->status;
+ for (count = 0; count < ioreq->iocb.iovcnt; count++) {
+ io_u = ime_d->io_us[ime_d->tail];
+ ime_d->event_io_us[events] = io_u;
+ events++;
+ fio_ime_queue_red(ime_d);
+
+ if (ioreq->status == FIO_IME_REQ_ERROR)
+ io_u->error = EIO;
+ else {
+ io_u->resid = bytes > io_u->xfer_buflen ?
+ 0 : io_u->xfer_buflen - bytes;
+ io_u->error = 0;
+ bytes -= io_u->xfer_buflen - io_u->resid;
+ }
+ }
+ } else {
+ pthread_mutex_lock(&ioreq->status_mutex);
+ while (ioreq->status == FIO_IME_IN_PROGRESS)
+ pthread_cond_wait(&ioreq->cond_endio, &ioreq->status_mutex);
+ pthread_mutex_unlock(&ioreq->status_mutex);
+ }
+
+ }
+
+ dprint(FD_IO, "getevents(%u,%u) ret=%d queued=%u events=%u\n", min, max,
+ events, ime_d->queued, ime_d->events);
+ return events;
+}
+
+static int fio_ime_aio_init(struct thread_data *td)
+{
+ struct ime_data *ime_d;
+ struct imeaio_req *ioreq;
+ unsigned int i;
+
+ if (fio_ime_engine_init(td) < 0)
+ return 1;
+
+ ime_d = calloc(1, sizeof(*ime_d));
+
+ ime_d->aioreqs = malloc(td->o.iodepth * sizeof(struct imeaio_req));
+ ime_d->iovecs = malloc(td->o.iodepth * sizeof(struct iovec));
+ ime_d->io_us = malloc(2 * td->o.iodepth * sizeof(struct io_u *));
+ ime_d->event_io_us = ime_d->io_us + td->o.iodepth;
+
+ ime_d->depth = td->o.iodepth;
+ for (i = 0; i < ime_d->depth; i++) {
+ ioreq = &ime_d->aioreqs[i];
+ pthread_cond_init(&ioreq->cond_endio, NULL);
+ pthread_mutex_init(&ioreq->status_mutex, NULL);
+ }
+
+ td->io_ops_data = ime_d;
+ return 0;
+}
+
+static void fio_ime_aio_clean(struct thread_data *td)
+{
+ struct ime_data *ime_d = td->io_ops_data;
+ struct imeaio_req *ioreq;
+ unsigned int i;
+
+ if (ime_d) {
+ for (i = 0; i < ime_d->depth; i++) {
+ ioreq = &ime_d->aioreqs[i];
+ pthread_cond_destroy(&ioreq->cond_endio);
+ pthread_mutex_destroy(&ioreq->status_mutex);
+ }
+ free(ime_d->aioreqs);
+ free(ime_d->iovecs);
+ free(ime_d->io_us);
+ free(ime_d);
+ td->io_ops_data = NULL;
+ }
+
+ fio_ime_engine_finalize(td);
+}
+
+
+/**************************************************************
+ * IO engines definitions
+ *
+ **************************************************************/
+
+/* The FIO_DISKLESSIO flag used for these engines is necessary to prevent
+ FIO from using POSIX calls. See fio_ime_open_file for more details. */
+
+static struct ioengine_ops ioengine_prw = {
+ .name = "ime_psync",
+ .version = FIO_IOOPS_VERSION,
+ .setup = fio_ime_setup,
+ .init = fio_ime_engine_init,
+ .cleanup = fio_ime_engine_finalize,
+ .queue = fio_ime_psync_queue,
+ .open_file = fio_ime_open_file,
+ .close_file = fio_ime_close_file,
+ .get_file_size = fio_ime_get_file_size,
+ .unlink_file = fio_ime_unlink_file,
+ .flags = FIO_SYNCIO | FIO_DISKLESSIO,
+};
+
+static struct ioengine_ops ioengine_pvrw = {
+ .name = "ime_psyncv",
+ .version = FIO_IOOPS_VERSION,
+ .setup = fio_ime_setup,
+ .init = fio_ime_psyncv_init,
+ .cleanup = fio_ime_psyncv_clean,
+ .queue = fio_ime_psyncv_queue,
+ .commit = fio_ime_psyncv_commit,
+ .getevents = fio_ime_psyncv_getevents,
+ .event = fio_ime_event,
+ .open_file = fio_ime_open_file,
+ .close_file = fio_ime_close_file,
+ .get_file_size = fio_ime_get_file_size,
+ .unlink_file = fio_ime_unlink_file,
+ .flags = FIO_SYNCIO | FIO_DISKLESSIO,
+};
+
+static struct ioengine_ops ioengine_aio = {
+ .name = "ime_aio",
+ .version = FIO_IOOPS_VERSION,
+ .setup = fio_ime_setup,
+ .init = fio_ime_aio_init,
+ .cleanup = fio_ime_aio_clean,
+ .queue = fio_ime_aio_queue,
+ .commit = fio_ime_aio_commit,
+ .getevents = fio_ime_aio_getevents,
+ .event = fio_ime_event,
+ .open_file = fio_ime_open_file,
+ .close_file = fio_ime_close_file,
+ .get_file_size = fio_ime_get_file_size,
+ .unlink_file = fio_ime_unlink_file,
+ .flags = FIO_DISKLESSIO,
+};
+
+static void fio_init fio_ime_register(void)
+{
+ register_ioengine(&ioengine_prw);
+ register_ioengine(&ioengine_pvrw);
+ register_ioengine(&ioengine_aio);
+}
+
+static void fio_exit fio_ime_unregister(void)
+{
+ unregister_ioengine(&ioengine_prw);
+ unregister_ioengine(&ioengine_pvrw);
+ unregister_ioengine(&ioengine_aio);
+
+ if (fio_ime_is_initialized && ime_native_finalize() < 0)
+ log_err("Warning: IME did not finalize properly\n");
+}
return true;
}
+static int gen_eta_str(struct jobs_eta *je, char *p, size_t left,
+ char **rate_str, char **iops_str)
+{
+ bool has_r = je->rate[DDIR_READ] || je->iops[DDIR_READ];
+ bool has_w = je->rate[DDIR_WRITE] || je->iops[DDIR_WRITE];
+ bool has_t = je->rate[DDIR_TRIM] || je->iops[DDIR_TRIM];
+ int l = 0;
+
+ if (!has_r && !has_w && !has_t)
+ return 0;
+
+ if (has_r) {
+ l += snprintf(p + l, left - l, "[r=%s", rate_str[DDIR_READ]);
+ if (!has_w && !has_t)
+ l += snprintf(p + l, left - l, "]");
+ }
+ if (has_w) {
+ if (has_r)
+ l += snprintf(p + l, left - l, ",");
+ else
+ l += snprintf(p + l, left - l, "[");
+ l += snprintf(p + l, left - l, "w=%s", rate_str[DDIR_WRITE]);
+ if (!has_t)
+ l += snprintf(p + l, left - l, "]");
+ }
+ if (has_t) {
+ if (has_r || has_w)
+ l += snprintf(p + l, left - l, ",");
+ else if (!has_r && !has_w)
+ l += snprintf(p + l, left - l, "[");
+ l += snprintf(p + l, left - l, "t=%s]", rate_str[DDIR_TRIM]);
+ }
+ if (has_r) {
+ l += snprintf(p + l, left - l, "[r=%s", iops_str[DDIR_READ]);
+ if (!has_w && !has_t)
+ l += snprintf(p + l, left - l, " IOPS]");
+ }
+ if (has_w) {
+ if (has_r)
+ l += snprintf(p + l, left - l, ",");
+ else
+ l += snprintf(p + l, left - l, "[");
+ l += snprintf(p + l, left - l, "w=%s", iops_str[DDIR_WRITE]);
+ if (!has_t)
+ l += snprintf(p + l, left - l, " IOPS]");
+ }
+ if (has_t) {
+ if (has_r || has_w)
+ l += snprintf(p + l, left - l, ",");
+ else if (!has_r && !has_w)
+ l += snprintf(p + l, left - l, "[");
+ l += snprintf(p + l, left - l, "t=%s IOPS]", iops_str[DDIR_TRIM]);
+ }
+
+ return l;
+}
+
void display_thread_status(struct jobs_eta *je)
{
static struct timespec disp_eta_new_line;
}
left = sizeof(output) - (p - output) - 1;
+ l = snprintf(p, left, ": [%s][%s]", je->run_str, perc_str);
+ l += gen_eta_str(je, p + l, left - l, rate_str, iops_str);
+ l += snprintf(p + l, left - l, "[eta %s]", eta_str);
- if (je->rate[DDIR_TRIM] || je->iops[DDIR_TRIM])
- l = snprintf(p, left,
- ": [%s][%s][r=%s,w=%s,t=%s][r=%s,w=%s,t=%s IOPS][eta %s]",
- je->run_str, perc_str, rate_str[DDIR_READ],
- rate_str[DDIR_WRITE], rate_str[DDIR_TRIM],
- iops_str[DDIR_READ], iops_str[DDIR_WRITE],
- iops_str[DDIR_TRIM], eta_str);
- else
- l = snprintf(p, left,
- ": [%s][%s][r=%s,w=%s][r=%s,w=%s IOPS][eta %s]",
- je->run_str, perc_str,
- rate_str[DDIR_READ], rate_str[DDIR_WRITE],
- iops_str[DDIR_READ], iops_str[DDIR_WRITE],
- eta_str);
/* If truncation occurred adjust l so p is on the null */
if (l >= left)
l = left - 1;
--- /dev/null
+# Example test for the HTTP engine's S3 support against Amazon AWS.
+# Obviously, you have to adjust the S3 credentials; for this example,
+# they're passed in via the environment.
+#
+
+[global]
+ioengine=http
+name=test
+direct=1
+filename=/larsmb-fio-test/object
+http_verbose=0
+https=on
+http_mode=s3
+http_s3_key=${S3_KEY}
+http_s3_keyid=${S3_ID}
+http_host=s3.eu-central-1.amazonaws.com
+http_s3_region=eu-central-1
+group_reporting
+
+# With verify, this both writes and reads the object
+[create]
+rw=write
+bs=4k
+size=64k
+io_size=4k
+verify=sha256
+
+[trim]
+stonewall
+rw=trim
+bs=4k
+size=64k
+io_size=4k
+
--- /dev/null
+[global]
+ioengine=http
+rw=randwrite
+name=test
+direct=1
+http_verbose=0
+http_mode=swift
+https=on
+# This is the hostname and port portion of the public access link for
+# the container:
+http_host=swift.srv.openstack.local:8081
+filename_format=/swift/v1/fio-test/bucket.$jobnum
+group_reporting
+bs=64k
+size=1M
+# Currently, fio cannot yet generate the Swift Auth-Token itself.
+# You need to set this prior to running fio via
+# eval $(openstack token issue -f shell --prefix SWIFT_) ; export SWIFT_id
+http_swift_auth_token=${SWIFT_id}
+
+[create]
+numjobs=1
+rw=randwrite
+io_size=256k
+verify=sha256
+
+# This will delete all created objects again
+[trim]
+stonewall
+numjobs=1
+rw=trim
+io_size=64k
--- /dev/null
+[global]
+ioengine=http
+rw=randwrite
+name=test
+direct=1
+http_verbose=0
+http_mode=webdav
+https=off
+http_host=localhost
+filename_format=/dav/bucket.$jobnum
+group_reporting
+bs=64k
+size=1M
+
+[create]
+numjobs=16
+rw=randwrite
+io_size=10M
+verify=sha256
+
+# This will delete all created objects again
+[trim]
+stonewall
+numjobs=16
+rw=trim
+io_size=1M
--- /dev/null
+# This jobfile performs basic write+read operations using
+# DDN's Infinite Memory Engine.
+
+[global]
+
+# Use as many jobs as possible to maximize performance
+numjobs=8
+
+# The filename should be uniform so that "read" jobs can read what
+# the "write" jobs have written.
+filename_format=fio-test-ime.$jobnum.$filenum
+
+size=25g
+bs=128k
+
+# These settings are useful for the asynchronous ime_aio engine:
+# by setting the io depth to twice the size of a "batch", we can
+# queue IOs while other IOs are "in-flight".
+iodepth=32
+iodepth_batch=16
+iodepth_batch_complete=16
+
+[write-psync]
+stonewall
+rw=write
+ioengine=ime_psync
+
+[read-psync]
+stonewall
+rw=read
+ioengine=ime_psync
+
+[write-psyncv]
+stonewall
+rw=write
+ioengine=ime_psyncv
+
+[read-psyncv]
+stonewall
+rw=read
+ioengine=ime_psyncv
+
+[write-aio]
+stonewall
+rw=write
+ioengine=ime_aio
+
+[read-aio]
+stonewall
+rw=read
+ioengine=ime_aio
\ No newline at end of file
Set this \fIcommand\fR as remote trigger.
.TP
.BI \-\-aux\-path \fR=\fPpath
-Use this \fIpath\fR for fio state generated files.
+Use the directory specified by \fIpath\fP for generated state files instead
+of the current working directory.
.SH "JOB FILE FORMAT"
Any parameters following the options will be assumed to be job files, unless
they match a job file parameter. Multiple job files can be listed and each job
assigned equally distributed to job clones created by \fBnumjobs\fR as
long as they are using generated filenames. If specific \fBfilename\fR(s) are
set fio will use the first listed directory, and thereby matching the
-\fBfilename\fR semantic which generates a file each clone if not specified, but
-let all clones use the same if set.
+\fBfilename\fR semantic (which generates a file for each clone if not
+specified, but lets all clones use the same file if set).
.RS
.P
-See the \fBfilename\fR option for information on how to escape ':' and '\'
+See the \fBfilename\fR option for information on how to escape ':' and '\\'
characters within the directory path itself.
+.P
+Note: To control the directory fio will use for internal state files
+use \fB\-\-aux\-path\fR.
.RE
.TP
.BI filename \fR=\fPstr
explicit size is specified by \fBfilesize\fR.
.RS
.P
-Each colon and backslash in the wanted path must be escaped with a '\'
+Each colon and backslash in the wanted path must be escaped with a '\\'
character. For instance, if the path is `/dev/dsk/foo@3,0:c' then you
would use `filename=/dev/dsk/foo@3,0\\:c' and if the path is
-`F:\\\\filename' then you would use `filename=F\\:\\\\filename'.
+`F:\\filename' then you would use `filename=F\\:\\\\filename'.
.P
-On Windows, disk devices are accessed as `\\\\\\\\.\\\\PhysicalDrive0' for
-the first device, `\\\\\\\\.\\\\PhysicalDrive1' for the second etc.
+On Windows, disk devices are accessed as `\\\\.\\PhysicalDrive0' for
+the first device, `\\\\.\\PhysicalDrive1' for the second etc.
Note: Windows and FreeBSD prevent write access to areas
of the disk containing in\-use data (e.g. filesystems).
.P
.BI unlink_each_loop \fR=\fPbool
Unlink job files after each iteration or loop. Default: false.
.TP
-.BI zonesize \fR=\fPint
-Divide a file into zones of the specified size. See \fBzoneskip\fR.
+Fio supports strided data access. After \fBzonesize\fR bytes have been
+transferred from an area that is \fBzonerange\fR bytes big, \fBzoneskip\fR
+bytes are skipped.
.TP
.BI zonerange \fR=\fPint
-Give size of an I/O zone. See \fBzoneskip\fR.
+Size of a single zone in which I/O occurs.
+.TP
+.BI zonesize \fR=\fPint
+Number of bytes to transfer before skipping \fBzoneskip\fR bytes. If this
+parameter is smaller than \fBzonerange\fR then only a fraction of each zone
+with \fBzonerange\fR bytes will be accessed. If this parameter is larger than
+\fBzonerange\fR then each zone will be accessed multiple times before skipping
+to the next zone.
.TP
.BI zoneskip \fR=\fPint
-Skip the specified number of bytes when \fBzonesize\fR data has been
-read. The two zone options can be used to only do I/O on zones of a file.
+Skip the specified number of bytes after \fBzonesize\fR bytes of data have been
+transferred.
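Taken together, the three zone options describe a strided access pattern. A sketch of a job section using them (hypothetical sizes, chosen so each skip lands exactly at the start of the next zone):

```ini
# Read the first 64k of every 256k-wide zone in 4k blocks,
# then skip the remaining 192k to reach the next zone.
[zoned-read]
rw=read
bs=4k
zonerange=256k
zonesize=64k
zoneskip=192k
```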
+
.SS "I/O type"
.TP
.BI direct \fR=\fPbool
(RBD) via librbd without the need to use the kernel rbd driver. This
ioengine defines engine specific options.
.TP
+.B http
+I/O engine supporting GET/PUT requests over HTTP(S) with libcurl to
+a WebDAV or S3 endpoint. This ioengine defines engine specific options.
+
+This engine only supports direct IO of iodepth=1; you need to scale this
+via \fBnumjobs\fR. \fBblocksize\fR defines the size of the objects to be
+created.
+
+TRIM is translated to object deletion.
+.TP
.B gfapi
Using GlusterFS libgfapi sync interface to direct access to
GlusterFS volumes without having to go through FUSE. This ioengine
Read and write using mmap I/O to a file on a filesystem
mounted with DAX on a persistent memory device through the PMDK
libpmem library.
+.TP
+.B ime_psync
+Synchronous read and write using DDN's Infinite Memory Engine (IME). This
+engine is very basic and issues calls to IME whenever an I/O is queued.
+.TP
+.B ime_psyncv
+Synchronous read and write using DDN's Infinite Memory Engine (IME). This
+engine uses iovecs and will try to stack as many I/Os as possible (if the
+I/Os are "contiguous" and the I/O depth is not exceeded) before issuing a
+call to IME.
+.TP
+.B ime_aio
+Asynchronous read and write using DDN's Infinite Memory Engine (IME). This
+engine will try to stack as many I/Os as possible by creating requests for
+IME. FIO will then decide when to commit these requests.
.SS "I/O engine specific parameters"
In addition, there are some parameters which are only valid when a specific
\fBioengine\fR is in use. These are used identically to normal parameters,
Poll store instead of waiting for completion. Usually this provides better
throughput at cost of higher(up to 100%) CPU utilization.
.TP
+.BI (http)http_host \fR=\fPstr
+Hostname to connect to. For S3, this could be the bucket name. Default
+is \fBlocalhost\fR.
+.TP
+.BI (http)http_user \fR=\fPstr
+Username for HTTP authentication.
+.TP
+.BI (http)http_pass \fR=\fPstr
+Password for HTTP authentication.
+.TP
+.BI (http)https \fR=\fPstr
+Whether to use HTTPS instead of plain HTTP. \fBon\fR enables HTTPS;
+\fBinsecure\fR will enable HTTPS but disables SSL peer verification (use
+with caution!). Default is \fBoff\fR.
+.TP
+.BI (http)http_mode \fR=\fPstr
+Which HTTP access mode to use: webdav, swift, or s3. Default is
+\fBwebdav\fR.
+.TP
+.BI (http)http_s3_region \fR=\fPstr
+The S3 region/zone to include in the request. Default is \fBus-east-1\fR.
+.TP
+.BI (http)http_s3_key \fR=\fPstr
+The S3 secret key.
+.TP
+.BI (http)http_s3_keyid \fR=\fPstr
+The S3 key/access id.
+.TP
+.BI (http)http_swift_auth_token \fR=\fPstr
+The Swift auth token. See the example configuration file for instructions
+on how to retrieve this.
+.TP
+.BI (http)http_verbose \fR=\fPint
+Enable verbose requests from libcurl. Useful for debugging. 1 turns on
+verbose logging from libcurl, 2 additionally enables HTTP IO tracing.
+Default is \fB0\fR.
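As a rough sketch of how the options above combine, the following builds a command line for the http engine. The host and job name are invented for illustration; the option names themselves come from the descriptions above, and iodepth stays at 1 as the engine requires:

```python
def http_engine_args(host, mode="s3", region="us-east-1",
                     https="on", numjobs=4):
    """Assemble fio arguments for the http ioengine (illustrative only)."""
    return [
        "fio", "--name=http-test", "--ioengine=http",
        "--direct=1", "--iodepth=1",      # engine supports iodepth=1 only
        "--numjobs=%d" % numjobs,         # scale parallelism via numjobs
        "--http_host=%s" % host,
        "--http_mode=%s" % mode,
        "--http_s3_region=%s" % region,
        "--https=%s" % https,
    ]

args = http_engine_args("mybucket.example.com")
```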
+.TP
.BI (mtd)skip_bad \fR=\fPbool
Skip operations against known bad blocks.
.TP
.TP
.BI write_iops_log \fR=\fPstr
Same as \fBwrite_bw_log\fR, but writes an IOPS file (e.g.
-`name_iops.x.log') instead. See \fBwrite_bw_log\fR for
-details about the filename format and the \fBLOG FILE FORMATS\fR section for how data
-is structured within the file.
+`name_iops.x.log`) instead. Because fio defaults to individual
+I/O logging, the value entry in the IOPS log will be 1 unless windowed
+logging (see \fBlog_avg_msec\fR) has been enabled. See
+\fBwrite_bw_log\fR for details about the filename format and \fBLOG
+FILE FORMATS\fR for how data is structured within the file.
.TP
.BI log_avg_msec \fR=\fPint
By default, fio will log an entry in the iops, latency, or bw log for every
I/O is a TRIM
.RE
.P
-The entry's `block size' is always in bytes. The `offset' is the offset, in bytes,
-from the start of the file, for that particular I/O. The logging of the offset can be
+The entry's `block size' is always in bytes. The `offset' is the position in bytes
+from the start of the file for that particular I/O. The logging of the offset can be
toggled with \fBlog_offset\fR.
.P
-Fio defaults to logging every individual I/O. When IOPS are logged for individual
-I/Os the `value' entry will always be 1. If windowed logging is enabled through
-\fBlog_avg_msec\fR, fio logs the average values over the specified period of time.
-If windowed logging is enabled and \fBlog_max_value\fR is set, then fio logs
-maximum values in that window instead of averages. Since `data direction', `block size'
-and `offset' are per\-I/O values, if windowed logging is enabled they
-aren't applicable and will be 0.
+Fio defaults to logging every individual I/O, but when windowed logging is
+enabled through \fBlog_avg_msec\fR, either the average (by default) or the
+maximum (if \fBlog_max_value\fR is set) `value' seen over the specified
+period of time is recorded. Each `data direction' seen within the window
+period will aggregate its values in a separate row. Further, when using
+windowed logging the `block size' and `offset' entries will always contain 0.
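A small parser makes the per-I/O entry layout above concrete. The comma-separated field order (time in msec, value, data direction, block size, offset) and the direction codes 0/1/2 for read/write/trim are assumed from the surrounding text; the sample line is invented:

```python
def parse_log_entry(line):
    """Parse one per-I/O fio log entry (assumed field order)."""
    msec, value, ddir, bs, offset = (int(f) for f in line.split(","))
    return {"msec": msec, "value": value,
            "dir": {0: "read", 1: "write", 2: "trim"}[ddir],
            "block_size": bs, "offset": offset}

# with per-I/O (non-windowed) IOPS logging, `value' is always 1
entry = parse_log_entry("1000, 1, 0, 4096, 65536")
```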
.SH CLIENT / SERVER
Normally fio is invoked as a stand\-alone application on the machine where the
I/O workload should be generated. However, the backend and frontend of fio can
GtkTreeIter iter;
struct tm *tm;
time_t sec;
- char tmp[64], timebuf[80];
+ char tmp[64], timebuf[96];
sec = p->log_sec;
tm = localtime(&sec);
#define BLOCKS_PER_UNIT (1U << UNIT_SHIFT)
#define BLOCKS_PER_UNIT_MASK (BLOCKS_PER_UNIT - 1)
-#define firstfree_valid(b) ((b)->first_free != (uint64_t) -1)
-
static const unsigned long bit_masks[] = {
0x0000000000000000, 0x0000000000000001, 0x0000000000000003, 0x0000000000000007,
0x000000000000000f, 0x000000000000001f, 0x000000000000003f, 0x000000000000007f,
struct axmap {
unsigned int nr_levels;
struct axmap_level *levels;
- uint64_t first_free;
uint64_t nr_bits;
};
memset(al->map, 0, al->map_size * sizeof(unsigned long));
}
-
- axmap->first_free = 0;
}
void axmap_free(struct axmap *axmap)
return false;
}
-static bool axmap_clear_fn(struct axmap_level *al, unsigned long offset,
- unsigned int bit, void *unused)
-{
- if (!(al->map[offset] & (1UL << bit)))
- return true;
-
- al->map[offset] &= ~(1UL << bit);
- return false;
-}
-
-void axmap_clear(struct axmap *axmap, uint64_t bit_nr)
-{
- axmap_handler(axmap, bit_nr, axmap_clear_fn, NULL);
-
- if (bit_nr < axmap->first_free)
- axmap->first_free = bit_nr;
-}
-
struct axmap_set_data {
unsigned int nr_bits;
unsigned int set_bits;
{
unsigned int set_bits, nr_bits = data->nr_bits;
- if (axmap->first_free >= bit_nr &&
- axmap->first_free < bit_nr + data->nr_bits)
- axmap->first_free = -1ULL;
-
if (bit_nr > axmap->nr_bits)
return;
else if (bit_nr + nr_bits > axmap->nr_bits)
return false;
}
-static uint64_t axmap_find_first_free(struct axmap *axmap, unsigned int level,
- uint64_t index)
+/*
+ * Find the first free bit with an index of at least bit_nr. Return
+ * -1 if no free bit is found before the end of the map.
+ */
+static uint64_t axmap_find_first_free(struct axmap *axmap, uint64_t bit_nr)
{
- uint64_t ret = -1ULL;
- unsigned long j;
int i;
+ unsigned long temp;
+ unsigned int bit;
+ uint64_t offset, base_index, index;
+ struct axmap_level *al;
- /*
- * Start at the bottom, then converge towards first free bit at the top
- */
- for (i = level; i >= 0; i--) {
- struct axmap_level *al = &axmap->levels[i];
-
- if (index >= al->map_size)
- goto err;
-
- for (j = index; j < al->map_size; j++) {
- if (al->map[j] == -1UL)
- continue;
+ index = 0;
+ for (i = axmap->nr_levels - 1; i >= 0; i--) {
+ al = &axmap->levels[i];
- /*
- * First free bit here is our index into the first
- * free bit at the next higher level
- */
- ret = index = (j << UNIT_SHIFT) + ffz(al->map[j]);
- break;
+ /* Shift previously calculated index for next level */
+ index <<= UNIT_SHIFT;
+
+ /*
+ * Start from an index that's at least as large as the
+ * originally passed in bit number.
+ */
+ base_index = bit_nr >> (UNIT_SHIFT * i);
+ if (index < base_index)
+ index = base_index;
+
+ /* Get the offset and bit for this level */
+ offset = index >> UNIT_SHIFT;
+ bit = index & BLOCKS_PER_UNIT_MASK;
+
+ /*
+ * If the previous level had unused bits in its last
+ * word, the offset could be bigger than the map at
+ * this level. That means no free bits exist before the
+ * end of the map, so return -1.
+ */
+ if (offset >= al->map_size)
+ return -1ULL;
+
+ /* Check the first word starting with the specific bit */
+ temp = ~bit_masks[bit] & ~al->map[offset];
+ if (temp)
+ goto found;
+
+ /*
+ * No free bit in the first word, so iterate
+ * looking for a word with one or more free bits.
+ */
+ for (offset++; offset < al->map_size; offset++) {
+ temp = ~al->map[offset];
+ if (temp)
+ goto found;
}
- }
-
- if (ret < axmap->nr_bits)
- return ret;
-
-err:
- return (uint64_t) -1ULL;
-}
-
-static uint64_t axmap_first_free(struct axmap *axmap)
-{
- if (!firstfree_valid(axmap))
- axmap->first_free = axmap_find_first_free(axmap, axmap->nr_levels - 1, 0);
-
- return axmap->first_free;
-}
-
-struct axmap_next_free_data {
- unsigned int level;
- unsigned long offset;
- uint64_t bit;
-};
-static bool axmap_next_free_fn(struct axmap_level *al, unsigned long offset,
- unsigned int bit, void *__data)
-{
- struct axmap_next_free_data *data = __data;
- uint64_t mask = ~bit_masks[(data->bit + 1) & BLOCKS_PER_UNIT_MASK];
-
- if (!(mask & ~al->map[offset]))
- return false;
+ /* Did not find a free bit */
+ return -1ULL;
- if (al->map[offset] != -1UL) {
- data->level = al->level;
- data->offset = offset;
- return true;
+found:
+ /* Compute the index of the free bit just found */
+ index = (offset << UNIT_SHIFT) + ffz(~temp);
}
- data->bit = (data->bit + BLOCKS_PER_UNIT - 1) / BLOCKS_PER_UNIT;
- return false;
+ /* If found an unused bit in the last word of level 0, return -1 */
+ if (index >= axmap->nr_bits)
+ return -1ULL;
+
+ return index;
}
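The top-down descent in axmap_find_first_free() can be modelled compactly in Python. This sketch uses 4-bit words (the C code uses 64-bit words and ffz()) so that even a 64-bit map exercises three levels; it is a behavioural model, not a translation:

```python
SHIFT = 2                  # 4-bit words stand in for the C code's 64-bit words
WORD = 1 << SHIFT
FULL = (1 << WORD) - 1     # all bits in a word set

class Axmap:
    """Toy multi-level bitmap: a bit at level n is set once all WORD
    child bits below it at level n - 1 are set."""
    def __init__(self, nr_bits):
        self.nr_bits = nr_bits
        self.levels = []
        n = nr_bits
        while True:
            n = (n + WORD - 1) // WORD      # words at this level
            self.levels.append([0] * n)
            if n == 1:
                break

    def set(self, bit):
        for lvl in self.levels:
            w, b = bit >> SHIFT, bit & (WORD - 1)
            lvl[w] |= 1 << b
            if lvl[w] != FULL:
                break                       # word not full: stop cascading
            bit = w                         # otherwise set the parent bit

    def first_free(self, bit_nr):
        """First free bit with index >= bit_nr, or -1 (cf. the C loop)."""
        index = 0
        for i in range(len(self.levels) - 1, -1, -1):
            lvl = self.levels[i]
            index <<= SHIFT                 # refine the index per level
            index = max(index, bit_nr >> (SHIFT * i))
            w, b = index >> SHIFT, index & (WORD - 1)
            if w >= len(lvl):
                return -1                   # ran off the end of the map
            index = -1
            for word in range(w, len(lvl)):
                start = b if word == w else 0
                for pos in range(start, WORD):
                    if not (lvl[word] >> pos) & 1:
                        index = (word << SHIFT) + pos
                        break
                if index >= 0:
                    break
            if index < 0:
                return -1                   # no free bit at this level
        return index if index < self.nr_bits else -1

m = Axmap(64)
```

A free (zero) bit at an upper level marks a word lower down that still has free bits, which is what lets the search narrow in level by level instead of scanning the whole bottom level.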
/*
* 'bit_nr' is already set. Find the next free bit after this one.
+ * Return -1 if no free bits found.
*/
uint64_t axmap_next_free(struct axmap *axmap, uint64_t bit_nr)
{
- struct axmap_next_free_data data = { .level = -1U, .bit = bit_nr, };
uint64_t ret;
+ uint64_t next_bit = bit_nr + 1;
+ unsigned long temp;
+ uint64_t offset;
+ unsigned int bit;
- if (firstfree_valid(axmap) && bit_nr < axmap->first_free)
- return axmap->first_free;
+ if (bit_nr >= axmap->nr_bits)
+ return -1ULL;
- if (!axmap_handler(axmap, bit_nr, axmap_next_free_fn, &data))
- return axmap_first_free(axmap);
+ /* If at the end of the map, wrap-around */
+ if (next_bit == axmap->nr_bits)
+ next_bit = 0;
- assert(data.level != -1U);
+ offset = next_bit >> UNIT_SHIFT;
+ bit = next_bit & BLOCKS_PER_UNIT_MASK;
/*
- * In the rare case that the map is unaligned, we might end up
- * finding an offset that's beyond the valid end. For that case,
- * find the first free one, the map is practically full.
+ * As an optimization, do a quick check for a free bit
+ * in the current word at level 0. If not found, do
+ * a topdown search.
*/
- ret = axmap_find_first_free(axmap, data.level, data.offset);
- if (ret != -1ULL)
- return ret;
+ temp = ~bit_masks[bit] & ~axmap->levels[0].map[offset];
+ if (temp) {
+ ret = (offset << UNIT_SHIFT) + ffz(~temp);
+
+ /* Might have found an unused bit at level 0 */
+ if (ret >= axmap->nr_bits)
+ ret = -1ULL;
+ } else
+ ret = axmap_find_first_free(axmap, next_bit);
- return axmap_first_free(axmap);
+ /*
+ * If there are no free bits starting at next_bit and going
+ * to the end of the map, wrap around by searching again
+ * starting at bit 0.
+ */
+ if (ret == -1ULL && next_bit != 0)
+ ret = axmap_find_first_free(axmap, 0);
+ return ret;
}
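The wrap-around behaviour of axmap_next_free() is easier to see against a flat, single-level model. This sketch ignores the level structure entirely and just reproduces the externally visible semantics, including -1 for a full map and for out-of-range requests:

```python
class FlatMap:
    """Single-level stand-in for the axmap next-free semantics."""
    def __init__(self, nr_bits):
        self.nr_bits = nr_bits
        self.bits = set()                   # indices of set bits

    def set(self, n):
        self.bits.add(n)

    def _first_free(self, start):
        # linear scan standing in for the level-by-level search
        for i in range(start, self.nr_bits):
            if i not in self.bits:
                return i
        return -1

    def next_free(self, bit_nr):
        if bit_nr >= self.nr_bits:
            return -1                       # out-of-bounds request
        nxt = (bit_nr + 1) % self.nr_bits   # wrap at the end of the map
        ret = self._first_free(nxt)
        if ret == -1 and nxt != 0:
            ret = self._first_free(0)       # wrap around to bit 0
        return ret

m = FlatMap(128)
```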
struct axmap *axmap_new(unsigned long nr_bits);
void axmap_free(struct axmap *bm);
-void axmap_clear(struct axmap *axmap, uint64_t bit_nr);
void axmap_set(struct axmap *axmap, uint64_t bit_nr);
unsigned int axmap_set_nr(struct axmap *axmap, uint64_t bit_nr, unsigned int nr_bits);
bool axmap_isset(struct axmap *axmap, uint64_t bit_nr);
#include <stdio.h>
#include <stdarg.h>
+#include <unistd.h>
#include "lib/output_buffer.h"
__FIO_OPT_G_ACT,
__FIO_OPT_G_LATPROF,
__FIO_OPT_G_RBD,
+ __FIO_OPT_G_HTTP,
__FIO_OPT_G_GFAPI,
__FIO_OPT_G_MTD,
__FIO_OPT_G_HDFS,
FIO_OPT_G_ACT = (1ULL << __FIO_OPT_G_ACT),
FIO_OPT_G_LATPROF = (1ULL << __FIO_OPT_G_LATPROF),
FIO_OPT_G_RBD = (1ULL << __FIO_OPT_G_RBD),
+ FIO_OPT_G_HTTP = (1ULL << __FIO_OPT_G_HTTP),
FIO_OPT_G_GFAPI = (1ULL << __FIO_OPT_G_GFAPI),
FIO_OPT_G_MTD = (1ULL << __FIO_OPT_G_MTD),
FIO_OPT_G_HDFS = (1ULL << __FIO_OPT_G_HDFS),
return 0;
}
-static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
-{
- unsigned int i, j, sprev, aprev;
- uint64_t sprev_sz;
-
- td->zone_state_index[ddir] = malloc(sizeof(struct zone_split_index) * 100);
-
- sprev_sz = sprev = aprev = 0;
- for (i = 0; i < td->o.zone_split_nr[ddir]; i++) {
- struct zone_split *zsp = &td->o.zone_split[ddir][i];
-
- for (j = aprev; j < aprev + zsp->access_perc; j++) {
- struct zone_split_index *zsi = &td->zone_state_index[ddir][j];
-
- zsi->size_perc = sprev + zsp->size_perc;
- zsi->size_perc_prev = sprev;
-
- zsi->size = sprev_sz + zsp->size;
- zsi->size_prev = sprev_sz;
- }
-
- aprev += zsp->access_perc;
- sprev += zsp->size_perc;
- sprev_sz += zsp->size;
- }
-}
-
-/*
- * Generate state table for indexes, so we don't have to do it inline from
- * the hot IO path
- */
-static void td_zone_gen_index(struct thread_data *td)
-{
- int i;
-
- td->zone_state_index = malloc(DDIR_RWDIR_CNT *
- sizeof(struct zone_split_index *));
-
- for (i = 0; i < DDIR_RWDIR_CNT; i++)
- __td_zone_gen_index(td, i);
-}
-
static int parse_zoned_distribution(struct thread_data *td, const char *input,
bool absolute)
{
return ret;
}
- if (!ret)
- td_zone_gen_index(td);
- else {
+ if (ret) {
for (i = 0; i < DDIR_RWDIR_CNT; i++)
td->o.zone_split_nr[i] = 0;
}
},
#endif
+#ifdef CONFIG_IME
+ { .ival = "ime_psync",
+ .help = "DDN's IME synchronous IO engine",
+ },
+ { .ival = "ime_psyncv",
+ .help = "DDN's IME synchronous IO engine using iovecs",
+ },
+ { .ival = "ime_aio",
+ .help = "DDN's IME asynchronous IO engine",
+ },
+#endif
#ifdef CONFIG_LINUX_DEVDAX
{ .ival = "dev-dax",
.help = "DAX Device based IO engine",
{ .ival = "libpmem",
.help = "PMDK libpmem based IO engine",
},
+#endif
+#ifdef CONFIG_HTTP
+ { .ival = "http",
+ .help = "HTTP (WebDAV/S3) IO engine",
+ },
#endif
},
},
#ifndef CONFIG_HAVE_VASPRINTF
int vasprintf(char **strp, const char *fmt, va_list ap)
{
- va_list ap_copy;
- char *str;
- int len;
+ va_list ap_copy;
+ char *str;
+ int len;
#ifdef va_copy
- va_copy(ap_copy, ap);
+ va_copy(ap_copy, ap);
#else
- __va_copy(ap_copy, ap);
+ __va_copy(ap_copy, ap);
#endif
- len = vsnprintf(NULL, 0, fmt, ap_copy);
- va_end(ap_copy);
+ len = vsnprintf(NULL, 0, fmt, ap_copy);
+ va_end(ap_copy);
- if (len < 0)
- return len;
+ if (len < 0)
+ return len;
- len++;
- str = malloc(len);
- *strp = str;
- return str ? vsnprintf(str, len, fmt, ap) : -1;
+ len++;
+ str = malloc(len);
+ *strp = str;
+ return str ? vsnprintf(str, len, fmt, ap) : -1;
}
#endif
#ifndef CONFIG_HAVE_ASPRINTF
int asprintf(char **strp, const char *fmt, ...)
{
- va_list arg;
- int done;
+ va_list arg;
+ int done;
- va_start(arg, fmt);
- done = vasprintf(strp, fmt, arg);
- va_end(arg);
+ va_start(arg, fmt);
+ done = vasprintf(strp, fmt, arg);
+ va_end(arg);
- return done;
+ return done;
}
#endif
{
struct fio_lfsr lfsr;
struct axmap *map;
- size_t osize;
- uint64_t ff;
int err;
printf("Using %llu entries...", (unsigned long long) size);
lfsr_init(&lfsr, size, seed, seed & 0xF);
map = axmap_new(size);
- osize = size;
err = 0;
while (size--) {
if (err)
return err;
- ff = axmap_next_free(map, osize);
- if (ff != (uint64_t) -1ULL) {
- printf("axmap_next_free broken: got %llu\n", (unsigned long long) ff);
+ printf("pass!\n");
+ axmap_free(map);
+ return 0;
+}
+
+static int check_next_free(struct axmap *map, uint64_t start, uint64_t expected)
+{
+ uint64_t ff;
+
+ ff = axmap_next_free(map, start);
+ if (ff != expected) {
+ printf("axmap_next_free broken: Expected %llu, got %llu\n",
+ (unsigned long long)expected, (unsigned long long) ff);
return 1;
}
+ return 0;
+}
+
+static int test_next_free(size_t size, int seed)
+{
+ struct fio_lfsr lfsr;
+ struct axmap *map;
+ size_t osize;
+ uint64_t ff, lastfree;
+ int err, i;
+
+ printf("Test next_free %llu entries...", (unsigned long long) size);
+ fflush(stdout);
+
+ map = axmap_new(size);
+ err = 0;
+
+ /* Empty map. Next free after 0 should be 1. */
+ if (check_next_free(map, 0, 1))
+ err = 1;
+
+ /* Empty map. Next free after 63 should be 64. */
+ if (check_next_free(map, 63, 64))
+ err = 1;
+
+ /* Empty map. Next free after size - 2 should be size - 1 */
+ if (check_next_free(map, size - 2, size - 1))
+ err = 1;
+
+ /* Empty map. Next free after size - 1 should be 0 */
+ if (check_next_free(map, size - 1, 0))
+ err = 1;
+
+ /* Bit 63 set. Next free after 62 should be 64. */
+ axmap_set(map, 63);
+ if (check_next_free(map, 62, 64))
+ err = 1;
+
+ /* Last bit set. Next free after size - 2 should be 0. */
+ axmap_set(map, size - 1);
+ if (check_next_free(map, size - 2, 0))
+ err = 1;
+
+ /* Last bit set. Next free after size - 1 should be 0. */
+ if (check_next_free(map, size - 1, 0))
+ err = 1;
+
+	/* Last 65 bits set. Next free after size - 66 or size - 65 should be 0. */
+	for (i = size - 65; i < size; i++)
+		axmap_set(map, i);
+	if (check_next_free(map, size - 66, 0))
+		err = 1;
+	if (check_next_free(map, size - 65, 0))
+		err = 1;
+
+	/* Last 65 bits set. Next free after size - 67 should be size - 66. */
+	if (check_next_free(map, size - 67, size - 66))
+		err = 1;
+
+ axmap_free(map);
+
+ /* Start with a fresh map and mostly fill it up */
+ lfsr_init(&lfsr, size, seed, seed & 0xF);
+ map = axmap_new(size);
+ osize = size;
+
+ /* Leave 1 entry free */
+ size--;
+ while (size--) {
+ uint64_t val;
+
+ if (lfsr_next(&lfsr, &val)) {
+ printf("lfsr: short loop\n");
+ err = 1;
+ break;
+ }
+ if (axmap_isset(map, val)) {
+ printf("bit already set\n");
+ err = 1;
+ break;
+ }
+ axmap_set(map, val);
+ if (!axmap_isset(map, val)) {
+ printf("bit not set\n");
+ err = 1;
+ break;
+ }
+ }
+
+ /* Get last free bit */
+ lastfree = axmap_next_free(map, 0);
+ if (lastfree == -1ULL) {
+ printf("axmap_next_free broken: Couldn't find last free bit\n");
+ err = 1;
+ }
+
+ /* Start with last free bit and test wrap-around */
+ ff = axmap_next_free(map, lastfree);
+ if (ff != lastfree) {
+ printf("axmap_next_free broken: wrap-around test #1 failed\n");
+ err = 1;
+ }
+
+ /* Start with last bit and test wrap-around */
+ ff = axmap_next_free(map, osize - 1);
+ if (ff != lastfree) {
+ printf("axmap_next_free broken: wrap-around test #2 failed\n");
+ err = 1;
+ }
+
+ /* Set last free bit */
+ axmap_set(map, lastfree);
+ ff = axmap_next_free(map, 0);
+ if (ff != -1ULL) {
+ printf("axmap_next_free broken: Expected -1 from full map\n");
+ err = 1;
+ }
+
+ ff = axmap_next_free(map, osize);
+ if (ff != -1ULL) {
+ printf("axmap_next_free broken: Expected -1 from out of bounds request\n");
+ err = 1;
+ }
+
+ if (err)
+ return err;
printf("pass!\n");
axmap_free(map);
return 3;
if (test_overlap())
return 4;
+ if (test_next_free(size, seed))
+ return 5;
+
+ /* Test 3 levels, all full: 64*64*64 */
+ if (test_next_free(64*64*64, seed))
+ return 6;
+
+ /* Test 4 levels, with 2 inner levels not full */
+ if (test_next_free(((((64*64)-63)*64)-63)*64*12, seed))
+ return 7;
return 0;
}
--- /dev/null
+#!/usr/bin/python2.7
+# Note: this script is python2 and python 3 compatible.
+#
+# steadystate_tests.py
+#
+# Test option parsing and functionality for fio's steady state detection feature.
+#
+# steadystate_tests.py --read file-for-read-testing --write file-for-write-testing ./fio
+#
+# REQUIREMENTS
+# Python 2.6+
+# SciPy
+#
+# KNOWN ISSUES
+# only option parsing and read tests are carried out
+# On Windows this script works under Cygwin but not from cmd.exe
+# On Windows I encounter frequent fio problems generating JSON output (nothing to decode)
+# min runtime:
+# if ss attained: min runtime = ss_dur + ss_ramp
+# if not attained: runtime = timeout
+
+from __future__ import absolute_import
+from __future__ import print_function
+import os
+import sys
+import json
+import uuid
+import pprint
+import argparse
+import subprocess
+from scipy import stats
+from six.moves import range
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument('fio',
+ help='path to fio executable')
+ parser.add_argument('--read',
+ help='target for read testing')
+ parser.add_argument('--write',
+ help='target for write testing')
+ args = parser.parse_args()
+
+ return args
+
+
+def check(data, iops, slope, pct, limit, dur, criterion):
+ measurement = 'iops' if iops else 'bw'
+ data = data[measurement]
+ mean = sum(data) / len(data)
+ if slope:
+ x = list(range(len(data)))
+ m, intercept, r_value, p_value, std_err = stats.linregress(x,data)
+ m = abs(m)
+ if pct:
+ target = m / mean * 100
+ criterion = criterion[:-1]
+ else:
+ target = m
+ else:
+ maxdev = 0
+ for x in data:
+ maxdev = max(abs(mean-x), maxdev)
+ if pct:
+ target = maxdev / mean * 100
+ criterion = criterion[:-1]
+ else:
+ target = maxdev
+
+ criterion = float(criterion)
+ return (abs(target - criterion) / criterion < 0.005), target < limit, mean, target
+
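The pct/slope branch of check() above can be reproduced without SciPy. This is a minimal least-squares sketch of the same criterion (sample index as x, measurement as y); the input series are synthetic:

```python
def slope_pct(data):
    """Absolute least-squares slope of `data' as a percentage of its mean,
    mirroring check()'s slope + pct branch."""
    n = len(data)
    mean_x = (n - 1) / 2.0                 # x values are 0..n-1
    mean_y = sum(data) / float(n)
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(data))
    den = sum((i - mean_x) ** 2 for i in range(n))
    m = abs(num / den)                     # same as abs(linregress slope)
    return m / mean_y * 100.0

# perfectly flat series: slope is 0%, so any ss_limit is met
flat = slope_pct([1000.0] * 10)
# series rising by 10 IOPS per sample around a mean of 1045
rising = slope_pct([1000.0 + 10 * i for i in range(10)])
```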
+
+if __name__ == '__main__':
+ args = parse_args()
+
+ pp = pprint.PrettyPrinter(indent=4)
+
+#
+# test option parsing
+#
+ parsing = [ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:10", "--ss_ramp=5"],
+ 'output': "set steady state IOPS threshold to 10.000000" },
+ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:10%", "--ss_ramp=5"],
+ 'output': "set steady state threshold to 10.000000%" },
+ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:.1%", "--ss_ramp=5"],
+ 'output': "set steady state threshold to 0.100000%" },
+ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:10%", "--ss_ramp=5"],
+ 'output': "set steady state threshold to 10.000000%" },
+ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:.1%", "--ss_ramp=5"],
+ 'output': "set steady state threshold to 0.100000%" },
+ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:12", "--ss_ramp=5"],
+ 'output': "set steady state BW threshold to 12" },
+ ]
+ for test in parsing:
+ output = subprocess.check_output([args.fio] + test['args'])
+ if test['output'] in output.decode():
+ print("PASSED '{0}' found with arguments {1}".format(test['output'], test['args']))
+ else:
+ print("FAILED '{0}' NOT found with arguments {1}".format(test['output'], test['args']))
+
+#
+# test some read workloads
+#
+# if ss active and attained,
+# check that runtime is less than job time
+# check criteria
+# how to check ramp time?
+#
+# if ss inactive
+# check that runtime is what was specified
+#
+ reads = [ {'s': True, 'timeout': 100, 'numjobs': 1, 'ss_dur': 5, 'ss_ramp': 3, 'iops': True, 'slope': True, 'ss_limit': 0.1, 'pct': True},
+ {'s': False, 'timeout': 20, 'numjobs': 2},
+ {'s': True, 'timeout': 100, 'numjobs': 3, 'ss_dur': 10, 'ss_ramp': 5, 'iops': False, 'slope': True, 'ss_limit': 0.1, 'pct': True},
+ {'s': True, 'timeout': 10, 'numjobs': 3, 'ss_dur': 10, 'ss_ramp': 500, 'iops': False, 'slope': True, 'ss_limit': 0.1, 'pct': True},
+ ]
+
+    if args.read is None:
+ if os.name == 'posix':
+ args.read = '/dev/zero'
+ extra = [ "--size=134217728" ] # 128 MiB
+ else:
+ print("ERROR: file for read testing must be specified on non-posix systems")
+ sys.exit(1)
+ else:
+ extra = []
+
+ jobnum = 0
+ for job in reads:
+
+ tf = uuid.uuid4().hex
+ parameters = [ "--name=job{0}".format(jobnum) ]
+ parameters.extend(extra)
+ parameters.extend([ "--thread",
+ "--output-format=json",
+ "--output={0}".format(tf),
+ "--filename={0}".format(args.read),
+ "--rw=randrw",
+ "--rwmixread=100",
+ "--stonewall",
+ "--group_reporting",
+ "--numjobs={0}".format(job['numjobs']),
+ "--time_based",
+ "--runtime={0}".format(job['timeout']) ])
+ if job['s']:
+ if job['iops']:
+ ss = 'iops'
+ else:
+ ss = 'bw'
+ if job['slope']:
+ ss += "_slope"
+ ss += ":" + str(job['ss_limit'])
+ if job['pct']:
+ ss += '%'
+ parameters.extend([ '--ss_dur={0}'.format(job['ss_dur']),
+ '--ss={0}'.format(ss),
+ '--ss_ramp={0}'.format(job['ss_ramp']) ])
+
+ output = subprocess.call([args.fio] + parameters)
+ with open(tf, 'r') as source:
+ jsondata = json.loads(source.read())
+ os.remove(tf)
+
+ for jsonjob in jsondata['jobs']:
+ line = "job {0}".format(jsonjob['job options']['name'])
+ if job['s']:
+ if jsonjob['steadystate']['attained'] == 1:
+ # check runtime >= ss_dur + ss_ramp, check criterion, check criterion < limit
+ mintime = (job['ss_dur'] + job['ss_ramp']) * 1000
+ actual = jsonjob['read']['runtime']
+ if mintime > actual:
+ line = 'FAILED ' + line + ' ss attained, runtime {0} < ss_dur {1} + ss_ramp {2}'.format(actual, job['ss_dur'], job['ss_ramp'])
+ else:
+ line = line + ' ss attained, runtime {0} > ss_dur {1} + ss_ramp {2},'.format(actual, job['ss_dur'], job['ss_ramp'])
+ objsame, met, mean, target = check(data=jsonjob['steadystate']['data'],
+ iops=job['iops'],
+ slope=job['slope'],
+ pct=job['pct'],
+ limit=job['ss_limit'],
+ dur=job['ss_dur'],
+ criterion=jsonjob['steadystate']['criterion'])
+ if not objsame:
+ line = 'FAILED ' + line + ' fio criterion {0} != calculated criterion {1} '.format(jsonjob['steadystate']['criterion'], target)
+ else:
+ if met:
+ line = 'PASSED ' + line + ' target {0} < limit {1}'.format(target, job['ss_limit'])
+ else:
+ line = 'FAILED ' + line + ' target {0} < limit {1} but fio reports ss not attained '.format(target, job['ss_limit'])
+ else:
+ # check runtime, confirm criterion calculation, and confirm that criterion was not met
+ expected = job['timeout'] * 1000
+ actual = jsonjob['read']['runtime']
+ if abs(expected - actual) > 10:
+ line = 'FAILED ' + line + ' ss not attained, expected runtime {0} != actual runtime {1}'.format(expected, actual)
+ else:
+ line = line + ' ss not attained, runtime {0} != ss_dur {1} + ss_ramp {2},'.format(actual, job['ss_dur'], job['ss_ramp'])
+ objsame, met, mean, target = check(data=jsonjob['steadystate']['data'],
+ iops=job['iops'],
+ slope=job['slope'],
+ pct=job['pct'],
+ limit=job['ss_limit'],
+ dur=job['ss_dur'],
+ criterion=jsonjob['steadystate']['criterion'])
+ if not objsame:
+ if actual > (job['ss_dur'] + job['ss_ramp'])*1000:
+ line = 'FAILED ' + line + ' fio criterion {0} != calculated criterion {1} '.format(jsonjob['steadystate']['criterion'], target)
+ else:
+ line = 'PASSED ' + line + ' fio criterion {0} == 0.0 since ss_dur + ss_ramp has not elapsed '.format(jsonjob['steadystate']['criterion'])
+ else:
+ if met:
+ line = 'FAILED ' + line + ' target {0} < threshold {1} but fio reports ss not attained '.format(target, job['ss_limit'])
+ else:
+ line = 'PASSED ' + line + ' criterion {0} > threshold {1}'.format(target, job['ss_limit'])
+ else:
+ expected = job['timeout'] * 1000
+ actual = jsonjob['read']['runtime']
+ if abs(expected - actual) < 10:
+ result = 'PASSED '
+ else:
+ result = 'FAILED '
+ line = result + line + ' no ss, expected runtime {0} ~= actual runtime {1}'.format(expected, actual)
+ print(line)
+ if 'steadystate' in jsonjob:
+ pp.pprint(jsonjob['steadystate'])
+ jobnum += 1
[Unit]
-
-Description=flexible I/O tester server
+Description=Flexible I/O tester server
After=network.target
[Service]
-
Type=simple
-PIDFile=/run/fio.pid
ExecStart=/usr/bin/fio --server
+
+[Install]
+WantedBy=multi-user.target
+++ /dev/null
-#!/usr/bin/python2.7
-# Note: this script is python2 and python 3 compatible.
-#
-# steadystate_tests.py
-#
-# Test option parsing and functonality for fio's steady state detection feature.
-#
-# steadystate_tests.py --read file-for-read-testing --write file-for-write-testing ./fio
-#
-# REQUIREMENTS
-# Python 2.6+
-# SciPy
-#
-# KNOWN ISSUES
-# only option parsing and read tests are carried out
-# On Windows this script works under Cygwin but not from cmd.exe
-# On Windows I encounter frequent fio problems generating JSON output (nothing to decode)
-# min runtime:
-# if ss attained: min runtime = ss_dur + ss_ramp
-# if not attained: runtime = timeout
-
-from __future__ import absolute_import
-from __future__ import print_function
-import os
-import sys
-import json
-import uuid
-import pprint
-import argparse
-import subprocess
-from scipy import stats
-from six.moves import range
-
-def parse_args():
- parser = argparse.ArgumentParser()
- parser.add_argument('fio',
- help='path to fio executable')
- parser.add_argument('--read',
- help='target for read testing')
- parser.add_argument('--write',
- help='target for write testing')
- args = parser.parse_args()
-
- return args
-
-
-def check(data, iops, slope, pct, limit, dur, criterion):
- measurement = 'iops' if iops else 'bw'
- data = data[measurement]
- mean = sum(data) / len(data)
- if slope:
- x = list(range(len(data)))
- m, intercept, r_value, p_value, std_err = stats.linregress(x,data)
- m = abs(m)
- if pct:
- target = m / mean * 100
- criterion = criterion[:-1]
- else:
- target = m
- else:
- maxdev = 0
- for x in data:
- maxdev = max(abs(mean-x), maxdev)
- if pct:
- target = maxdev / mean * 100
- criterion = criterion[:-1]
- else:
- target = maxdev
-
- criterion = float(criterion)
- return (abs(target - criterion) / criterion < 0.005), target < limit, mean, target
-
-
-if __name__ == '__main__':
- args = parse_args()
-
- pp = pprint.PrettyPrinter(indent=4)
-
-#
-# test option parsing
-#
- parsing = [ { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:10", "--ss_ramp=5"],
- 'output': "set steady state IOPS threshold to 10.000000" },
- { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:10%", "--ss_ramp=5"],
- 'output': "set steady state threshold to 10.000000%" },
- { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=iops:.1%", "--ss_ramp=5"],
- 'output': "set steady state threshold to 0.100000%" },
- { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:10%", "--ss_ramp=5"],
- 'output': "set steady state threshold to 10.000000%" },
- { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:.1%", "--ss_ramp=5"],
- 'output': "set steady state threshold to 0.100000%" },
- { 'args': ["--parse-only", "--debug=parse", "--ss_dur=10s", "--ss=bw:12", "--ss_ramp=5"],
- 'output': "set steady state BW threshold to 12" },
- ]
- for test in parsing:
- output = subprocess.check_output([args.fio] + test['args'])
- if test['output'] in output.decode():
- print("PASSED '{0}' found with arguments {1}".format(test['output'], test['args']))
- else:
- print("FAILED '{0}' NOT found with arguments {1}".format(test['output'], test['args']))
-
-#
-# test some read workloads
-#
-# if ss active and attained,
-# check that runtime is less than job time
-# check criteria
-# how to check ramp time?
-#
-# if ss inactive
-# check that runtime is what was specified
-#
- reads = [ {'s': True, 'timeout': 100, 'numjobs': 1, 'ss_dur': 5, 'ss_ramp': 3, 'iops': True, 'slope': True, 'ss_limit': 0.1, 'pct': True},
- {'s': False, 'timeout': 20, 'numjobs': 2},
- {'s': True, 'timeout': 100, 'numjobs': 3, 'ss_dur': 10, 'ss_ramp': 5, 'iops': False, 'slope': True, 'ss_limit': 0.1, 'pct': True},
- {'s': True, 'timeout': 10, 'numjobs': 3, 'ss_dur': 10, 'ss_ramp': 500, 'iops': False, 'slope': True, 'ss_limit': 0.1, 'pct': True},
- ]
-
- if args.read == None:
- if os.name == 'posix':
- args.read = '/dev/zero'
- extra = [ "--size=134217728" ] # 128 MiB
- else:
- print("ERROR: file for read testing must be specified on non-posix systems")
- sys.exit(1)
- else:
- extra = []
-
- jobnum = 0
- for job in reads:
-
- tf = uuid.uuid4().hex
- parameters = [ "--name=job{0}".format(jobnum) ]
- parameters.extend(extra)
- parameters.extend([ "--thread",
- "--output-format=json",
- "--output={0}".format(tf),
- "--filename={0}".format(args.read),
- "--rw=randrw",
- "--rwmixread=100",
- "--stonewall",
- "--group_reporting",
- "--numjobs={0}".format(job['numjobs']),
- "--time_based",
- "--runtime={0}".format(job['timeout']) ])
- if job['s']:
- if job['iops']:
- ss = 'iops'
- else:
- ss = 'bw'
- if job['slope']:
- ss += "_slope"
- ss += ":" + str(job['ss_limit'])
- if job['pct']:
- ss += '%'
- parameters.extend([ '--ss_dur={0}'.format(job['ss_dur']),
- '--ss={0}'.format(ss),
- '--ss_ramp={0}'.format(job['ss_ramp']) ])
-
- output = subprocess.call([args.fio] + parameters)
- with open(tf, 'r') as source:
- jsondata = json.loads(source.read())
- os.remove(tf)
-
- for jsonjob in jsondata['jobs']:
- line = "job {0}".format(jsonjob['job options']['name'])
- if job['s']:
- if jsonjob['steadystate']['attained'] == 1:
- # check runtime >= ss_dur + ss_ramp, check criterion, check criterion < limit
- mintime = (job['ss_dur'] + job['ss_ramp']) * 1000
- actual = jsonjob['read']['runtime']
- if mintime > actual:
- line = 'FAILED ' + line + ' ss attained, runtime {0} < ss_dur {1} + ss_ramp {2}'.format(actual, job['ss_dur'], job['ss_ramp'])
- else:
- line = line + ' ss attained, runtime {0} > ss_dur {1} + ss_ramp {2},'.format(actual, job['ss_dur'], job['ss_ramp'])
- objsame, met, mean, target = check(data=jsonjob['steadystate']['data'],
- iops=job['iops'],
- slope=job['slope'],
- pct=job['pct'],
- limit=job['ss_limit'],
- dur=job['ss_dur'],
- criterion=jsonjob['steadystate']['criterion'])
- if not objsame:
- line = 'FAILED ' + line + ' fio criterion {0} != calculated criterion {1} '.format(jsonjob['steadystate']['criterion'], target)
- else:
- if met:
- line = 'PASSED ' + line + ' target {0} < limit {1}'.format(target, job['ss_limit'])
- else:
- line = 'FAILED ' + line + ' target {0} < limit {1} but fio reports ss not attained '.format(target, job['ss_limit'])
- else:
- # check runtime, confirm criterion calculation, and confirm that criterion was not met
- expected = job['timeout'] * 1000
- actual = jsonjob['read']['runtime']
- if abs(expected - actual) > 10:
- line = 'FAILED ' + line + ' ss not attained, expected runtime {0} != actual runtime {1}'.format(expected, actual)
- else:
- line = line + ' ss not attained, runtime {0} != ss_dur {1} + ss_ramp {2},'.format(actual, job['ss_dur'], job['ss_ramp'])
- objsame, met, mean, target = check(data=jsonjob['steadystate']['data'],
- iops=job['iops'],
- slope=job['slope'],
- pct=job['pct'],
- limit=job['ss_limit'],
- dur=job['ss_dur'],
- criterion=jsonjob['steadystate']['criterion'])
- if not objsame:
- if actual > (job['ss_dur'] + job['ss_ramp'])*1000:
- line = 'FAILED ' + line + ' fio criterion {0} != calculated criterion {1} '.format(jsonjob['steadystate']['criterion'], target)
- else:
- line = 'PASSED ' + line + ' fio criterion {0} == 0.0 since ss_dur + ss_ramp has not elapsed '.format(jsonjob['steadystate']['criterion'])
- else:
- if met:
- line = 'FAILED ' + line + ' target {0} < threshold {1} but fio reports ss not attained '.format(target, job['ss_limit'])
- else:
- line = 'PASSED ' + line + ' criterion {0} > threshold {1}'.format(target, job['ss_limit'])
- else:
- expected = job['timeout'] * 1000
- actual = jsonjob['read']['runtime']
- if abs(expected - actual) < 10:
- result = 'PASSED '
- else:
- result = 'FAILED '
- line = result + line + ' no ss, expected runtime {0} ~= actual runtime {1}'.format(expected, actual)
- print(line)
- if 'steadystate' in jsonjob:
- pp.pprint(jsonjob['steadystate'])
- jobnum += 1
--- /dev/null
+#include <stdlib.h>
+#include "fio.h"
+#include "zone-dist.h"
+
+/*
+ * Build the 100-entry percentile lookup table for one data direction:
+ * entry N covers the Nth access percentile and records the cumulative
+ * size range (both in percent and in bytes) of the zone it falls in.
+ */
+static void __td_zone_gen_index(struct thread_data *td, enum fio_ddir ddir)
+{
+ unsigned int i, j, sprev, aprev;
+ uint64_t sprev_sz;
+
+ td->zone_state_index[ddir] = malloc(sizeof(struct zone_split_index) * 100);
+
+ sprev_sz = sprev = aprev = 0;
+ for (i = 0; i < td->o.zone_split_nr[ddir]; i++) {
+ struct zone_split *zsp = &td->o.zone_split[ddir][i];
+
+ for (j = aprev; j < aprev + zsp->access_perc; j++) {
+ struct zone_split_index *zsi = &td->zone_state_index[ddir][j];
+
+ zsi->size_perc = sprev + zsp->size_perc;
+ zsi->size_perc_prev = sprev;
+
+ zsi->size = sprev_sz + zsp->size;
+ zsi->size_prev = sprev_sz;
+ }
+
+ aprev += zsp->access_perc;
+ sprev += zsp->size_perc;
+ sprev_sz += zsp->size;
+ }
+}
+
+static bool has_zones(struct thread_data *td)
+{
+ int i, zones = 0;
+
+ for (i = 0; i < DDIR_RWDIR_CNT; i++)
+ zones += td->o.zone_split_nr[i];
+
+ return zones != 0;
+}
+
+/*
+ * Generate state table for indexes, so we don't have to do it inline from
+ * the hot IO path
+ */
+void td_zone_gen_index(struct thread_data *td)
+{
+ int i;
+
+ if (!has_zones(td))
+ return;
+
+ td->zone_state_index = malloc(DDIR_RWDIR_CNT *
+ sizeof(struct zone_split_index *));
+
+ for (i = 0; i < DDIR_RWDIR_CNT; i++)
+ __td_zone_gen_index(td, i);
+}
+
+void td_zone_free_index(struct thread_data *td)
+{
+ int i;
+
+ if (!td->zone_state_index)
+ return;
+
+ for (i = 0; i < DDIR_RWDIR_CNT; i++) {
+ free(td->zone_state_index[i]);
+ td->zone_state_index[i] = NULL;
+ }
+
+ free(td->zone_state_index);
+ td->zone_state_index = NULL;
+}
--- /dev/null
+#ifndef FIO_ZONE_DIST_H
+#define FIO_ZONE_DIST_H
+
+void td_zone_gen_index(struct thread_data *td);
+void td_zone_free_index(struct thread_data *td);
+
+#endif