There are three main performance metrics when measuring storage system performance:
Latency, which is the time from operation initiation to completion.
IOPS, which is the number of I/O operations that the system can process per second.
Bandwidth, which is the amount of data that the system can transfer per second.
Each of these performance metrics applies to read operations, write operations or a mixture of read and write operations.
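The metrics are related: for a fixed block size, bandwidth is IOPS multiplied by the block size. The following is a minimal arithmetic sketch, not part of FIO or WekaIO. The helper name `iops_to_mibps` is hypothetical, and block sizes are assumed to be binary units (1 KiB = 1024 bytes). The sample figure is the 4 KB read IOPS result quoted later on this page.

```shell
# Convert an IOPS figure and a block size (in KiB) to bandwidth in MiB/s.
# iops_to_mibps is a hypothetical helper name used only for illustration.
iops_to_mibps() {
    awk -v iops="$1" -v bs_kib="$2" 'BEGIN { printf "%.1f\n", iops * bs_kib / 1024 }'
}

# 127,402 IOPS at 4 KiB per operation is roughly 497.7 MiB/s.
iops_to_mibps 127402 4
```

This is why small-block tests are reported in IOPS and large-block tests in bandwidth: at 4 KB, even a high IOPS figure moves comparatively little data.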
When measuring WekaIO system performance, different mount modes produce different performance characteristics. In addition, the client network configuration (user-space networking or kernel UDP) has a significant effect on performance.
Note: All performance tests listed here are generic and not specific to the WekaIO system. They can be used to compare the WekaIO storage system to other storage systems or to a local storage device.
Note: Single-client performance differs from aggregate performance. When the tests listed below are run from one client, that client limits the performance of the test. In general, several clients are required to maximize the performance of a WekaIO cluster.
The FIO utility is a generic open-source storage performance testing tool, whose jobs can be defined as described in the FIO documentation. This documentation assumes FIO version 3.5.
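Before running the tests, it can help to confirm that the installed FIO is at least the assumed version. The following is a sketch under two assumptions: that `fio --version` prints a string of the form `fio-3.5`, and that GNU `sort` with the `-V` (version sort) option is available. The helper name `version_ge` is hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: `fio --version` prints something like "fio-3.5".
if command -v fio >/dev/null 2>&1; then
    installed=$(fio --version | sed 's/^fio-//')
    if version_ge "$installed" 3.5; then
        echo "fio $installed is at least 3.5"
    else
        echo "fio $installed is older than 3.5"
    fi
else
    echo "fio is not installed"
fi
```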
The following script counts the number of cores (n) in the client, creates n-2 directories, and lays out one 10 GB file in each directory. Depending on the number of client cores, this may take a relatively long time because FIO lays out the files sequentially.
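# Mount point and per-client working directory.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=Create_Files
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
# Job count is the CPU count minus 2; match only the "CPU(s):" line of lscpu.
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
# One directory per FIO job, numbered 0 to JOBS-1 to match $jobnum.
mkdir -p $WORKING_DIR && cd $WORKING_DIR && seq 0 $((JOBS - 1)) | xargs mkdir -p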
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --rw=rw --rwmixread=0 \
--blocksize=1m --numjobs=$JOBS --create_only=1
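Because the layout step writes one 10 GB file per job, the space it needs grows with the client core count. The following sketch estimates that space and compares it with the free space on the mount. The helper name `required_gib` is hypothetical, the 16-CPU client is an invented example, and each 10 GB file is treated as roughly 10 GiB.

```shell
# Space needed for the layout step: one file per job.
# required_gib is a hypothetical helper; 10 matches --size=10GB.
required_gib() {
    n_jobs=$1
    file_gib=${2:-10}
    echo $((n_jobs * file_gib))
}

# Example: a client with 16 CPUs runs 14 jobs and needs about 140 GiB.
need=$(required_gib 14)
echo "Layout needs about ${need} GiB"

# Compare with the free space on the mount point, if it exists.
# df -Pk reports sizes in 1024-byte blocks; column 4 is the available space.
mount=${WEKA_MOUNT:-/mnt/weka}
if [ -d "$mount" ]; then
    avail_kib=$(df -Pk "$mount" | awk 'NR == 2 { print $4 }')
    avail_gib=$((avail_kib / 1024 / 1024))
    [ "$avail_gib" -ge "$need" ] || echo "Warning: only ${avail_gib} GiB free on $mount"
fi
```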
This test measures the client throughput for large (1 MB) reads. The scripts below will try to maximize the read throughput from a single client. The test utilizes multiple threads, each one performing 1 MB reads.
Note: If the client uses a 100 Gbps NIC or above, mounting the WekaIO filesystem with more than one core is required to maximize client throughput.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=FioReads1MMultiThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=100 --blocksize=1m \
--numjobs=$JOBS
Note: Different hardware and networking configurations may yield different latency results, which can be as low as 150 microseconds for 100 Gbit networking and NVMe drives.
In this test output example, results show a bandwidth of 2.8 Gigabytes/second.
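To relate a bandwidth result to the NIC note above, the figure can be converted from gigabytes per second to gigabits per second. The following is a rough sketch that ignores protocol overhead. The helper names are hypothetical, and the NIC speed of the client in the example output is not stated, so the 100 Gbps value is only an assumed figure taken from that note.

```shell
# Convert throughput in gigabytes/second to gigabits/second (multiply by 8).
# Both helper names are hypothetical and used only for illustration.
gbytes_to_gbits() {
    awk -v gbs="$1" 'BEGIN { printf "%.1f\n", gbs * 8 }'
}

# Share of a NIC's nominal line rate in use, as a whole percentage.
nic_utilization_pct() {
    awk -v gbs="$1" -v nic_gbps="$2" 'BEGIN { printf "%.0f\n", gbs * 8 / nic_gbps * 100 }'
}

gbytes_to_gbits 2.8           # 2.8 GB/s is 22.4 Gbps
nic_utilization_pct 2.8 100   # about 22 percent of an assumed 100 Gbps NIC
```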
This test measures the ability of the client to deliver concurrent 4 KB reads. The following scripts will try to maximize the system read IOPS from a single client. The test utilizes multiple threads, each one performing 4 KB reads.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=IOPSRead4KMultiThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
Note: To maximize system throughput, multiple clients are required in most cases.
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=100 --blocksize=4k \
--numjobs=$JOBS
In this test output example, results show an average IOPS of 127,402.
This test measures the minimal achievable read latency under a light load. The test measures the latency over a single-threaded sequence of 4 KB reads across multiple files. Each read is executed only after the previous read has been served.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=FioReads4KSingleThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=1
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=100 --blocksize=4k \
--numjobs=$JOBS
In this test output example, results show an average latency of 224 microseconds, where 99.5% of the reads completed in 338 microseconds or less.
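Because this test keeps a single I/O in flight (one job with an I/O depth of 1), the average latency also bounds the IOPS that one thread can reach: roughly one second divided by the latency. The following sketch shows that arithmetic. The helper name `latency_to_iops` is hypothetical, and the optional second argument (the number of I/Os in flight) is an illustrative generalization rather than something the test itself reports.

```shell
# Approximate IOPS from average latency in microseconds.
# The optional second argument is the number of I/Os kept in flight
# (jobs multiplied by I/O depth); it defaults to 1, as in this test.
# latency_to_iops is a hypothetical helper used only for illustration.
latency_to_iops() {
    awk -v lat_us="$1" -v inflight="${2:-1}" \
        'BEGIN { printf "%d\n", inflight * 1000000 / lat_us }'
}

# A 224 microsecond average latency allows about 4464 IOPS per thread.
latency_to_iops 224
```

This also explains why the multi-threaded IOPS tests use many jobs: with a fixed per-operation latency, total IOPS rises only by keeping more operations in flight.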
The following is an example of the test output for an AWS WekaIO cluster with 6 instances, type i3.16xlarge.
This test measures the client throughput for large (1 MB) writes. The scripts below will try to maximize the write throughput from a single client. The test utilizes multiple threads, each one performing 1 MB writes.
Note: If the client uses a 100 Gbps NIC or above, mounting the WekaIO filesystem with more than one core is required to maximize client throughput.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=FioWrite1MMultiThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=0 --blocksize=1m \
--numjobs=$JOBS
In this test output example, results show a bandwidth of 2.8 Gigabytes/second.
This test measures the ability of the client to deliver concurrent 4 KB writes. The following scripts try to maximize the system write IOPS from a single client. The test utilizes multiple threads, each one performing 4 KB writes.
Note: To maximize system throughput, multiple clients are required in most cases.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=IOPSWrite4KMultiThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=0 --blocksize=4k \
--numjobs=$JOBS
In this test output example, results show an average IOPS of 127,402.
This test measures the minimal achievable write latency under a light load. The test measures the latency over a single-threaded sequence of 4 KB writes across multiple files. Each write is executed only after the previous write has been served.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export BENCHMARK_ID=FioWrites4KSingleThread
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=1
fio --name=$BENCHMARK_ID --clocksource=gettimeofday --group_reporting \
--directory=$WORKING_DIR --ioengine=posixaio --direct=1 \
--filename_format='$jobnum/FIOfile' --size=10GB --runtime=60 \
--time_based=1 --iodepth=1 --rw=randrw --rwmixread=0 --blocksize=4k \
--numjobs=$JOBS
The following is an example of the test output for an AWS WekaIO cluster with 6 instances, type i3.16xlarge.
In this test output example, results show an average latency of 529 microseconds, where 99.5% of the writes terminated in 766 microseconds or less.
Note: Different hardware and networking configurations may yield different latency results, which can be as low as 150 microseconds for 100 Gbit networking and NVMe drives.
To run all the tests sequentially and review the results afterwards, follow the instructions below.
export WEKA_MOUNT=/mnt/weka
export WEKA_CLIENT=`/bin/hostname`
export WORKING_DIR=$WEKA_MOUNT/$WEKA_CLIENT/
export JOBS=$((`lscpu | grep '^CPU(s):' | awk '{print $2}'` - 2))
mkdir -p $WORKING_DIR && cd $WORKING_DIR && seq 0 $JOBS | xargs mkdir -p
Copy the FIOmaster file to your host and run the benchmark using the following command:
DIRECTORY=$WORKING_DIR fio FIOmaster --output=FIOmaster.out
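The contents of the FIOmaster file are not reproduced on this page. As a rough alternative sketch, the six individual tests above share one command line and differ only in the benchmark name, block size, read percentage, and job count, so they can be generated from a small table. The helper name `fio_cmd` is hypothetical, it only prints each command (a dry run), and the default of 4 jobs is a placeholder.

```shell
# Print (without running) the fio command for one test.
# Arguments: benchmark id, block size, read percentage, number of jobs.
# fio_cmd is a hypothetical helper; it only echoes the command line.
fio_cmd() {
    echo "fio --name=$1 --clocksource=gettimeofday --group_reporting" \
         "--directory=${WORKING_DIR:-.}" \
         "--ioengine=posixaio --direct=1" \
         "--filename_format='\$jobnum/FIOfile' --size=10GB --runtime=60" \
         "--time_based=1 --iodepth=1 --rw=randrw --rwmixread=$3" \
         "--blocksize=$2 --numjobs=$4 --output=$1.out"
}

JOBS=${JOBS:-4}   # placeholder job count for the dry run

# Read tests use --rwmixread=100; write tests use --rwmixread=0.
fio_cmd FioReads1MMultiThread   1m 100 "$JOBS"
fio_cmd IOPSRead4KMultiThread   4k 100 "$JOBS"
fio_cmd FioReads4KSingleThread  4k 100 1
fio_cmd FioWrite1MMultiThread   1m 0   "$JOBS"
fio_cmd IOPSWrite4KMultiThread  4k 0   "$JOBS"
fio_cmd FioWrites4KSingleThread 4k 0   1
```

To execute a printed command instead of only displaying it, it can be passed to `eval`, for example `eval "$(fio_cmd FioReads4KSingleThread 4k 100 1)"`, once the test files have been laid out.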