WEKA performance tests

Measure the key performance metrics of a WEKA cluster—latency, IOPS, and bandwidth—using standardized testing procedures.

Overview

When measuring a storage system's performance, there are three primary metrics:

  • Latency: The time from the initiation of an operation to its completion.

  • IOPS: The number of I/O operations (such as read, write, or metadata operations) that the system can process per second.

  • Bandwidth: The amount of data that the system can transfer per second.

Each metric applies to read operations, write operations, or a mixture of both. Different mount modes can produce different performance characteristics. Additionally, the client's network configuration, such as using user-space DPDK networking or kernel UDP, significantly affects performance.

It is important to distinguish between single-client and aggregated performance. Running tests from a single client will likely be limited by the client's own performance capabilities. In general, maximizing the performance of a WEKA cluster requires running tests from several clients simultaneously.
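
One common way to drive several clients in lockstep is fio's client/server mode (wekatester, described below, handles multi-client coordination for you). A minimal sketch, assuming two client hosts named client01 and client02 and a local fio job file named read-throughput.fio (both placeholder names):

    # On each client host, start fio in server mode (listens on port 8765 by default)
    fio --server

    # On a coordinating host, submit the same job file to every client;
    # fio reports per-client results plus an "All clients" aggregate
    fio --client=client01 --client=client02 read-throughput.fio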

To ensure that test results reflect the filesystem's ability to deliver data independent of client-side caching, the benchmarks are designed to negate the effects of caching where possible. This is achieved by opening files with O_DIRECT to bypass the client's page cache during file tests and by flushing Linux caches between tests.
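
If you script tests yourself, you can approximate the same behavior: fio's direct=1 option (used in all the job definitions below) opens files with O_DIRECT, and the kernel caches can be dropped between runs. A sketch (requires root):

    # Flush dirty pages to storage, then drop the page cache,
    # dentries, and inodes before the next test run
    sync
    echo 3 > /proc/sys/vm/drop_caches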

Testing WEKA performance with wekatester

Use the wekatester command-line utility to measure the performance of a WEKA cluster. This tool automates a series of standardized FIO tests to measure key performance indicators (KPIs), such as throughput, IOPS, and latency.

Using wekatester is the recommended approach for performance testing as it provides consistent, reproducible, and easy-to-interpret results.

Before you begin

Ensure that the wekatester tool is available on a client machine with access to the WEKA cluster.

Procedure

  1. Log in to a client with access to the WEKA cluster.

  2. Navigate to the directory containing the wekatester tool.

  3. Run the performance test suite using the following command:

    ./wekatester -c

    The tool automatically discovers the cluster and clients, prepares the hosts for testing, runs the full suite of performance tests, and reports the aggregated results.

Result example

The command displays a summary of the performance results, providing a clear overview of the cluster's capabilities. In this example, the totals are aggregated across 16 hosts (for instance, 434.52 GiB/s total read bandwidth ÷ 27.16 GiB/s per host).

read bandwidth: 434.52 GiB/s
total bandwidth: 434.52 GiB/s
average bandwidth: 27.16 GiB/s per host

write bandwidth: 258.49 GiB/s
total bandwidth: 258.49 GiB/s
average bandwidth: 16.16 GiB/s per host

read latency: 143 us

write latency: 134 us

read iops: 16,526,081/s
total iops: 16,526,081/s
average iops: 1,032,880/s per host

write iops: 4,089,720/s
total iops: 4,089,720/s
average iops: 255,607/s per host

Wekatester FIO job definitions

The wekatester tool uses a standardized set of Flexible I/O (FIO) tester jobs to ensure consistent and comparable results. These job definitions are provided for users who want to review the testing methodology or run the tests manually.

All jobs use a 2G file size for testing consistency.
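
To run a job manually, save its definition to a file and pass the file to fio. A sketch, assuming the read throughput job below is saved as read-throughput.fio and the WEKA filesystem is mounted at /mnt/weka (the directory parameter used by all the jobs):

    # Run the job file locally; fio prints bandwidth, IOPS, and latency statistics
    fio read-throughput.fio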

Read throughput job definition

This job measures the maximum read bandwidth.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=32

[fio-createfiles-00]
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-bandwidthSR-00]
stonewall
description='Sequential Read bandwidth workload'
blocksize=1Mi
rw=read
iodepth=1

Write throughput job definition

This job measures the maximum write bandwidth.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=32

[fio-createfiles-00]
stonewall
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-bandwidthSW-00]
stonewall
description='Sequential Write bandwidth workload'
blocksize=1Mi
rw=write
iodepth=1

Read IOPS job definition

This job measures the maximum read IOPS using a 4k block size. It runs 64 jobs with an I/O depth of 8, keeping up to 512 I/Os in flight per client.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=64

[fio-createfiles-00]
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-iopsR-00]
stonewall
description='Read iops workload'
iodepth=8
bs=4k
rw=randread

Write IOPS job definition

This job measures the maximum write IOPS using a 4k block size.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=64

[fio-createfiles-00]
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-iopsW-00]
stonewall
description='Write iops workload'
iodepth=8
bs=4k
rw=randwrite

Read latency job definition

This job measures read latency using a 4k block size. It uses a single job with an I/O depth of 1, so each I/O must complete before the next is issued; this isolates round-trip latency from queuing effects.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=1

[fio-createfiles-00]
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-latencyR-00]
stonewall
description='Read latency workload'
bs=4k
rw=randread
iodepth=1

Write latency job definition

This job measures write latency using a 4k block size, using the same single-job, queue-depth-1 approach as the read latency job.

[global]
filesize=2G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$filenum/$jobnum
directory=/mnt/weka
group_reporting=1
clocksource=gettimeofday
runtime=30
ioengine=libaio
disk_util=0
direct=1
numjobs=1

[fio-createfiles-00]
blocksize=1Mi
description='pre-create files'
create_only=1

[fio-latencyW-00]
stonewall
description='Write latency workload'
bs=4k
rw=randwrite
iodepth=1

Testing metadata performance with MDTest

MDTest is an open-source tool designed to test metadata performance, measuring the rate of operations such as file creates, stats, and deletes across the cluster.

MDTest uses an MPI framework to coordinate jobs across multiple nodes. The examples shown here assume the use of MDTest version 1.9.3 with MPICH version 3.3.2 (see the MPICH documentation).
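
With MPICH's Hydra launcher, the hosts that participate in the test are listed in a hostfile passed to mpiexec with -f. A minimal sketch matching the command below, assuming eight client hosts named client01 through client08 (placeholder names):

    # hostfile: one host per line; the optional :<n> suffix limits the
    # number of processes that mpiexec places on that host
    client01:136
    client02:136
    # ...and so on, through client08:136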

Procedure

Run the MDTest benchmark from a client machine with access to the WEKA filesystem. The following command runs the test across multiple clients defined in a hostfile. It uses 8 clients running 136 processes each (1,088 MPI ranks in total) to test performance on roughly 20 million files (1,088 ranks × 18,382 files per rank).

Job definition

mpiexec -f <hostfile> -np 1088 mdtest -v -N 136 -i 3 -n 18382 -F -u -d /mnt/weka/mdtest
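
For reference, the options this command passes to mdtest:

    # -v           verbose output
    # -N 136       stride between tasks in the stat and read phases, so each
    #              rank works on files created by a rank on a different host
    # -i 3         number of test iterations
    # -n 18382     number of files each task creates (1,088 tasks ≈ 20 million files)
    # -F           perform tests on files only (skip directory tests)
    # -u           unique working directory for each task
    # -d <path>    base directory for the test on the WEKA filesystem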

Result example

The following table shows an example summary from three test iterations. All rates are in operations per second.

Operation       Max            Min            Mean           Std Dev
File creation   40784.448      40784.447      40784.448      0.001
File stat       2352915.997    2352902.666    2352911.311    6.121
File read       217236.252     217236.114     217236.162     0.064
File removal    44101.905      44101.896      44101.902      0.004
Tree creation   3.788          3.097          3.342          0.316
Tree removal    1.192          1.142          1.172          0.022

Performance test results summary

The following tables show example results from tests run in specific AWS and SuperMicro environments.

Single client results

Benchmark         AWS                               SuperMicro
Read Throughput   8.9 GiB/s                         21.4 GiB/s
Write Throughput  9.4 GiB/s                         17.2 GiB/s
Read IOPS         393,333 ops/s                     563,667 ops/s
Write IOPS        302,333 ops/s                     378,667 ops/s
Read Latency      272 µs avg.; 99.5% under 459 µs   144.76 µs avg.; 99.5% under 260 µs
Write Latency     298 µs avg.; 99.5% under 432 µs   107.12 µs avg.; 99.5% under 142 µs
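
The latency rows report both the average completion latency and the 99.5th percentile. Both figures appear in fio's standard output, and can also be pulled from its JSON output; a sketch using jq, assuming the read latency job is saved as latency-read.fio (a placeholder name):

    # Run the job with machine-readable output
    fio --output-format=json --output=latency-read.json latency-read.fio

    # Mean and 99.5th-percentile completion latency, in nanoseconds
    jq '.jobs[0].read.clat_ns | {mean, p99_5: .percentile["99.500000"]}' latency-read.json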

Aggregated cluster results (with multiple clients)

Benchmark         AWS               SuperMicro
Read Throughput   36.2 GiB/s        123 GiB/s
Write Throughput  11.6 GiB/s        37.6 GiB/s
Read IOPS         1,978,330 ops/s   4,346,330 ops/s
Write IOPS        404,670 ops/s     1,317,000 ops/s
Creates           79,599 ops/s      234,472 ops/s
Stats             1,930,721 ops/s   3,257,394 ops/s
Deletes           117,644 ops/s     361,755 ops/s
