WEKA performance tests

This page describes a series of tests for measuring performance after installing the WEKA system. The same tests can also be used to measure the performance of any other storage solution.


About WEKA performance tests

There are three main performance metrics when measuring a storage system's performance:

  1. Latency, which is the time from operation initiation to completion

  2. The number of different IO operations (read/write/metadata) that the system can process concurrently

  3. The bandwidth of data that the system can process concurrently

Each performance metric applies to read operations, write operations, or a mixture of read and write operations.

When measuring WEKA system performance, different mount modes produce different performance characteristics. Additionally, the client's network configuration (user-space DPDK networking or kernel UDP) significantly affects performance.

All performance tests listed here are generic and not specific to the WEKA system. They can be used to compare the WEKA storage system to other storage systems or a local storage device.

There is a difference between single-client performance and aggregated performance. When running the tests listed below from one client, the client itself limits the test's performance. In general, several clients are required to maximize the performance of a WEKA cluster.

The FIO tool

The FIO tool is a generic open-source storage performance testing tool; test jobs are defined as described in the FIO documentation. In this documentation, the usage of FIO version 3.20 is assumed.
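Before running the tests, it can be useful to confirm that the expected FIO version is installed on each client; FIO prints its version with the --version flag:

# confirm the installed FIO version on a client
fio --version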

All FIO testing is done using the client/server capabilities of FIO. This makes multiple-client testing easier since FIO reports aggregated results for all clients under the test. Single-client tests are run the same way to keep the results consistent.

Start the FIO server on every one of the clients:

fio --server --daemonize=/tmp/fio.pid
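When many clients are involved, the FIO server can be started on all of them with a simple loop. The following is only a convenience sketch, assuming passwordless SSH access and a clients.txt file with one hostname per line:

# start the FIO server on every client listed in clients.txt
while read -r client; do
  ssh "$client" "fio --server --daemonize=/tmp/fio.pid"
done < clients.txt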

Run the test command from one of the clients. Note that each client must have a WEKA filesystem mounted.

An example of launching a test (sometest.txt) on all clients listed in a file (clients.txt) using the client/server model:

fio --client=clients.txt sometest.txt

An example of the clients.txt file when running multiple clients:

clients.txt
weka-client-01
weka-client-02
weka-client-03
weka-client-04
weka-client-05
weka-client-06
weka-client-07
weka-client-08

An example of aggregated test results:

All clients: (groupid=0, jobs=16): err= 0: pid=0: Wed Jun  3 22:10:46 2020
  read: IOPS=30.1k, BW=29.4Gi (31.6G)(8822GiB/300044msec)
    slat (nsec): min=0, max=228000, avg=6308.42, stdev=4988.75
    clat (usec): min=1132, max=406048, avg=16982.89, stdev=27664.80
     lat (usec): min=1147, max=406051, avg=16989.20, stdev=27664.25
   bw (  MiB/s): min= 3576, max=123124, per=93.95%, avg=28284.95, stdev=42.13, samples=287520
   iops        : min= 3576, max=123124, avg=28284.82, stdev=42.13, samples=287520
  lat (msec)   : 2=6.64%, 4=56.55%, 10=8.14%, 20=4.42%, 50=13.81%
  lat (msec)   : 100=7.01%, 250=3.44%, 500=0.01%
  cpu          : usr=0.11%, sys=0.09%, ctx=9039177, majf=0, minf=8088
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=9033447,0,0,0 short=0,0,0,0 dropped=0,0,0,0

The single-client and aggregated tests differ only in the clients participating in the test, as defined in clients.txt.

MDTest

MDTest is a generic open-source metadata performance testing tool. In this documentation, the usage of version 1.9.3 is assumed.

WEKA client performance tests

Overall, the tests on this page are designed to demonstrate the sustainable peak performance of the filesystem. Care has been taken to ensure they are realistic and reproducible.

Where possible, the benchmarks try to negate the effects of caching. For file testing, O_DIRECT calls are used to bypass the client's cache. For metadata testing, each phase of the test uses different clients. In addition, the Linux caches are flushed between tests to ensure that the data being accessed is not already present in the cache. While applications often take advantage of cached data and metadata, this testing focuses on the filesystem's ability to deliver data independent of caching on the client.
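For reference, one common way to flush the Linux page cache, dentries, and inodes on a client between test runs (run as root) is:

# flush dirty pages to storage, then drop the kernel caches
sync
echo 3 > /proc/sys/vm/drop_caches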

While the output of a single iteration is shown below for each test, every test was run several times, and the averaged results are provided in the following results summary.

Results summary

Single client results

Benchmark           AWS                              SuperMicro
---------           ---                              ----------
Read Throughput     8.9 GiB/s                        21.4 GiB/s
Write Throughput    9.4 GiB/s                        17.2 GiB/s
Read IOPS           393,333 ops/s                    563,667 ops/s
Write IOPS          302,333 ops/s                    378,667 ops/s
Read Latency        272 µs avg.                      144.76 µs avg.
                    99.5% completed under 459 µs     99.5% completed under 260 µs
Write Latency       298 µs avg.                      107.12 µs avg.
                    99.5% completed under 432 µs     99.5% completed under 142 µs

Aggregated cluster results (with multiple clients)

Benchmark           AWS                 SuperMicro
---------           ---                 ----------
Read Throughput     36.2 GiB/s          123 GiB/s
Write Throughput    11.6 GiB/s          37.6 GiB/s
Read IOPS           1,978,330 ops/s     4,346,330 ops/s
Write IOPS          404,670 ops/s       1,317,000 ops/s
Creates             79,599 ops/s        234,472 ops/s
Stats               1,930,721 ops/s     3,257,394 ops/s
Deletes             117,644 ops/s       361,755 ops/s

If the client uses a 100 Gbps NIC or above, mounting the WEKA filesystem with more than one core is required to maximize client throughput.
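For example (an illustrative sketch only; the backend name, filesystem name, and core count are placeholders), a mount that dedicates multiple cores to the WEKA client could look like this:

# mount filesystem 'default' from backend 'backend-0' with 4 dedicated client cores
mount -t wekafs -o num_cores=4 backend-0/default /mnt/weka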

Test read throughput

This test measures the client throughput for large (1MB) reads. The job below tries to maximize the read throughput from a single client. The test utilizes multiple threads, each one performing 1 MB reads.

Job definition

read_throughput.txt
[global]
filesize=128G
time_based=1
numjobs=32
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/bw.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[read_throughput]
bs=1m
rw=read
direct=1
new_group
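Using the client/server model described earlier, this job file can be launched, for example, as follows (the same pattern applies to the other job files on this page):

# single client
fio read_throughput.txt

# multiple clients listed in clients.txt
fio --client=clients.txt read_throughput.txt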

Test output example

read_throughput: (groupid=0, jobs=32): err= 0: pid=70956: Wed Jul  8 13:27:48 2020
  read: IOPS=9167, BW=9167MiB/s (9613MB/s)(2686GiB/300004msec)
    slat (nsec): min=0, max=409000, avg=3882.55, stdev=3631.79
    clat (usec): min=999, max=14947, avg=3482.93, stdev=991.25
     lat (usec): min=1002, max=14949, avg=3486.81, stdev=991.16
    clat percentiles (usec):
     |  1.00th=[ 1795],  5.00th=[ 2147], 10.00th=[ 2376], 20.00th=[ 2671],
     | 30.00th=[ 2900], 40.00th=[ 3130], 50.00th=[ 3359], 60.00th=[ 3589],
     | 70.00th=[ 3851], 80.00th=[ 4178], 90.00th=[ 4752], 95.00th=[ 5342],
     | 99.00th=[ 6521], 99.50th=[ 7046], 99.90th=[ 8160], 99.95th=[ 8717],
     | 99.99th=[ 9896]
   bw (  MiB/s): min= 7942, max=10412, per=100.00%, avg=9179.14, stdev=12.41, samples=19168
   iops        : min= 7942, max=10412, avg=9179.14, stdev=12.41, samples=19168
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=2.76%, 4=72.16%, 10=25.07%, 20=0.01%
  cpu          : usr=0.55%, sys=0.34%, ctx=2751410, majf=0, minf=490
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2750270,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show a bandwidth of 8.95 GiB/s from a single client.

Test write throughput

This test measures the client throughput for large (1MB) writes. The job below tries to maximize the write throughput from a single client. The test utilizes multiple threads, each one performing 1MB writes.

Job definition

write_throughput.txt
[global]
filesize=128G
time_based=1
numjobs=32
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/bw.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[write_throughput]
bs=1m
rw=write
direct=1
new_group

Test output example

write_throughput: (groupid=0, jobs=32): err= 0: pid=71903: Wed Jul  8 13:43:15 2020
  write: IOPS=7034, BW=7035MiB/s (7377MB/s)(2061GiB/300005msec); 0 zone resets
    slat (usec): min=12, max=261, avg=39.22, stdev=12.92
    clat (usec): min=2248, max=20882, avg=4505.62, stdev=1181.45
     lat (usec): min=2318, max=20951, avg=4544.84, stdev=1184.64
    clat percentiles (usec):
     |  1.00th=[ 2769],  5.00th=[ 2999], 10.00th=[ 3163], 20.00th=[ 3458],
     | 30.00th=[ 3752], 40.00th=[ 4047], 50.00th=[ 4359], 60.00th=[ 4686],
     | 70.00th=[ 5014], 80.00th=[ 5407], 90.00th=[ 5997], 95.00th=[ 6587],
     | 99.00th=[ 8160], 99.50th=[ 8979], 99.90th=[10945], 99.95th=[12125],
     | 99.99th=[14746]
   bw (  MiB/s): min= 5908, max= 7858, per=100.00%, avg=7043.58, stdev= 9.37, samples=19168
   iops        : min= 5908, max= 7858, avg=7043.58, stdev= 9.37, samples=19168
  lat (msec)   : 4=38.87%, 10=60.90%, 20=0.22%, 50=0.01%
  cpu          : usr=1.34%, sys=0.15%, ctx=2114914, majf=0, minf=473
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2110493,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show a bandwidth of 6.87 GiB/s.

Test read IOPS

This test measures the ability of the client to deliver concurrent 4KB reads. The job below tries to maximize the system read IOPS from a single client. The test utilizes multiple threads, each one performing 4KB reads.

Job definition

read_iops.txt
[global]
filesize=4G
time_based=1
numjobs=192
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/iops.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[read_iops]
bs=4k
rw=randread
direct=1
new_group

Test output example

read_iops: (groupid=0, jobs=192): err= 0: pid=66528: Wed Jul  8 12:30:38 2020
  read: IOPS=390k, BW=1525MiB/s (1599MB/s)(447GiB/300002msec)
    slat (nsec): min=0, max=392000, avg=3512.56, stdev=2950.62
    clat (usec): min=213, max=15496, avg=486.61, stdev=80.30
     lat (usec): min=215, max=15505, avg=490.12, stdev=80.47
    clat percentiles (usec):
     |  1.00th=[  338],  5.00th=[  375], 10.00th=[  400], 20.00th=[  424],
     | 30.00th=[  445], 40.00th=[  465], 50.00th=[  482], 60.00th=[  498],
     | 70.00th=[  519], 80.00th=[  545], 90.00th=[  586], 95.00th=[  619],
     | 99.00th=[  685], 99.50th=[  717], 99.90th=[  783], 99.95th=[  816],
     | 99.99th=[ 1106]
   bw (  MiB/s): min= 1458, max= 1641, per=100.00%, avg=1525.52, stdev= 0.16, samples=114816
   iops        : min=373471, max=420192, avg=390494.54, stdev=40.47, samples=114816
  lat (usec)   : 250=0.01%, 500=60.20%, 750=39.60%, 1000=0.19%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=1.24%, sys=1.52%, ctx=117366459, majf=0, minf=3051
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=117088775,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show 390,494 IOPS from a single client.

Test write IOPS

This test measures the ability of the client to deliver concurrent 4KB writes. The job below tries to maximize the system write IOPS from a single client. The test utilizes multiple threads, each one performing 4KB writes.

Job definition

write_iops.txt
[global]
filesize=4G
time_based=1
numjobs=192
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/iops.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[write_iops]
bs=4k
rw=randwrite
direct=1
new_group

Test output example

write_iops: (groupid=0, jobs=192): err= 0: pid=72163: Wed Jul  8 13:48:24 2020
  write: IOPS=288k, BW=1125MiB/s (1180MB/s)(330GiB/300003msec); 0 zone resets
    slat (nsec): min=0, max=2591.0k, avg=5030.10, stdev=4141.48
    clat (usec): min=219, max=17801, avg=659.20, stdev=213.57
     lat (usec): min=220, max=17803, avg=664.23, stdev=213.72
    clat percentiles (usec):
     |  1.00th=[  396],  5.00th=[  441], 10.00th=[  474], 20.00th=[  515],
     | 30.00th=[  553], 40.00th=[  594], 50.00th=[  627], 60.00th=[  668],
     | 70.00th=[  701], 80.00th=[  750], 90.00th=[  840], 95.00th=[  971],
     | 99.00th=[ 1450], 99.50th=[ 1614], 99.90th=[ 2409], 99.95th=[ 3490],
     | 99.99th=[ 4359]
   bw (  MiB/s): min= 1056, max= 1224, per=100.00%, avg=1125.96, stdev= 0.16, samples=114816
   iops        : min=270390, max=313477, avg=288215.11, stdev=40.70, samples=114816
  lat (usec)   : 250=0.01%, 500=15.96%, 750=63.43%, 1000=16.05%
  lat (msec)   : 2=4.41%, 4=0.14%, 10=0.02%, 20=0.01%
  cpu          : usr=1.21%, sys=1.49%, ctx=86954124, majf=0, minf=3055
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,86398871,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show 288,215 IOPS from a single client.

Test read latency

This test measures the minimal achievable read latency under a light load. The test measures the latency over a single-threaded sequence of 4KB reads across multiple files. Each read is executed only after the previous read has been served.

Job definition

read_latency.txt
[global]
filesize=4G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/iops.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[read_latency]
numjobs=1
bs=4k
rw=randread
direct=1
new_group

Test output example

read_latency: (groupid=0, jobs=1): err= 0: pid=71741: Wed Jul  8 13:38:06 2020
  read: IOPS=4318, BW=16.9MiB/s (17.7MB/s)(5061MiB/300001msec)
    slat (nsec): min=0, max=53000, avg=1923.23, stdev=539.64
    clat (usec): min=160, max=1743, avg=229.09, stdev=44.80
     lat (usec): min=162, max=1746, avg=231.01, stdev=44.80
    clat percentiles (usec):
     |  1.00th=[  174],  5.00th=[  180], 10.00th=[  182], 20.00th=[  188],
     | 30.00th=[  190], 40.00th=[  196], 50.00th=[  233], 60.00th=[  245],
     | 70.00th=[  255], 80.00th=[  269], 90.00th=[  289], 95.00th=[  318],
     | 99.00th=[  330], 99.50th=[  334], 99.90th=[  355], 99.95th=[  437],
     | 99.99th=[  529]
   bw (  KiB/s): min=16280, max=17672, per=100.00%, avg=17299.11, stdev=195.37, samples=599
   iops        : min= 4070, max= 4418, avg=4324.78, stdev=48.84, samples=599
  lat (usec)   : 250=66.18%, 500=33.80%, 750=0.02%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=0.95%, sys=1.44%, ctx=1295670, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1295643,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show an average latency of 229 microseconds, where 99.5% of the reads terminated in 334 microseconds or less.

Test write latency

This test measures the minimal achievable write latency under a light load. The test measures the latency over a single-threaded sequence of 4KB writes across multiple files. Each write is executed only after the previous write has been served.

Job definition

write_latency.txt
[global]
filesize=4G
time_based=1
startdelay=5
exitall_on_error=1
create_serialize=0
filename_format=$jobnum/$filenum/iops.$jobnum.$filenum
directory=/mnt/weka/fio
group_reporting=1
clocksource=gettimeofday
runtime=300
ioengine=posixaio
disk_util=0
iodepth=1

[write_latency]
numjobs=1
bs=4k
rw=randwrite
direct=1
new_group

Test output example

write_latency: (groupid=0, jobs=1): err= 0: pid=72709: Wed Jul  8 13:53:33 2020
  write: IOPS=4383, BW=17.1MiB/s (17.0MB/s)(5136MiB/300001msec); 0 zone resets
    slat (nsec): min=0, max=56000, avg=1382.96, stdev=653.78
    clat (usec): min=195, max=9765, avg=226.21, stdev=109.45
     lat (usec): min=197, max=9766, avg=227.59, stdev=109.46
    clat percentiles (usec):
     |  1.00th=[  208],  5.00th=[  212], 10.00th=[  215], 20.00th=[  217],
     | 30.00th=[  219], 40.00th=[  219], 50.00th=[  221], 60.00th=[  223],
     | 70.00th=[  225], 80.00th=[  229], 90.00th=[  233], 95.00th=[  243],
     | 99.00th=[  269], 99.50th=[  293], 99.90th=[  725], 99.95th=[ 2540],
     | 99.99th=[ 6063]
   bw (  KiB/s): min=16680, max=18000, per=100.00%, avg=17555.48, stdev=279.31, samples=599
   iops        : min= 4170, max= 4500, avg=4388.87, stdev=69.83, samples=599
  lat (usec)   : 250=96.27%, 500=3.61%, 750=0.03%, 1000=0.01%
  lat (msec)   : 2=0.03%, 4=0.03%, 10=0.03%
  cpu          : usr=0.93%, sys=1.52%, ctx=1315723, majf=0, minf=14
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1314929,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

In this test output example, results show an average latency of 226 microseconds, where 99.5% of the writes terminated in 293 microseconds or less.

Test metadata performance

The test measures the rate of metadata operations (such as create, stat, and delete) across the cluster. The test uses about 20 million files: 8 clients run 136 threads each (1,088 processes in total), and each thread handles 18,382 files. The test is invoked three times and provides a summary of the iterations.

Job definition

mpiexec -f <hostfile> -np 1088 mdtest -v -N 136 -i 3 -n 18382 -F -u -d /mnt/weka/mdtest
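The <hostfile> contents are not shown here. As an assumed example only (hostnames are placeholders reused from the clients.txt example above), an MPICH-style hostfile that places 136 processes on each of the 8 clients could look like this:

weka-client-01:136
weka-client-02:136
weka-client-03:136
weka-client-04:136
weka-client-05:136
weka-client-06:136
weka-client-07:136
weka-client-08:136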

Test output example

SUMMARY rate: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      40784.448      40784.447      40784.448          0.001
   File stat         :    2352915.997    2352902.666    2352911.311          6.121
   File read         :     217236.252     217236.114     217236.162          0.064
   File removal      :      44101.905      44101.896      44101.902          0.004
   Tree creation     :          3.788          3.097          3.342          0.316
   Tree removal      :          1.192          1.142          1.172          0.022

Run all benchmark tests

To run all the tests sequentially and review the results afterward, follow the instructions below.

Preparation

On each client, mount a WEKA filesystem at the /mnt/weka mount point, and create the following directories in the filesystem:

# create directories in the weka filesystem
mkdir /mnt/weka/fio
mkdir /mnt/weka/mdtest

Copy the FIOmaster.txt file to your server and create the clients.txt file with your clients' hostnames.

Run the benchmark

Run the benchmarks using the following commands:

# single client
fio FIOmaster.txt

# multiple clients
fio --client=clients.txt FIOmaster.txt

# mdtest
mpiexec -f clients.txt -np 1088 mdtest -v -N 136 -i 3 -n 18382 -F -u -d /mnt/weka/mdtest

MDTest uses an MPI framework to coordinate the job across multiple nodes. The results presented here were generated using MPICH version 3.3.2. While results can vary with different MPI implementations, most are based on the same ROMIO layer and perform similarly.
