W E K A
4.3
4.3
  • WEKA v4.3 documentation
    • Documentation revision history
  • WEKA System Overview
    • WEKA Data Platform introduction
      • WEKA system functionality features
      • Converged WEKA system deployment
      • Optimize redundancy in WEKA deployments
    • SSD capacity management
    • Filesystems, object stores, and filesystem groups
    • WEKA networking
    • Data lifecycle management
    • WEKA client and mount modes
    • WEKA containers architecture overview
    • Glossary
  • Planning and Installation
    • Prerequisites and compatibility
    • WEKA cluster installation on bare metal servers
      • Plan the WEKA system hardware requirements
      • Obtain the WEKA installation packages
      • Install the WEKA cluster using the WMS with WSA
      • Install the WEKA cluster using the WSA
      • Manually install OS and WEKA on servers
      • Manually prepare the system for WEKA configuration
        • Broadcom adapter setup for WEKA system
        • Enable the SR-IOV
      • Configure the WEKA cluster using the WEKA Configurator
      • Manually configure the WEKA cluster using the resource generator
      • Perform post-configuration procedures
      • Add clients to an on-premises WEKA cluster
    • WEKA Cloud Deployment Manager Web (CDM Web) User Guide
    • WEKA Cloud Deployment Manager Local (CDM Local) User Guide
    • WEKA installation on AWS
      • WEKA installation on AWS using Terraform
        • Terraform-AWS-WEKA module description
        • Deployment on AWS using Terraform
        • Required services and supported regions
        • Supported EC2 instance types using Terraform
        • WEKA cluster auto-scaling in AWS
        • Detailed deployment tutorial: WEKA on AWS using Terraform
      • WEKA installation on AWS using the Cloud Formation
        • Self-service portal
        • CloudFormation template generator
        • Deployment types
        • AWS Outposts deployment
        • Supported EC2 instance types using Cloud Formation
        • Add clients to a WEKA cluster on AWS
        • Auto scaling group
        • Troubleshooting
      • Install SMB on AWS
    • WEKA installation on Azure
    • WEKA installation on GCP
      • WEKA project description
      • GCP-WEKA deployment Terraform package description
      • Deployment on GCP using Terraform
      • Required services and supported regions
      • Supported machine types and storage
      • Auto-scale instances in GCP
      • Add clients to a WEKA cluster on GCP
      • Troubleshooting
      • Detailed deployment tutorial: WEKA on GCP using Terraform
      • Google Kubernetes Engine and WEKA over POSIX deployment
  • Getting Started with WEKA
    • Manage the system using the WEKA GUI
    • Manage the system using the WEKA CLI
      • WEKA CLI hierarchy
      • CLI reference guide
    • Run first IOs with WEKA filesystem
    • Getting started with WEKA REST API
    • WEKA REST API and equivalent CLI commands
  • Performance
    • WEKA performance tests
      • Test environment details
  • WEKA Filesystems & Object Stores
    • Manage object stores
      • Manage object stores using the GUI
      • Manage object stores using the CLI
    • Manage filesystem groups
      • Manage filesystem groups using the GUI
      • Manage filesystem groups using the CLI
    • Manage filesystems
      • Manage filesystems using the GUI
      • Manage filesystems using the CLI
    • Attach or detach object store buckets
      • Attach or detach object store bucket using the GUI
      • Attach or detach object store buckets using the CLI
    • Advanced data lifecycle management
      • Advanced time-based policies for data storage location
      • Data management in tiered filesystems
      • Transition between tiered and SSD-only filesystems
      • Manual fetch and release of data
    • Mount filesystems
      • Mount filesystems from Single Client to Multiple Clusters (SCMC)
    • Snapshots
      • Manage snapshots using the GUI
      • Manage snapshots using the CLI
    • Snap-To-Object
      • Manage Snap-To-Object using the GUI
      • Manage Snap-To-Object using the CLI
    • Quota management
      • Manage quotas using the GUI
      • Manage quotas using the CLI
  • Additional Protocols
    • Additional protocol containers
    • Manage the NFS protocol
      • Supported NFS client mount parameters
      • Manage NFS networking using the GUI
      • Manage NFS networking using the CLI
    • Manage the S3 protocol
      • S3 cluster management
        • Manage the S3 service using the GUI
        • Manage the S3 service using the CLI
      • S3 buckets management
        • Manage S3 buckets using the GUI
        • Manage S3 buckets using the CLI
      • S3 users and authentication
        • Manage S3 users and authentication using the CLI
        • Manage S3 service accounts using the CLI
      • S3 rules information lifecycle management (ILM)
        • Manage S3 lifecycle rules using the GUI
        • Manage S3 lifecycle rules using the CLI
      • Audit S3 APIs
        • Configure audit webhook using the GUI
        • Configure audit webhook using the CLI
        • Example: How to use Splunk to audit S3
      • S3 supported APIs and limitations
      • S3 examples using boto3
      • Access S3 using AWS CLI
    • Manage the SMB protocol
      • Manage SMB using the GUI
      • Manage SMB using the CLI
  • Operation Guide
    • Alerts
      • Manage alerts using the GUI
      • Manage alerts using the CLI
      • List of alerts and corrective actions
    • Events
      • Manage events using the GUI
      • Manage events using the CLI
      • List of events
    • Statistics
      • Manage statistics using the GUI
      • Manage statistics using the CLI
      • List of statistics
    • Insights
    • System congestion
    • Security management
      • Obtain authentication tokens
      • KMS management
        • Manage KMS using the GUI
        • Manage KMS using the CLI
      • TLS certificate management
        • Manage the TLS certificate using the GUI
        • Manage the TLS certificate using the CLI
      • CA certificate management
        • Manage the CA certificate using the GUI
        • Manage the CA certificate using the CLI
      • Account lockout threshold policy management
        • Manage the account lockout threshold policy using GUI
        • Manage the account lockout threshold policy using CLI
      • Manage the login banner
        • Manage the login banner using the GUI
        • Manage the login banner using the CLI
      • Manage Cross-Origin Resource Sharing
    • User management
      • Manage users using the GUI
      • Manage users using the CLI
    • Organizations management
      • Manage organizations using the GUI
      • Manage organizations using the CLI
      • Mount authentication for organization filesystems
    • Expand and shrink cluster resources
      • Add a backend server
      • Expand specific resources of a container
      • Shrink a cluster
    • Background tasks
      • Set up a Data Services container for background tasks
      • Manage background tasks using the GUI
      • Manage background tasks using the CLI
    • Upgrade WEKA versions
  • Licensing
    • License overview
    • Classic license
  • Monitor the WEKA Cluster
    • Deploy monitoring tools using the WEKA Management Station (WMS)
    • WEKA Home - The WEKA support cloud
      • Local WEKA Home overview
      • Deploy Local WEKA Home v3.0 or higher
      • Deploy Local WEKA Home v2.x
      • Explore cluster insights and statistics
      • Manage alerts and integrations
      • Enforce security and compliance
      • Optimize support and data management
    • Set up the WEKAmon external monitoring
    • Set up the SnapTool external snapshots manager
  • Support
    • Get support for your WEKA system
    • Diagnostics management
      • Traces management
        • Manage traces using the GUI
        • Manage traces using the CLI
      • Protocols debug level management
        • Manage protocols debug level using the GUI
        • Manage protocols debug level using the CLI
      • Diagnostics data management
  • Best Practice Guides
    • WEKA and Slurm integration
      • Avoid conflicting CPU allocations
    • Storage expansion best practice
  • WEKApod
    • WEKApod Data Platform Appliance overview
    • WEKApod servers overview
    • Rack installation
    • WEKApod initial system setup and configuration
    • WEKApod support process
  • Appendices
    • WEKA CSI Plugin
      • Deployment
      • Storage class configurations
      • Tailor your storage class configuration with mount options
      • Dynamic and static provisioning
      • Launch an application using WEKA as the POD's storage
      • Add SELinux support
      • NFS transport failback
      • Upgrade legacy persistent volumes for capacity enforcement
      • Troubleshooting
    • Convert cluster to multi-container backend
    • Create a client image
    • Update WMS and WSA
    • BIOS tool
Powered by GitBook
On this page
  • Overview
  • Performance-optimized networking (DPDK)
  • CPU-optimized networking
  • Typical WEKA configuration
  • Backend servers
  • Clients
  • Configuration guidelines
  • High Availability (HA)
  • RDMA and GPUDirect Storage
  • Requirements and considerations for enabling RDMA and GPUDirect Storage
  1. WEKA System Overview

WEKA networking

Explore network technologies in WEKA, including DPDK, SR-IOV, CPU-optimized networking, UDP mode, high availability, and RDMA/GPUDirect Storage, with configuration guidelines.

PreviousFilesystems, object stores, and filesystem groupsNextData lifecycle management

Last updated 7 months ago

Overview

The WEKA system supports the following types of networking technologies:

  • ‌InfiniBand (IB)

  • Ethernet

‌The networking infrastructure dictates the choice between the two. If a WEKA cluster is connected to both infrastructures, it is possible to connect WEKA clients from both networks to the same cluster.

The WEKA system networking can be configured as performance-optimized or CPU-optimized. In networking, the CPU cores are dedicated to WEKA, and the networking uses DPDK. In networking, the CPU cores are not dedicated to WEKA, and the networking uses DPDK (when supported by the NIC drivers) or in-kernel (UDP mode).

Performance-optimized networking (DPDK)

For performance-optimized networking, the WEKA system does not use standard kernel-based TCP/IP services but a proprietary infrastructure based on the following:

  • Use to map the network device in the user space and use it without any context switches and with zero-copy access. This bypassing of the kernel stack eliminates the consumption of kernel resources for networking operations. It applies to backends and clients and lets the WEKA system saturate network links (including, for example, 200 Gbps or 400 Gbps).

  • Implementing a proprietary WEKA protocol over UDP, i.e., the underlying network, may involve routing between subnets or any other networking infrastructure that supports UDP.

The use of DPDK delivers operations with extremely low latency and high throughput. Low latency is achieved by bypassing the kernel and sending and receiving packages directly from the NIC. High throughput is achieved because multiple cores in the same server can work in parallel without a common bottleneck.

Before proceeding, it is important to understand several key terms used in this section, namely DPDK and SR-IOV.

DPDK

‌ is a set of libraries and network drivers for highly efficient, low-latency packet processing. This is achieved through several techniques, such as kernel TCP/IP bypass, NUMA locality, multi-core processing, and device access via polling to eliminate the performance overhead of interrupt processing. In addition, DPDK ensures transmission reliability, handles retransmission, and controls congestion.

DPDK implementations are available from several sources. OS vendors like and provide DPDK implementations through distribution channels. (Mellanox OFED), a suite of libraries, tools, and drivers supporting Mellanox NICs, offers its own DPDK implementation.

The WEKA system relies on the DPDK implementation provided by Mellanox OFED on servers equipped with Mellanox NICs. For servers equipped with Intel NICs, DPDK support is through the Intel driver for the card.‌

SR-IOV

Single Root I/O Virtualization (SR-IOV) extends the PCI Express (PCIe) specification that enables PCIe virtualization. It allows a PCIe device, such as a network adapter, to appear as multiple PCIe devices or functions.

There are two function categories:

  • Physical Function (PF): PF is a full-fledged PCIe function that can also be configured.

  • Virtual Function (VF): VF is a virtualized instance of the same PCIe device created by sending appropriate commands to the device PF.

Typically, there are many VFs, but only one PF per physical PCIe device. Once a new VF is created, it can be mapped by an object such as a virtual machine, container, or, in the WEKA system, by a 'compute' process.

To take advantage of SR-IOV technology, the software and hardware must be supported. The Linux kernel provides SR-IOV software support. The computer BIOS and the network adapter provide hardware support (by default, SR-IOV is disabled and must be enabled before installing WEKA).

CPU-optimized networking

For CPU-optimized networking, WEKA can yield CPU resources to other applications. That is useful when the extra CPU cores are needed for other purposes. However, the lack of CPU resources dedicated to the WEKA system comes with the expense of reduced overall performance.

DPDK without the core dedication

UDP mode

WEKA can also use in-kernel processing and UDP as the transport protocol. This operation mode is commonly referred to as UDP mode.

UDP mode is compatible with older platforms that lack support for kernel offloading technologies (DPDK) or virtualization (SR-IOV) due to its use of in-kernel processing. This includes legacy hardware, such as the Mellanox CX3 family of NICs.

Typical WEKA configuration

Backend servers

In a typical WEKA system configuration, the WEKA backend servers access the network function in two different methods:

  • Standard TCP/UDP network for management and control operations.

  • High-performance network for data-path traffic.

The WEKA system maintains a separate ARP database for its IP addresses and virtual functions and does not use the kernel or operating system ARP services.

Clients

While WEKA backend servers must include DPDK and SR-IOV, WEKA clients in application servers have the flexibility to use either DPDK or UDP modes. DPDK mode is the preferred choice for newer, high-performing platforms that support it. UDP mode is available for clients without SR-IOV or DPDK support or when there is no need for low-latency and high-throughput I/O.

Configuration guidelines

  • DPDK backends and clients using NICs supporting shared networking:

    • Require one IP address per client for both management and data plane.

    • SR-IOV enabled is not required.

  • DPDK backends and clients using NICs supporting dedicated networking:

    • IP address for management: One per NIC (configured before WEKA installation).

      • Ensure the device supports a maximum number of VFs greater than the number of physical cores on the server.

      • Set the number of VFs to match the cores you intend to dedicate to WEKA.

      • Note that some BIOS configurations may be necessary.

    • SR-IOV: Enabled in BIOS.

  • UDP clients:

    • Use a shared networking IP address for all purposes.

High Availability (HA)

To support HA, the WEKA system must be configured with no single component representing a single point of failure. Multiple switches are required, and servers must have one leg on each.

HA for servers is achieved either through implementing two network interfaces on the same server or by LACP (ethernet only, mode 4). A non-LACP approach sets a redundancy that enables the WEKA software to use two interfaces for HA and bandwidth.

HA performs failover and failback for reliability and load balancing on both interfaces and is operational for Ethernet and InfiniBand. Not using LACP requires doubling the number of IPs on both the backend containers and the IO processes.

When working with HA networking, labeling the system to send data between servers through the same switch is helpful rather than using the ISL or other paths in the fabric. This can reduce the overall traffic in the network. To label the system for identifying the switch and network port, use the label parameter in the weka cluster container net add command.

LACP (link aggregation, also known as bond interfaces) is currently supported between ports on a single Mellanox NIC and is not supported when using VFs (virtual functions).

RDMA and GPUDirect Storage

GPUDirect Storage establishes a direct data path between storage and GPU memory, bypassing unnecessary data copies through the CPU's memory. This approach allows a Direct Memory Access (DMA) engine near the NIC or storage to transfer data directly to or from GPU memory without involving the CPU or GPU.

When RDMA and GPUDirect Storage are enabled, the WEKA system automatically uses the RDMA data path and GPUDirect Storage in supported environments. The system dynamically detects when RDMA is available, both in UDP and DPDK modes, and applies it to workloads that can benefit from RDMA (typically for I/O sizes of 32KB or larger for reads and 256KB or larger for writes).

By leveraging RDMA and GPUDirect Storage, you can achieve enhanced performance. A UDP client, which doesn't require dedicating a core to the WEKA system, can deliver significantly higher performance. Additionally, a DPDK client can experience an extra performance boost, or you can assign fewer cores to the WEKA system while maintaining the same level of performance in DPDK mode.

Requirements and considerations for enabling RDMA and GPUDirect Storage

To enable RDMA and GPUDirect Storage technology, ensure the following requirements are met:

  • Cluster requirements

    • RDMA networking: All servers in the cluster must support RDMA networking.

  • Client requirements

    • GPUDirect Storage: The InfiniBand (IB) interfaces added to the NVIDIA GPUDirect configuration must support RDMA.

    • RDMA: All InfiniBand Host Channel Adapters (HCAs) used by WEKA must support RDMA networking.

  • Encrypted filesystems

    • RDMA and GPUDirect Storage are not utilized for encrypted filesystems. In these cases, the system reverts to standard I/O operations without RDMA or GPUDirect Storage.

  • HCA requirements for RDMA networking An HCA is considered to support RDMA networking if the following conditions are met:

    • For GPUDirect Storage: The network must be InfiniBand. While using an Ethernet network may be possible, this configuration is not supported.

Installation notes

  • GPUDirect Storage: Install the OFED with the --upstream-libs and --dpdk options.

  • Kernel bypass: GPUDirect Storage bypasses the kernel and does not use the page cache. However, standard RDMA clients still use the page cache.

Unsupported configuration

  • Mixed networking clusters: RDMA and GPUDirect Storage are not supported in clusters using a mix of InfiniBand and Ethernet networking.

Verification

  • To verify if RDMA is used, run the weka cluster processes command.

Example:

# weka cluster processes
PROCESS ID  HOSTNAME  CONTAINER   IPS         STATUS  ROLES       NETWORK      CPU  MEMORY   UPTIME
0           weka146   default     10.0.1.146  UP      MANAGEMENT  UDP                        16d 20:07:42h
1           weka146   default     10.0.1.146  UP      FRONTEND    DPDK / RDMA  1    1.47 GB  16d 23:29:00h
2           weka146   default     10.0.3.146  UP      COMPUTE     DPDK / RDMA  12   6.45 GB  16d 23:29:00h
3           weka146   default     10.0.1.146  UP      COMPUTE     DPDK / RDMA  2    6.45 GB  16d 23:29:00h
4           weka146   default     10.0.3.146  UP      COMPUTE     DPDK / RDMA  13   6.45 GB  16d 23:29:00h
5           weka146   default     10.0.1.146  UP      COMPUTE     DPDK / RDMA  3    6.45 GB  16d 22:28:58h
6           weka146   default     10.0.3.146  UP      COMPUTE     DPDK / RDMA  14   6.45 GB  16d 23:29:00h
7           weka146   default     10.0.3.146  UP      DRIVES      DPDK / RDMA  18   1.49 GB  16d 23:29:00h
8           weka146   default     10.0.1.146  UP      DRIVES      DPDK / RDMA  8    1.49 GB  16d 23:29:00h
9           weka146   default     10.0.3.146  UP      DRIVES      DPDK / RDMA  19   1.49 GB  16d 23:29:00h
10          weka146   default     10.0.1.146  UP      DRIVES      DPDK / RDMA  9    1.49 GB  16d 23:29:00h
11          weka146   default     10.0.3.146  UP      DRIVES      DPDK / RDMA  20   1.49 GB  16d 23:29:07h
12          weka147   default     10.0.1.147  UP      MANAGEMENT  UDP                        16d 22:29:02h
13          weka147   default     10.0.1.147  UP      FRONTEND    DPDK / RDMA  1    1.47 GB  16d 23:29:00h
14          weka147   default     10.0.3.147  UP      COMPUTE     DPDK / RDMA  12   6.45 GB  16d 23:29:00h
15          weka147   default     10.0.1.147  UP      COMPUTE     DPDK / RDMA  2    6.45 GB  16d 23:29:00h
16          weka147   default     10.0.3.147  UP      COMPUTE     DPDK / RDMA  13   6.45 GB  16d 23:29:00h
17          weka147   default     10.0.1.147  UP      COMPUTE     DPDK / RDMA  3    6.45 GB  16d 23:29:00h
18          weka147   default     10.0.3.147  UP      COMPUTE     DPDK / RDMA  14   6.45 GB  16d 23:29:00h
19          weka147   default     10.0.3.147  UP      DRIVES      DPDK / RDMA  18   1.49 GB  16d 23:29:00h
20          weka147   default     10.0.1.147  UP      DRIVES      DPDK / RDMA  8    1.49 GB  16d 23:29:00h
21          weka147   default     10.0.3.147  UP      DRIVES      DPDK / RDMA  19   1.49 GB  16d 23:29:07h
22          weka147   default     10.0.1.147  UP      DRIVES      DPDK / RDMA  9    1.49 GB  16d 23:29:00h
23          weka147   default     10.0.3.147  UP      DRIVES      DPDK / RDMA  20   1.49 GB  16d 23:29:07h
. . .

For CPU-optimized networking, when , it is possible to use DPDK networking without dedicating cores. This mode is recommended when available and supported by the NIC drivers. The DPDK networking uses RX interrupts instead of dedicating the cores in this mode.

This mode is supported in most NIC drivers. Consult for compatibility.

AWS (ENA drivers) does not support this mode. Hence, in CPU-optimized networking in AWS, use the .

To run both functions on the same physical interface, contact the .

The high-performance network used to connect all the backend servers must be DPDK-based. This internal WEKA network also requires a separate IP address space. For details, see and .

IP address for data plane: One per in each server (applied during cluster initialization).

(VFs):

NIC compatibility: The Network Interface Card (NIC) must support RDMA. Ensure the appropriate OFED version is installed. For more information, see .

GPUDirect Storage is automatically enabled and detected by the system. To enable or disable RDMA networking for the cluster or a specific client, contact the .

https://doc.dpdk.org/guides/nics/overview.html
Virtual Functions
Data Plane Development Kit (DPDK)
Red Hat
Ubuntu
Mellanox OpenFabrics Enterprise Distribution for Linux
performance-optimized
CPU-optimized
DPDK
UDP mode
InfiniBand drivers and configurations
Customer Success Team
Customer Success Team
WEKA core
Network planning
Configure the networking
mounting filesystems using stateless clients