Manually configure the WEKA cluster using the resources generator

Detailed workflow for manually configuring the WEKA cluster using the resources generator in a multi-container backend architecture.


Perform this workflow using the resources generator only if you are not using the automated WMS, WSA, or WEKA Configurator.

The resources generator creates three resource files in the /tmp directory on each server: drives0.json, compute0.json, and frontend0.json. You then create the containers on each cluster server from these generated files.
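After the files are generated (step 2 below), you can confirm they exist on every server; a quick check assuming the default /tmp output path and the 8-server naming used in the examples that follow:

pdsh -R ssh -w "weka0-[0-7]" 'ls /tmp/drives0.json /tmp/compute0.json /tmp/frontend0.json'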

Before you begin

  1. Download the resources generator from the WEKA tools GitHub repository to your local server: https://github.com/weka/tools/blob/master/install/resources_generator.py

Example:

wget https://raw.githubusercontent.com/weka/tools/master/install/resources_generator.py
  2. Copy the resources generator from your local server to all servers in the cluster.

Example for a cluster with 8 servers:

for i in {0..7}; do scp resources_generator.py weka0-$i:/tmp/resources_generator.py; done
  3. Make the resources generator executable on all servers in the cluster.

Example for a cluster with 8 servers:

pdsh -R ssh -w "weka0-[0-7]" 'chmod +x /tmp/resources_generator.py'

Workflow

  1. Remove the default container

  2. Generate the resource files

  3. Create drive containers

  4. Create a cluster

  5. Configure the SSD drives

  6. Create compute containers

  7. Create frontend containers

  8. Configure the number of data and parity drives

  9. Configure the number of hot spares

  10. Name the cluster

1. Remove the default container

Command: weka local stop default && weka local rm -f default

Stop and remove the default container that is automatically created on each server.
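To run this on all servers at once, you can reuse the pdsh pattern from the preparation steps; a sketch assuming the same weka0-0 through weka0-7 hostnames:

pdsh -R ssh -w "weka0-[0-7]" 'weka local stop default && weka local rm -f default'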

2. Generate the resource files

Command: resources_generator.py

To generate the resource files for the drive, compute, and frontend processes, run the following command on each backend server:

./resources_generator.py --net <net-devices> [options]

The resources generator allocates cores, memory, and other resources according to the values specified in its parameters.

The best practice for resource allocation is as follows:

  • 1 drive core per NVMe device (SSD).

  • 2-3 compute cores per drive core.

  • 1-2 frontend cores if deploying a protocol container. If there is a spare core, it is used for a frontend container.

  • Minimum of 1 core for the OS.

Example 1: according to the best practice

For a server with 24 cores and 6 SSDs, allocate 6 drive cores and 12 compute cores; optionally, use 2 of the remaining cores for the frontend container. The OS uses the remaining 4 cores.

Run the following command: ./resources_generator.py --net eth1 eth2 --drive-dedicated-cores 6 --compute-dedicated-cores 12 --frontend-dedicated-cores 2

Example 2: a server with a limited number of cores

For a server with 14 cores and 6 SSDs, allocate 6 drive cores and 6 compute cores; optionally, use 1 of the remaining cores for the frontend container. The OS uses the remaining core.

Run the following command: ./resources_generator.py --net eth1 eth2 --drive-dedicated-cores 6 --compute-dedicated-cores 6 --frontend-dedicated-cores 1

Contact Professional Services for the recommended resource allocation settings for your system.

Parameters

  • compute-core-ids: The CPUs to allocate for the compute processes. Format: space-separated numbers.

  • compute-dedicated-cores: The number of cores to dedicate for the compute processes. Default: the maximum available cores.

  • compute-memory: The total memory to allocate for the compute processes. Format: a value and unit without a space (examples: 1024B, 10GiB, 5TiB). Default: the maximum available memory.

  • core-ids: The CPUs to allocate for the WEKA processes. Format: space-separated numbers.

  • drive-core-ids: The CPUs to allocate for the drive processes. Format: space-separated numbers.

  • drive-dedicated-cores: The number of cores to dedicate for the drive processes. Default: 1 core per detected drive.

  • drives: The drives to use. This option overrides automatic detection. Format: space-separated strings. Default: all unmounted NVMe devices.

  • frontend-core-ids: The CPUs to allocate for the frontend processes. Format: space-separated numbers.

  • frontend-dedicated-cores: The number of cores to dedicate for the frontend processes. Default: 1.

  • max-cores-per-container: Override the default maximum number of cores per container for IO processes. If provided, the new value must be lower than the default. Default: 19.

  • minimal-memory: Set each container's hugepages memory to 1.4 GiB multiplied by the number of IO processes on the container.

  • net* (required): The network devices to use. Format: space-separated strings.

  • no-rdma: Do not take RDMA support into account when computing memory requirements. Default: False.

  • num-cores: Override the auto-deduction of the number of cores. Default: all available cores.

  • path: The path where the resource files are written. Default: '.'

  • spare-cores: The number of cores to leave for the OS and non-WEKA processes. Default: 1.

  • spare-memory: The memory to reserve for non-WEKA requirements. Format: a value and unit without a space (examples: 10GiB, 1024B, 5TiB). Default: the maximum of 8 GiB and 2% of the total RAM.

  • weka-hugepages-memory: The memory to allocate for the compute, frontend, and drive processes. Format: a value and unit without a space (examples: 10GiB, 1024B, 5TiB). Default: the maximum available memory.
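If you need to pin processes to specific cores rather than let the generator choose, you can combine the core-ids parameters; a hypothetical invocation for the 24-core server from Example 1 (the core numbering is illustrative, leaving core 0 and the last cores for the OS):

./resources_generator.py --net eth1 eth2 --drive-core-ids 1 2 3 4 5 6 --compute-core-ids 7 8 9 10 11 12 13 14 15 16 17 18 --frontend-core-ids 19 20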

3. Create drive containers

Command: weka local setup container

For each server in the cluster, create the drive containers using the resource generator output file drives0.json.

The drives JSON file includes all the values required to create the drive containers. Only the path to the JSON resource file is needed (before the cluster is created, the optional join-ips parameter is not relevant).

weka local setup container --resources-path <resources-path>/drives0.json

Parameters

  • resources-path* (required): A valid path to the resource file.
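To create the drive containers on all servers at once, you can use the same pdsh pattern as in the preparation steps; a sketch assuming the resource files are in /tmp on hosts weka0-0 through weka0-7:

pdsh -R ssh -w "weka0-[0-7]" 'weka local setup container --resources-path /tmp/drives0.json'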

4. Create a cluster

Command: weka cluster create

To create a cluster from the allocated drive containers, use the following command:

weka cluster create <hostnames> [--host-ips <ips | ip+ip+ip+ip>]

Parameters

  • hostnames* (required): Hostnames or IP addresses of the cluster servers. If the drive containers do not use the default port 14000, you can specify hostname:port or ip:port. Minimum cluster size: 6. Format: space-separated strings.

  • host-ips: IP addresses of the management interfaces. For an HA configuration, use a list of ip+ip address pairs (two cards per container). If the cluster is connected to both InfiniBand and Ethernet, you can set up to 4 management IPs per container (ip+ip+ip+ip) for redundancy of both networks. Provide the same number of values as in hostnames. Format: comma-separated IP addresses. Default: the IP of the first network device of the container.

Notes:

  • You can use either a hostname or an IP address. This string serves as the container's identifier in subsequent commands.

  • If a hostname is used, ensure the hostname to IP resolution mechanism is reliable.

  • Once the cluster creation is successfully completed, the cluster is in the initialization phase, and some commands can only run in this phase.

  • To configure high availability (HA), at least two cards must be defined for each container.

  • On successful cluster formation, every container receives a container ID. To display the list of containers and their IDs, run weka cluster container.

  • In InfiniBand installations, the --host-ips parameter must specify the IP addresses of the IPoIB interfaces.
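A hypothetical end-to-end example for the 8-server cluster used in the earlier examples, with ip+ip HA pairs for the management interfaces (all hostnames and addresses are illustrative):

weka cluster create weka0-0 weka0-1 weka0-2 weka0-3 weka0-4 weka0-5 weka0-6 weka0-7 --host-ips 10.0.0.10+10.1.0.10,10.0.0.11+10.1.0.11,10.0.0.12+10.1.0.12,10.0.0.13+10.1.0.13,10.0.0.14+10.1.0.14,10.0.0.15+10.1.0.15,10.0.0.16+10.1.0.16,10.0.0.17+10.1.0.17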

5. Configure the SSD drives

Command: weka cluster drive add

To configure the SSD drives on each server in the cluster, use the following command (you can specify multiple device paths in a single invocation):

weka cluster drive add <container-id> <device-paths>

Parameters

  • container-id* (required): The identifier of the drive container to which the local SSD drives are added.

  • device-paths* (required): The block devices that identify the local SSDs. Each must be a valid Unix block device name. Format: space-separated strings. Example: /dev/nvme0n1 /dev/nvme1n1
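A hypothetical example, assuming the eight drive containers received container IDs 0 through 7 (verify with weka cluster container) and every server exposes the same two NVMe devices:

for i in {0..7}; do weka cluster drive add $i /dev/nvme0n1 /dev/nvme1n1; done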

6. Create compute containers

Command: weka local setup container

For each server in the cluster, create the compute containers using the resource generator output file compute0.json.

weka local setup container --join-ips <IP addresses> --resources-path <resources-path>/compute0.json

Parameters

  • resources-path* (required): A valid path to the resource file.

  • join-ips: IP:port pairs of existing management processes that the new container joins to form the cluster. If no port is specified, the default WEKA port 14000 is used; set ports only if you customized them. Format: comma-separated IP addresses. Example: --join-ips 10.10.10.1,10.10.10.2,10.10.10.3:15000
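As in step 3, you can run this on all servers with one pdsh invocation; a sketch assuming hypothetical management IPs 10.0.0.10 through 10.0.0.12 and resource files in /tmp:

pdsh -R ssh -w "weka0-[0-7]" 'weka local setup container --join-ips 10.0.0.10,10.0.0.11,10.0.0.12 --resources-path /tmp/compute0.json'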

7. Create frontend containers

Command: weka local setup container

For each server in the cluster, create the frontend containers using the resource generator output file frontend0.json.

weka local setup container --join-ips <IP addresses> --resources-path <resources-path>/frontend0.json

Parameters

  • resources-path* (required): A valid path to the resource file.

  • join-ips: IP:port pairs of existing management processes that the new container joins to form the cluster. If no port is specified, the default WEKA port 14000 is used; set ports only if you customized them. Format: comma-separated IP addresses. Example: --join-ips 10.10.10.1,10.10.10.2,10.10.10.3:15000

8. Configure the number of data and parity drives

Command: weka cluster update --data-drives=<count> --parity-drives=<count>

Example: weka cluster update --data-drives=4 --parity-drives=2

9. Configure the number of hot spares

Command: weka cluster hot-spare <count>

Example: weka cluster hot-spare 1

10. Name the cluster

Command: weka cluster update --cluster-name=<cluster name>
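Example, assuming you name the cluster cluster01 (the name is illustrative):

weka cluster update --cluster-name=cluster01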

What to do next?

Perform post-configuration procedures
