Planning a Weka System Installation


Planning a Weka system is essential before starting the actual installation process. It involves planning the following:

  1. Total SSD net capacity and performance requirements

  2. SSD resources

  3. Memory resources

  4. CPU resources

  5. Network

Note: When implementing an AWS configuration, you can use the Self-Service Portal in start.weka.io to automatically map capacity and performance requirements to various configurations.

Total SSD net capacity and performance planning

A Weka system cluster runs on a group of hosts with local SSDs. To plan these hosts, the following information must be clarified and defined:

  1. Capacity: Plan your net SSD capacity. Data management to object stores can be added after installation; at the planning stage, only the SSD capacity is required.

  2. Redundancy scheme: Define the optimal redundancy scheme required for the Weka system, as explained in Selecting a Redundancy Scheme.

  3. Failure domains: Determine whether failure domains will be used (optional). If so, determine the number of failure domains and the potential number of hosts in each, as described in Failure Domains, and plan accordingly.

  4. Hot spare: Define the required hot spare count, as described in Hot Spare.

Once all this data is clarified, you can plan the SSD net storage capacity accordingly, as defined in the SSD Capacity Management formula. You should also have the following information, which will be used during the installation process:

  1. Cluster size (number of hosts).

  2. SSD capacity for each host, e.g., 12 hosts with a capacity of 6 TB each.

  3. Planned protection scheme, e.g., 6+2.

  4. Planned failure domains (optional).

  5. Planned hot spare.

Note: This is an iterative process. Depending on the scenario, some options can be fixed constraints while others are flexible.
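
How these constraints interact can be explored with a short script. The following is a simplified sketch, not the authoritative formula (see the SSD Capacity Management formula): it assumes hot-spare capacity is reserved first and the remainder is scaled by the data-to-total stripe ratio of the protection scheme.

```python
def net_ssd_capacity_tb(raw_tb: float, num_hosts: int,
                        data_drives: int, parity_drives: int,
                        hot_spares: int) -> float:
    """Rough net-capacity estimate (illustrative, not the exact formula).

    Hot-spare capacity (in units of whole hosts) is reserved first,
    then the remaining raw capacity is scaled by the data/(data+parity)
    stripe ratio of the protection scheme.
    """
    per_host_tb = raw_tb / num_hosts
    usable_raw_tb = raw_tb - hot_spares * per_host_tb
    return usable_raw_tb * data_drives / (data_drives + parity_drives)

# 12 hosts with 6 TB each, 6+2 protection, 1 hot spare
print(f"{net_ssd_capacity_tb(72, 12, 6, 2, 1):.1f} TB net")  # 49.5 TB
```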

SSD resource planning

SSD resource planning involves determining how the defined capacity will be implemented across the SSDs. For each host, determine the following:

  • The number of SSDs and the capacity of each SSD (their product should satisfy the required capacity per host).

  • The technology to be used (NVMe, SAS, or SATA) and the specific SSD models, which have implications for SSD endurance and performance.

Note: For on-premises planning, you can consult the Weka Support Team to map performance requirements to a recommended Weka system configuration.

Memory resource planning

Backend hosts memory requirements

The total per-host memory requirement is the sum of the following:

  • Fixed: 2.3 GB

  • Frontend cores: 2.3 GB x # of Frontend cores

  • Compute cores: 3.3 GB x # of Compute cores

  • Drive cores: 2.3 GB x # of Drive cores

  • SSD capacity management: HostSSDSize / 10,000, where HostSSDSize = total SSD raw capacity / # of hosts

  • Operating system: the maximum of 8 GB and 2% of the total RAM

  • Additional protocols (NFS/SMB/S3): 8 GB

  • RDMA: 2 GB

  • Metadata (pointers): 20 bytes x # of metadata units per host. See Metadata units calculation.

Note: The memory requirements are conservative and can be reduced in some situations, such as in systems with mostly large files or files of 4 KB in size. Contact the Customer Success Team for an estimate for your specific configuration.

Example 1: A system with large files

A system with 16 hosts with the following details:

  • Number of Frontend cores: 1

  • Number of Compute cores: 13

  • Number of Drive cores: 6

  • Total raw capacity: 983 TB

  • Total net capacity: 755 TB

  • NFS/SMB services

  • RDMA

  • Average file size: 1 MB (potentially up to 755 million files for all hosts; ~47 million files per host)

Calculations:

  • Frontend cores: 1 x 2.3 = 2.3 GB

  • Compute cores: 13 x 3.3 = 42.9 GB

  • Drive cores: 6 x 2.3 = 13.8 GB

  • SSD capacity management: 983 TB / 16 / 10,000 = ~6.1 GB

  • Metadata: 20 Bytes x 47 million files x 2 units = ~1.9 GB

Total memory requirement per host = 2.3 + 2.3 + 42.9 + 13.8 + 6.1 + 8 + 2 + 1.9 = ~79 GB

Example 2: A system with small files

For the same system as in example 1, but with smaller files, the required memory for metadata would be larger.

For an average file size of 64 KB, the number of files is potentially up to ~12 billion for all hosts; ~750 million files per host.

Required memory for metadata: 20 Bytes x 750 million files x 1 unit = ~15 GB

Total memory requirement per host = 2.3 + 2.3 + 42.9 + 13.8 + 6.1 + 8 + 2 + 15 = ~92 GB
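
These examples can be reproduced with a small calculator built directly from the table above (the function name and structure are illustrative, not a Weka tool). Note that operating-system memory, sized separately as the maximum of 8 GB and 2% of RAM, is not part of this per-host Weka sum:

```python
def weka_backend_memory_gb(frontend_cores: int, compute_cores: int,
                           drive_cores: int, raw_capacity_tb: float,
                           num_hosts: int, metadata_units_per_host: float,
                           protocols: bool = False, rdma: bool = False) -> float:
    """Per-host Weka memory per the table above (OS memory is extra)."""
    gb = 2.3                               # fixed
    gb += 2.3 * frontend_cores             # frontend cores
    gb += 3.3 * compute_cores              # compute cores
    gb += 2.3 * drive_cores                # drive cores
    gb += (raw_capacity_tb / num_hosts) * 1000 / 10_000  # HostSSDSize / 10,000
    if protocols:
        gb += 8                            # NFS/SMB/S3
    if rdma:
        gb += 2                            # RDMA
    gb += metadata_units_per_host * 20e-9  # 20 bytes per metadata unit
    return gb

# Example 1: 47 million files per host, 2 metadata units per file
print(round(weka_backend_memory_gb(1, 13, 6, 983, 16, 47e6 * 2,
                                   protocols=True, rdma=True)))  # ~79
```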

Client hosts memory requirements

The Weka software on a client host requires an additional 4 GB of memory.

CPU resource planning

CPU allocation strategy

The Weka system implements a Non-Uniform Memory Access (NUMA)-aware CPU allocation strategy to maximize overall system performance. Core allocation uses all NUMA nodes equally to balance memory usage across them.

Note the following about the CPU allocation strategy:

  • Weka allocates CPU resources by assigning individual cores to tasks in a cgroup.

  • Cores in a Weka cgroup are not available to run other user processes.

  • On systems with Intel hyperthreading enabled, the corresponding sibling cores are placed into the cgroup along with the physical ones.
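
To illustrate the balancing idea only (this is not Weka's actual allocator), a round-robin pick across NUMA nodes could look like this:

```python
from itertools import chain, zip_longest

# Hypothetical topology: NUMA node ID -> physical core IDs available to Weka.
numa_topology = {0: [1, 2, 3, 4, 5], 1: [6, 7, 8, 9, 10]}

def allocate_cores(topology: dict[int, list[int]], count: int) -> list[int]:
    """Interleave core IDs node by node (round-robin) and take the first
    `count`, so the allocation draws evenly from every NUMA node."""
    interleaved = chain.from_iterable(zip_longest(*topology.values()))
    cores = [c for c in interleaved if c is not None]
    if count > len(cores):
        raise ValueError("not enough available cores")
    return cores[:count]

print(allocate_cores(numa_topology, 6))  # [1, 6, 2, 7, 3, 8]
```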

Backend hosts

The number of physical cores dedicated to the Weka software should be planned according to the following guidelines and limitations:

  • At least one physical core should be dedicated to the operating system; the rest can be allocated to the Weka software.

    • In general, it is recommended to allocate as many cores as possible to the Weka system.

    • No more than 19 physical cores can be assigned to Weka system processes.

  • Enough cores should be allocated to support performance targets.

    • In general, use 1 drive core per SSD for up to 6 SSDs, and 1 drive core per 2 SSDs beyond that, with a ratio of 2 compute cores per drive core (see the sketch after this list).

    • For finer tuning, please contact the Weka Support Team.

  • Enough memory should be allocated to match core allocation, as discussed above.

  • Running other applications on the same host (a converged Weka system deployment) is supported; however, it is not covered in this documentation. For further information, contact the Weka Support Team.
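
The sizing rule above can be sketched as follows. This is a starting point only: it assumes one reading of the rule (drive cores at 1:1 for up to 6 SSDs, then 1:2 for the remainder) plus a single frontend core, both of which are assumptions for illustration; consult the Weka Support Team for actual sizing.

```python
import math

def suggest_core_split(num_ssds: int, host_physical_cores: int) -> dict[str, int]:
    """Starting-point core split per the guidelines above (illustrative)."""
    # 1 drive core per SSD up to 6; 1 per 2 SSDs beyond that (assumed reading).
    drive = num_ssds if num_ssds <= 6 else 6 + math.ceil((num_ssds - 6) / 2)
    compute = 2 * drive                        # 2 compute cores per drive core
    frontend = 1                               # assumption: single frontend core
    weka_total = drive + compute + frontend
    budget = min(19, host_physical_cores - 1)  # 19-core cap; >=1 core for the OS
    if weka_total > budget:
        raise ValueError("core budget exceeded; adjust allocation manually")
    return {"drive": drive, "compute": compute, "frontend": frontend,
            "os": host_physical_cores - weka_total}

print(suggest_core_split(num_ssds=6, host_physical_cores=24))
# {'drive': 6, 'compute': 12, 'frontend': 1, 'os': 5}
```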

Client hosts

On a client host, the Weka software consumes a single physical core by default. If the client host is configured with hyperthreading, the Weka software consumes two logical cores.

If client networking is configured to use UDP, no cores are dedicated to Weka; the operating system schedules CPU for the Weka processes as it does for any other process.

Network planning

Backend servers

WEKA backend servers support connections to both InfiniBand and Ethernet networks. When deploying backend servers, ensure that all servers in the WEKA system use the same network technology for each network type.

If both InfiniBand and Ethernet connections are configured, the WEKA system prioritizes InfiniBand links for data traffic. However, if there is a connectivity issue with the InfiniBand network, the system automatically switches to using Ethernet links as a fallback. Clients can connect to the WEKA system over either InfiniBand or Ethernet.

A network port can be dedicated exclusively to the WEKA system or shared between the WEKA system and other applications.

Clients

Clients can be configured with networking as described above to achieve the highest performance and lowest latency; however, this setup requires compatible hardware and dedicated CPU core resources. If compatible hardware is not available or a dedicated CPU core cannot be allocated to the WEKA system, client networking can instead be configured to use the kernel’s UDP service. This configuration results in reduced performance and increased latency.
