WEKA v4.1 documentation

Filesystems, object stores, and filesystem groups

This page describes the three types of entities relevant to data storage in the WEKA system: filesystems, object stores, and filesystem groups.

About filesystems

A WEKA filesystem is similar to a regular on-disk filesystem, except that it is distributed across all the servers in the cluster. Consequently, filesystems are not associated with any physical object in the WEKA system and act as root directories with space limitations.

The system supports up to 1024 filesystems, all of which are equally balanced across all SSDs and CPU cores assigned to the system. This means that allocating a new filesystem or resizing an existing one is an instant management operation performed without constraints.

A filesystem has a defined capacity limit and is associated with a predefined filesystem group. A filesystem that belongs to a tiered filesystem group must have a total capacity limit and an SSD capacity cap. The combined SSD capacity of all filesystems cannot exceed the total SSD net capacity.

Thin provisioning

Thin provisioning is a method of on-demand SSD capacity allocation based on user requirements. In thin provisioning, the filesystem capacity is defined by a minimum guaranteed capacity and a maximum capacity, which can virtually exceed the available SSD capacity.

The system allocates more capacity (up to the total available SSD capacity) for users who consume their allocated minimum capacity. Alternatively, when they free up space by deleting files or transferring data, the idle space is reclaimed, repurposed, and used for other workloads that need SSD capacity.

Thin provisioning is beneficial in various use cases:

  • Tiered filesystems: On tiered filesystems, available SSD capacity is leveraged for extra performance and released to the object store once needed by other filesystems.

  • Auto-scaling groups: When using auto-scaling groups, thin provisioning can help to automatically expand and shrink the filesystem's SSD capacity for extra performance.

  • Separation of projects to filesystems: If it is required to create a separate filesystem for each project, and the administrator doesn't expect all filesystems to be fully utilized simultaneously, creating a thin provisioned filesystem for each project is a good solution. Each filesystem is allocated with a minimum capacity but can consume more when needed based on the actual available SSD capacity.
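The relationship between minimum guaranteed capacity, maximum capacity, and the shared SSD pool can be sketched as follows. This is a simplified model for illustration only, not the WEKA allocation logic: the `ThinFilesystem` class, the `guaranteed_total` and `growth_headroom` functions, and the example capacities are all hypothetical. The model assumes that every filesystem always keeps its minimum reserved and that any remaining net SSD capacity is shared.

```python
from dataclasses import dataclass


@dataclass
class ThinFilesystem:
    """Hypothetical model of a thin-provisioned filesystem (capacities in TB)."""
    name: str
    min_ssd_tb: float   # minimum guaranteed SSD capacity
    max_ssd_tb: float   # maximum SSD capacity; may exceed the physical pool
    used_ssd_tb: float  # SSD capacity currently consumed


def guaranteed_total(filesystems):
    """Sum of the minimum guarantees; it must fit in the net SSD capacity."""
    return sum(fs.min_ssd_tb for fs in filesystems)


def growth_headroom(fs, filesystems, net_ssd_tb):
    """SSD capacity that `fs` could still consume under this assumed model."""
    # Each filesystem holds at least its minimum, or its usage if that is larger.
    committed = sum(max(f.min_ssd_tb, f.used_ssd_tb) for f in filesystems)
    shared_pool = max(net_ssd_tb - committed, 0.0)
    # Part of the filesystem's own guarantee may still be unused.
    own_reserve = max(fs.min_ssd_tb - fs.used_ssd_tb, 0.0)
    return min(fs.max_ssd_tb - fs.used_ssd_tb, own_reserve + shared_pool)


# Example: three project filesystems sharing 100 TB of net SSD capacity.
NET_SSD_TB = 100.0
project_a = ThinFilesystem("project-a", min_ssd_tb=20, max_ssd_tb=80, used_ssd_tb=30)
project_b = ThinFilesystem("project-b", min_ssd_tb=20, max_ssd_tb=80, used_ssd_tb=5)
project_c = ThinFilesystem("project-c", min_ssd_tb=10, max_ssd_tb=200, used_ssd_tb=10)
projects = [project_a, project_b, project_c]

assert guaranteed_total(projects) <= NET_SSD_TB
for fs in projects:
    print(fs.name, growth_headroom(fs, projects, NET_SSD_TB))
```

In this example, `project-c` has a maximum capacity larger than the whole pool, yet it can only grow by whatever capacity is actually free at the moment.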

Filesystem limits

  • Number of files or directories: Up to 6.4 trillion (6.4 * 10^12)

  • Number of files in a single directory: Up to 6.4 billion (6.4 * 10^9)

  • Total capacity with object store: Up to 14 EB

  • Total SSD capacity: Up to 512 PB

  • File size: Up to 4 PB

Encrypted filesystems

Both data at rest (residing on SSD and object store) and data in transit can be encrypted. This is achieved by enabling the filesystem encryption feature. Whether a filesystem is encrypted is decided when the filesystem is created.

To create encrypted filesystems, deploy a Key Management System (KMS).

Note: You can only set the data encryption when creating a filesystem.

Related topics

KMS management

Metadata limitations

In addition to the capacity limitation, each filesystem has a limitation on the amount of metadata. The system-wide metadata limit is determined by the SSD capacity allocated to the WEKA system and the RAM resources allocated to the WEKA system processes.

The WEKA system tracks metadata units in RAM. When the RAM limit is reached, the system pages these metadata tracking units to the SSD and raises an alert. Because the system keeps serving IOs with a minimal performance impact, the administrator has enough time to increase system resources.

By default, the metadata limit associated with a filesystem is proportional to the filesystem SSD size. It is possible to override this default by defining a filesystem-specific max-files parameter. The filesystem limit is a logical limit to control the specific filesystem usage and can be updated by the administrator when necessary.

The sum of the metadata limits of all filesystems can exceed the amount of metadata that fits in the RAM of the entire system. In that case, to minimize the impact, the least-recently-used units are paged to disk as necessary.
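The least-recently-used paging behavior described above can be illustrated with a generic LRU sketch. This is not the WEKA implementation: the `MetadataCache` class, its attribute names, and the unit identifiers are hypothetical, and "SSD" here is simply a second dictionary.

```python
from collections import OrderedDict


class MetadataCache:
    """Generic LRU model: a bounded RAM tier that spills to an SSD tier."""

    def __init__(self, ram_capacity_units):
        self.ram_capacity_units = ram_capacity_units
        self.in_ram = OrderedDict()  # unit id -> payload, least recently used first
        self.on_ssd = {}             # units paged out of RAM

    def access(self, unit_id):
        if unit_id in self.in_ram:
            # Mark the unit as most recently used.
            self.in_ram.move_to_end(unit_id)
            return
        # Bring the unit into RAM (paging it back from SSD if it was evicted).
        self.in_ram[unit_id] = self.on_ssd.pop(unit_id, None)
        # Page out least-recently-used units until RAM fits again.
        while len(self.in_ram) > self.ram_capacity_units:
            victim_id, payload = self.in_ram.popitem(last=False)
            self.on_ssd[victim_id] = payload


cache = MetadataCache(ram_capacity_units=2)
for unit in ["a", "b", "a", "c"]:
    cache.access(unit)

print(list(cache.in_ram))  # units still tracked in RAM
print(list(cache.on_ssd))  # units paged to SSD
```

After the four accesses, unit `b` is the one paged out, because `a` was touched again before `c` arrived.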

Metadata units calculation

Each metadata unit consumes 4 KB of SSD space (not tiered) and 20 bytes of RAM.
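As a worked example of these figures, the following sketch estimates the SSD and RAM footprint of a given number of metadata units. The function and constant names are illustrative, and the sketch assumes that 4 KB means 4096 bytes.

```python
METADATA_UNIT_SSD_BYTES = 4 * 1024  # 4 KB per unit (assumed to be 4096 bytes)
METADATA_UNIT_RAM_BYTES = 20        # 20 bytes of RAM per unit


def metadata_footprint(units):
    """Return (ssd_bytes, ram_bytes) consumed by `units` metadata units."""
    return units * METADATA_UNIT_SSD_BYTES, units * METADATA_UNIT_RAM_BYTES


ssd_bytes, ram_bytes = metadata_footprint(1_000_000_000)
print(f"SSD: {ssd_bytes / 10**12:.3f} TB, RAM: {ram_bytes / 10**9:.0f} GB")
```

Under these assumptions, one billion metadata units take about 4.1 TB of SSD space and 20 GB of RAM.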

Throughout this documentation, the metadata limitation per filesystem is referred to as the max-files parameter, which specifies the number of metadata units (not the number of files). This parameter encapsulates both the file count and the file sizes.

The following table specifies the required metadata units according to the file size. These specifications apply to files residing on SSDs or tiered to object stores.

| File size | Number of metadata units | Example |
|---|---|---|
| < 0.5 MB | 1 | A filesystem with 1 billion files of 64 KB each requires 1 billion metadata units. |
| 0.5 MB - 1 MB | 2 | A filesystem with 1 million files of 750 KB each requires 2 million metadata units. |
| > 1 MB | 2 for the first 1 MB, plus 1 for each additional MB or part of one | A filesystem with 1 million files of 129 MB each requires 130 million metadata units (2 units for the first 1 MB plus 1 unit per MB for the remaining 128 MB). A filesystem with 10 million files of 1.5 MB each requires 30 million units. A filesystem with 10 million files of 3 MB each requires 40 million units. |

Each directory requires two metadata units, compared with one for a small file.
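The rules above can be sketched as a small calculator for estimating a max-files value. The function names are illustrative. The sketch assumes binary units (1 MB = 1024 * 1024 bytes) and rounds a partial MB beyond the first up to a full unit, which is inferred from the 1.5 MB example.

```python
MB = 1024 * 1024       # assumption: binary megabytes
DIRECTORY_UNITS = 2    # each directory requires two metadata units


def metadata_units_for_file(size_bytes):
    """Metadata units required by a single file of the given size."""
    if size_bytes < MB // 2:
        return 1
    if size_bytes <= MB:
        return 2
    # 2 units for the first MB, plus 1 per additional MB (partial MB rounded up).
    additional_mb = -(-(size_bytes - MB) // MB)  # integer ceiling division
    return 2 + additional_mb


def required_max_files(file_groups, directories=0):
    """Total units for (size_bytes, file_count) groups plus directories."""
    file_units = sum(metadata_units_for_file(size) * count
                     for size, count in file_groups)
    return file_units + directories * DIRECTORY_UNITS


# The examples from the table:
print(required_max_files([(64 * 1024, 1_000_000_000)]))  # 64 KB files
print(required_max_files([(750 * 1024, 1_000_000)]))     # 750 KB files
print(required_max_files([(129 * MB, 1_000_000)]))       # 129 MB files
print(required_max_files([(3 * MB // 2, 10_000_000)]))   # 1.5 MB files
print(required_max_files([(3 * MB, 10_000_000)]))        # 3 MB files
```

Mixed workloads can be estimated by passing several `(size_bytes, file_count)` groups together with the expected number of directories.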

Related topics

Memory resource planning

About object stores

In the WEKA system, object stores represent an optional external storage medium, ideal for storing warm data. Object stores used in tiered WEKA system configurations can be cloud-based, located in the same location (local), or at a remote location.

WEKA supports object stores for tiering (tiering and local snapshots) and backup (snapshots only). Both tiering and backup can be used for the same filesystem.

Object store buckets are best used when a cost-effective data storage tier is required at a price point that server-based SSDs cannot meet.

An object store bucket definition contains the object store DNS name, bucket identifier, and access credentials. The bucket must be dedicated to the WEKA system and not be accessible by other applications.

Filesystem connectivity to object store buckets is used by the data lifecycle management and Snap-To-Object features.

Related topics

Manage object stores

Data lifecycle management

Snap-To-Object

About filesystem groups

In the WEKA system, filesystems are grouped into a maximum of eight filesystem groups.

Each filesystem group has tiering control parameters. While each tiered filesystem has its own object store, the tiering policy is the same for all tiered filesystems in the same filesystem group.

Related topics

Manage filesystem groups
