Data management in tiered filesystems

This page describes the system behavior when tiering, accessing or deleting data in tiered filesystems.

Overview

In tiered filesystems, the WEKA system optimizes storage efficiency and manages storage resources effectively by:

  • Tiering only infrequently accessed portions of files (warm data), keeping hot data on SSDs.

  • Efficiently bundling subsets of different files into objects of up to 64 MB and tiering them to object stores, resulting in significant performance enhancements.

  • Retrieving only the necessary data from the object store when accessing it, rather than the entire object it was originally tiered with.

  • Reclaiming logically freed data, that is, data that has been modified or deleted and is not used by any snapshots. Reclamation frees storage space previously allocated to data that is no longer needed.

Only data that is not logically freed is considered for licensing purposes.

SSD space reclamation in tiered filesystems

For logically freed data that resides on the SSD, the WEKA system immediately deletes the data from the SSD and leaves the physical space reclamation to the SSD erasure technique.

Object store space reclamation in tiered filesystems

Object store space reclamation frees the object storage capacity held by data that has been logically freed.

In the WEKA system, object store space reclamation is only relevant for object store buckets used for tiering (local) and not for buckets used for backup only (remote).

WEKA organizes sections of files into objects for tiering, with the default maximum object size capped at 64 MB. Each object can contain data from multiple files. Files smaller than 1 MB are consolidated into a single object, while larger files are distributed across multiple objects. When a file is deleted (or updated and is not used by any snapshots), the space within one or more objects is marked as available for reclamation. However, these objects are only deleted under specific conditions.

An object is deleted in one of two cases: when all files associated with it are deleted, which reclaims the entire object, or during the reclamation process. Reclamation entails reading an eligible object from object storage and packing its active portions (representing data from undeleted files) with sections from other files that must be written to the object store. The resulting object is then written back to the object store, freeing up the reclaimed space.

WEKA automates the reclamation process by monitoring the filesystems. When the reclaimable space within a filesystem exceeds 13%, the reclamation process begins and continues until the total reclaimable space drops below 7%. This mechanism limits write amplification, allows time for larger portions of eligible 64 MB objects to become logically free, and avoids unnecessary object storage workload for small space reclamation. Note that reclamation is only executed for objects in which more than 5% of the space is reclaimable.

To calculate the amount of space that can be reclaimed, consider the following examples:

  1. If we write 1 TB of data, and 15% of that space can be reclaimed, we have 150 GB of reclaimable space.

  2. If we write 10 TB of data, and 5% of that space can be reclaimed, we have 500 GB of reclaimable space.

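The starting point for the reclamation process differs in each example. In example 1, reclamation begins once the reclaimable space exceeds 130 GB (13% of 1 TB). In example 2, it doesn't start, because 5% is below the 13% threshold; it would start only once the reclaimable space exceeds 1.3 TB (13% of 10 TB). This is important to note because even though there is more total reclaimable space in example 2, the process starts later.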
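The threshold arithmetic in the two examples can be sketched in a few lines of Python. This is an illustration only, not WEKA code: the function and constant names are hypothetical, the 13% and 7% values are the defaults quoted above, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Illustrative sketch of the start/stop thresholds described above.
# Names are hypothetical; percentages are the defaults quoted in the text.
START_PCT = 13.0  # reclamation begins above this share of reclaimable space
STOP_PCT = 7.0    # reclamation stops once the share drops below this value


def start_point_gb(written_gb, start_pct=START_PCT):
    """Reclaimable space (GB) at which reclamation begins."""
    return written_gb * start_pct / 100


def reclamation_active(was_active, reclaimable_pct,
                       start_pct=START_PCT, stop_pct=STOP_PCT):
    """Return whether reclamation runs, given its previous state."""
    if reclaimable_pct > start_pct:
        return True
    if reclaimable_pct < stop_pct:
        return False
    # Between the two thresholds the previous state is kept.
    return was_active


# The two examples from the text, expressed in GB (assuming 1 TB = 1000 GB).
examples = [("example 1", 1_000, 150), ("example 2", 10_000, 500)]
for name, written_gb, reclaimable_gb in examples:
    pct = reclaimable_gb * 100 / written_gb
    print(name, "starts at", start_point_gb(written_gb), "GB;",
          "active:", reclamation_active(False, pct))
```

Keeping the previous state between 7% and 13% is what stops the process from switching on and off around a single threshold.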

For regular filesystems where files are frequently deleted or updated, this behavior can result in the consumption of 7% to 13% more object store space than initially expected based on the total size of all files written to that filesystem. When planning object storage capacity or configuring usage alerts, it's essential to account for this additional space. Remember that this percentage may increase during periods of high object store usage or when data/snapshots are frequently deleted. Over time, it will return to the normal threshold as the load/burst is reduced.
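For capacity planning, the 7% to 13% allowance can be turned into a simple range estimate. The following is a minimal sketch under the assumption that the overhead applies directly to the total size of the files written; the function name is hypothetical, and the result does not cover the temporary increases during bursts mentioned above.

```python
# Hypothetical planning helper: expected object store consumption range
# for a filesystem with frequent deletes and updates.
def object_store_capacity_range(total_files_tb, low_pct=7.0, high_pct=13.0):
    """Return (low, high) expected object store usage in TB."""
    low = total_files_tb * (100 + low_pct) / 100
    high = total_files_tb * (100 + high_pct) / 100
    return low, high


low_tb, high_tb = object_store_capacity_range(20)
print(f"Plan for between {low_tb} TB and {high_tb} TB of object storage")
```

A usage alert set below the upper bound is likely to fire during normal operation, so the upper bound is the safer figure to plan against.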

If the filesystem was created from a snapshot, only the data uploaded to the object store after the new filesystem was created can be reclaimed. Pre-existing data from the original snapshot is unreclaimable. To ensure all data is reclaimable, migrate the restored filesystem to a new bucket. For details, see Attach or detach object store buckets.

If you need to tune how the system interacts with the object store (for example, the object size or the reclamation thresholds), or if object store space reclamation is not fast enough for the workload, contact the Customer Success Team.

View object store bucket capacity details

Run the weka fs tier capacity command to retrieve a comprehensive listing of data capacities associated with object store buckets per filesystem.

If the filesystem was created from an uploaded snapshot, data from the original filesystem is not accounted for in the displayed capacity.

Example:

$ weka fs tier capacity
FILESYSTEM  BUCKET               TOTAL CONSUMED CAPACITY   USED CAPACITY   RECLAIMABLE%   RECLAIMABLE THRESHOLD%
bmrb        wekalow-bmrb         0 B                       0 B             0.00           10.00
cam_archive wekalow-archive      20.39 TB                  18.80 TB        7.79           10.00
nmr_backup  wekalow-nmrbackup    519.07 GB                 518.05 GB       0.19           10.00

To list the data capacities of a specific filesystem, add the option --filesystem <filesystem name>.

Example:

$ weka fs tier capacity --filesystem cam_archive
FILESYSTEM  BUCKET               TOTAL CONSUMED CAPACITY   USED CAPACITY   RECLAIMABLE%   RECLAIMABLE THRESHOLD%
cam_archive wekalow-archive      20.39 TB                  18.80 TB        7.79           10.00
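For scripted monitoring, the text output shown above can be parsed into records. The sketch below assumes the layout of the sample output (eight whitespace-separated tokens per data row) and decimal unit multipliers; neither is a documented format guarantee, and the names used are hypothetical.

```python
from dataclasses import dataclass

# Assumption: decimal multipliers. The unit convention is not stated in the text.
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}

SAMPLE = """\
FILESYSTEM  BUCKET               TOTAL CONSUMED CAPACITY   USED CAPACITY   RECLAIMABLE%   RECLAIMABLE THRESHOLD%
bmrb        wekalow-bmrb         0 B                       0 B             0.00           10.00
cam_archive wekalow-archive      20.39 TB                  18.80 TB        7.79           10.00
nmr_backup  wekalow-nmrbackup    519.07 GB                 518.05 GB       0.19           10.00
"""


@dataclass
class TierCapacity:
    filesystem: str
    bucket: str
    total_consumed_bytes: float
    used_bytes: float
    reclaimable_pct: float
    threshold_pct: float


def parse_tier_capacity(text):
    """Parse output shaped like the sample above into TierCapacity rows."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows have exactly 8 tokens; the header and prompt lines do not.
        if len(tokens) != 8 or tokens[0] == "FILESYSTEM":
            continue
        fs, bucket, total, total_unit, used, used_unit, reclaimable, threshold = tokens
        rows.append(TierCapacity(
            filesystem=fs,
            bucket=bucket,
            total_consumed_bytes=float(total) * UNITS[total_unit],
            used_bytes=float(used) * UNITS[used_unit],
            reclaimable_pct=float(reclaimable),
            threshold_pct=float(threshold),
        ))
    return rows


# List filesystems from most to least reclaimable space.
for row in sorted(parse_tier_capacity(SAMPLE), key=lambda r: r.reclaimable_pct, reverse=True):
    print(row.filesystem, row.bucket, row.reclaimable_pct)
```

In practice the text would come from running the command, for example through subprocess, rather than from an embedded sample.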

Object tagging

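When WEKA uploads objects to the object store, it assigns tags to categorize them. These tags are crucial because they enable the customer to implement specific lifecycle management rules in the object store based on the assigned tags. For example, when working with S3 Glacier Deep Archive, you can transfer the objects of a specific filesystem.

To enable upload tags, set the option when adding or updating the object store bucket. For details, see the following:

  • Using the GUI: Add an object store bucket, or Edit an object store bucket, by selecting Enable Upload Tags in the Advanced section.

  • Using the CLI: Add an object store bucket, or Edit an object store bucket, by setting the enable-upload-tags parameter in weka fs tier s3 add/update commands.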

The following table indicates the additional tags WEKA adds to the object when using object tagging:

Tag             Description
wekaBlobType    The WEKA-internal type representation of the object. Possible values: DATA, METADATA, METAMETADATA, LOCATOR, RELOCATIONS
wekaFsId        A unique filesystem ID (a combination of the filesystem ID and the cluster GUID).
wekaGuid        The cluster GUID.
wekaFsName      The filesystem name that uploaded this object.

The object store must support S3 object-tagging and might require additional permissions to use object tagging.

For example, the following extra permissions are required in AWS S3:

  • s3:PutObjectTagging

  • s3:DeleteObjectTagging

Your cloud service provider may apply additional charges.
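As an illustration of a tag-based lifecycle rule on AWS S3, the following sketch builds a rule that filters on the wekaFsName tag and applies it with boto3. The bucket name, rule ID, and the 90-day delay are hypothetical values chosen for the example, and whether such a transition suits a given deployment is not covered here.

```python
# Minimal sketch: an S3 lifecycle rule keyed on the wekaFsName tag.
# Rule ID, day count, and bucket name are hypothetical example values.
def build_lifecycle_rule(fs_name, days, storage_class="DEEP_ARCHIVE"):
    """Build a rule that transitions objects tagged with a filesystem name."""
    return {
        "ID": f"weka-{fs_name}-transition",
        "Filter": {"Tag": {"Key": "wekaFsName", "Value": fs_name}},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


def apply_lifecycle_rule(bucket, rule):
    """Apply the rule. This replaces the bucket's existing lifecycle configuration."""
    import boto3  # imported here so building a rule does not require boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [rule]},
    )


rule = build_lifecycle_rule("cam_archive", days=90)
print(rule)
# apply_lifecycle_rule("wekalow-archive", rule)  # requires AWS credentials
```

Because put_bucket_lifecycle_configuration overwrites the existing configuration, any rules already set on the bucket need to be included in the same call.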

