Data management in tiered filesystems
This page describes the system behavior when tiering, accessing or deleting data in tiered filesystems.
In tiered filesystems, the WEKA system optimizes storage efficiency and manages storage resources effectively by:
- Tiering only infrequently accessed portions of files (warm data), keeping hot data on SSDs.
- Efficiently bundling subsets of different files into 64 MB objects and tiering them to object stores, which significantly improves performance.
- Retrieving only the necessary data from the object store when accessing it, rather than the entire object it was originally tiered with.
- Reclaiming logically freed data, which occurs when data is modified or deleted and is not used by any snapshots. Reclamation is a process of freeing up storage space that was previously allocated to data that is no longer needed.
Only data that is not logically freed is considered for licensing purposes.
For logically freed data that resides on SSD, the WEKA system immediately deletes the data from the SSD and leaves the physical space reclamation to the SSD erasure technique.
Object store space reclamation frees the object store capacity that is still occupied by logically freed data.
In the WEKA system, object store space reclamation is only relevant for object store buckets used for tiering (defined as local) and not for buckets used for backup-only.
WEKA organizes files into 64 MB objects for tiering. Each object can contain data from multiple files. Files smaller than 1 MB are consolidated into a single 64 MB object. For larger files, their parts are distributed across multiple objects. As a result, when a file is deleted (or updated and is not used by any snapshots), the space within one or more objects is marked as available for reclamation. However, the deletion of these objects only occurs under specific conditions.
Related objects are deleted either when all associated files are deleted, which frees the entire object, or during the reclamation process. Reclamation reads an eligible object from the object store and packs its active portions (data from undeleted files) together with sections of other files that need to be written to the object store. The resulting object is then written back to the object store, and the reclaimed space is freed.
WEKA automates the reclamation process by monitoring the filesystems. When the reclaimable space within a filesystem exceeds 13%, the reclamation process begins. It continues until the total reclaimable space drops below 7%. This mechanism prevents write amplification, allows time for larger portions of eligible 64 MB objects to become logically free, and avoids unnecessary object store workload for small space gains. Note that reclamation is only executed for objects in which the reclaimable space exceeds 5% of that object.
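The start and stop thresholds above form a simple hysteresis loop. The following is a minimal sketch of that decision logic, not WEKA's implementation. The function names and constants are hypothetical, the percentages are taken from the text above, and the exact behavior at the boundary values is an assumption.

```python
# Illustrative model only; names are hypothetical, values come from the text.
START_THRESHOLD = 0.13   # reclamation starts above this filesystem-level fraction
STOP_THRESHOLD = 0.07    # reclamation stops once the fraction drops below this
OBJECT_THRESHOLD = 0.05  # an object qualifies only above this reclaimable fraction
OBJECT_SIZE = 64 * 1024**2  # 64 MB tiering object


def next_state(running: bool, reclaimable_fraction: float) -> bool:
    """Return True if reclamation should be running after this check."""
    if running:
        # Keep going until reclaimable space drops below the stop threshold.
        return reclaimable_fraction >= STOP_THRESHOLD
    # Idle: start only when reclaimable space exceeds the start threshold.
    return reclaimable_fraction > START_THRESHOLD


def object_is_eligible(reclaimable_bytes: int, object_bytes: int = OBJECT_SIZE) -> bool:
    """Return True if enough of a single object is reclaimable to repack it."""
    return reclaimable_bytes / object_bytes > OBJECT_THRESHOLD
```

With this model, a filesystem at 10% reclaimable space stays idle if reclamation was not already running, but keeps reclaiming if it was. The gap between the two thresholds is what avoids repeatedly starting and stopping the process.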
To calculate the amount of space that can be reclaimed, consider the following examples:
1. If we write 1 TB of data, and 15% of that space can be reclaimed, we have 150 GB of reclaimable space.
2. If we write 10 TB of data, and 5% of that space can be reclaimed, we have 500 GB of reclaimable space.
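The point at which reclamation starts differs between the two examples. In example 1, reclamation starts once the reclaimable space passes 130 GB (13% of 1 TB), so it is already running at 150 GB. In example 2, reclamation does not start, because 5% is below the 13% threshold; it would start only when the reclaimable space passes 1.3 TB. So although example 2 has more reclaimable space in absolute terms, its reclamation starts at a later point.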
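The arithmetic in these examples can be reproduced with a few lines of Python. This is a sketch for illustration: the function name is hypothetical, sizes use decimal units (1 TB = 1000 GB) as in the examples, and the 13% start threshold is the value stated earlier on this page.

```python
START_THRESHOLD_PCT = 13  # filesystem-level start threshold from the text


def reclamation_summary(written_gb: int, reclaimable_pct: int):
    """Return (reclaimable GB, GB at which reclamation starts, starts now?)."""
    reclaimable_gb = written_gb * reclaimable_pct / 100
    start_at_gb = written_gb * START_THRESHOLD_PCT / 100
    return reclaimable_gb, start_at_gb, reclaimable_pct > START_THRESHOLD_PCT


# Example 1: 1 TB written, 15% reclaimable.
print(reclamation_summary(1000, 15))
# Example 2: 10 TB written, 5% reclaimable.
print(reclamation_summary(10000, 5))
```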
For regular filesystems where files are frequently deleted or updated, this behavior can result in the consumption of 7% to 13% more object store space than initially expected based on the total size of all files written to that filesystem. When planning object storage capacity or configuring usage alerts, it's essential to account for this additional space. Keep in mind that this percentage may increase during periods of high object store usage or when data/snapshots are frequently deleted. Over time, it will return to the normal threshold as the load/burst is reduced.
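For capacity planning, the 7% to 13% overhead can be applied directly to the expected data size. The helper below is a rough sketch with a hypothetical name. It assumes the overhead is calculated on the total size of the files written, and it does not model the temporary increase during bursts that is mentioned above.

```python
def object_store_capacity_range(data_gb: int):
    """Return (low, high) object store capacity in GB for a given data size.

    Applies the 7% and 13% steady-state overheads from the text.
    """
    low = data_gb * (100 + 7) / 100
    high = data_gb * (100 + 13) / 100
    return low, high


# 10 TB of file data (decimal units): plan for roughly 10.7 TB to 11.3 TB.
print(object_store_capacity_range(10000))
```

Setting usage alerts at or above the upper bound avoids alerts caused by normal reclamation behavior.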
If you need to tune how the system interacts with the object store (for example, the object size or the reclamation thresholds), or if object store space reclamation is not fast enough for the workload, contact the Customer Success Team.
Object store space reclamation
You can display the tiered capacity details of the filesystems with the weka fs tier capacity command:
[root@jack-0 ~] 2023-10-21 10:06:46 $ weka fs tier capacity
FILESYSTEM  BUCKET  TOTAL CONSUMED CAPACITY  USED CAPACITY  RECLAIMABLE%  RECLAIMABLE THRESHOLD%
default     prod    13.29 GB                 13.29 GB       0.00          10.00
fs01        logs    1.05 GB                  1.05 GB        0.00          10.00
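For monitoring or alerting, this output can be parsed in a script. The sketch below assumes the plain-text column layout is exactly as in the sample above (six columns, with capacities written as a number followed by a unit). The class and function names are hypothetical. If the CLI offers a machine-readable output format, that is likely a more robust choice.

```python
import re
from typing import NamedTuple


class TierCapacity(NamedTuple):
    filesystem: str
    bucket: str
    total_consumed: str
    used: str
    reclaimable_pct: float
    threshold_pct: float


# One data row: name, bucket, two "<number> <unit>" capacities, two percentages.
ROW = re.compile(
    r"^(\S+)\s+(\S+)\s+([\d.]+\s+\w+)\s+([\d.]+\s+\w+)\s+([\d.]+)\s+([\d.]+)$"
)


def parse_tier_capacity(output: str) -> list[TierCapacity]:
    """Parse data rows, skipping the prompt, header, and blank lines."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue
        fs, bucket, total, used, reclaimable, threshold = match.groups()
        rows.append(
            TierCapacity(fs, bucket, total, used, float(reclaimable), float(threshold))
        )
    return rows


SAMPLE = """\
FILESYSTEM BUCKET TOTAL CONSUMED CAPACITY USED CAPACITY RECLAIMABLE% RECLAIMABLE THRESHOLD%
default prod 13.29 GB 13.29 GB 0.00 10.00
fs01 logs 1.05 GB 1.05 GB 0.00 10.00
"""

for row in parse_tier_capacity(SAMPLE):
    print(row.filesystem, row.reclaimable_pct, row.threshold_pct)
```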
When WEKA uploads objects to the object store, it assigns tags to categorize them. You can use these tags to implement lifecycle management rules in the object store.
To enable upload tags, set this option when adding or updating the object store bucket. For details, see the following:
The following table indicates the additional tags WEKA adds to the object when using object tagging:
The object store must support S3 object-tagging and might require additional permissions to use object tagging.
For example, the following extra permissions are required in AWS S3:
Your cloud service provider may apply additional charges for object tagging.