Comment on page
This page describes the Snap-To-Object feature, which enables the movement of all the data of a specific snapshot to an object store.
The Snap-To-Object feature enables the committing of all the data of a specific snapshot, including file system metadata, every file, and all associated data to an object store. You can use the full snapshot data to restore the data on the WEKA cluster or another cluster.
The Snap-To-Object feature is helpful for a range of use cases, as follows:
Suppose it is required to recover data stored on a WEKA filesystem due to a complete or partial loss of the data within it. You can use a data snapshot saved to an object store to recreate the same data in the snapshot on the same or another WEKA cluster.
This use case supports backup in any of the following WEKA system deployment modes:
- Local object store: The WEKA cluster and object store are close to each other and will be highly performant during data recovery operations. The WEKA cluster can recover a filesystem from any snapshot on the object store for which it has a reference locator.
- Remote object store: The WEKA cluster and object store are located in different geographic locations, typically with longer latencies between them. In such a deployment, you can send snapshots to both local and remote object stores.
- Local object store replicating to a remote object store: A local object store in one datacenter replicates data to another object store using the object store system features, such as AWS S3 cross-region replication. This deployment provides both integrated tiering and Snap-To-Object local high performance between the WEKA object store and the additional object store. The object store manages the data replication, enabling data survival in multiple regions.
This deployment requires ensuring that the object store system perfectly replicates all objects on time to ensure consistency across regions.
The periodic creation of snapshots and uploading of the snapshots to an object store generates an archive, allowing the accessing of past copies of data.
When any compliance or application requirement occurs, it is possible to make the relevant snapshot available on a WEKA cluster and view the content of past versions of data.
Combining a local cluster with a replicated object store in another data center allows for the following use cases:
- Disaster recovery: where you can take the replicated data and make it available to applications in the destination location.
- Backup: where you can take multiple snapshots and create point-in-time images of the data that can be mounted and specific files may be restored.
In a public cloud, with a WEKA cluster running on compute instances with local SSDs, sometimes the data needs to be retained, even though ongoing access to the WEKA cluster is unnecessary. In such cases, using Snap-To-Object can save the costs of compute instances running the Weka system.
To pause a cluster, you need to take a snapshot of the data and then use Snap-To-Object to upload the snapshot to an S3 compliant object store. When the upload process is complete, the WEKA cluster instances can be stopped, and the data is safe on the object store.
To re-enable access to the data, you need to form a new cluster or use an existing one and download the snapshot from the object store.
This use case ensures data protection against cloud availability zone failures in the various clouds: AWS Availability Zones, Google Cloud Platform (GCP) Zones, and Oracle Cloud Infrastructure (OCI) Availability Domains.
In AWS, for example, the WEKA cluster can run on a single availability zone, providing the best performance and no cross-AZ bandwidth charges. Using Snap-To-Object, you can take and upload snapshots of the cluster to S3 (which is a cross-AZ service). If an AZ failure occurs, a new WEKA cluster can be created on another AZ, and the last snapshot uploaded to S3 can be downloaded to this new cluster.
Using WEKA snapshots uploaded to S3 combined with S3 cross-region replication enables the migration of a filesystem from one region to another.
On-premises WEKA deployments can often benefit from cloud elasticity to consume large quantities of computation power for short periods.
Cloud bursting requires the following steps:
- 1.Take a snapshot of an on-premises WEKA filesystem.
- 2.Upload the data snapshot to S3 at AWS using Snap-To-Object.
- 3.Create a WEKA cluster in AWS and make the data uploaded to S3 available to the newly-formed cluster at AWS.
- 4.Process the data in-cloud using cloud compute resources.
Optionally, you may also promote data back to on-premises by doing the following:
- 1.Take a snapshot of the WEKA filesystem in the cloud on completion of cloud processing.
- 2.Upload the cloud snapshot to the on-premises WEKA cluster.
When uploading a snapshot to an object store, adhere to the following requirements:
To achieve fast upload and prevent bandwidth competition with other snapshot uploads to the same object store, you can upload only one snapshot at a time to the object store. However, you can upload one snapshot to a local object store and one to a remote object store in parallel.
A writeable snapshot is a clone of the live filesystem or other snapshots at a specific time, and its data keeps changing. Therefore, its data is tiered according to the tiering policies, but it cannot be uploaded to the object store as a read-only snapshot.
For space and bandwidth efficiency, it is highly recommended to upload snapshots in chronological order to the remote object store.
Uploading all snapshots or the same snapshots to a local object store is not required. However, once a snapshot is uploaded to the remote object store (for example, a monthly snapshot), it is inefficient to upload a previous snapshot (for example, the daily snapshot before it) to the remote object store.
You cannot delete a snapshot parallel to one uploaded to the same filesystem. Because uploading a snapshot to a remote object store can take a while, it is recommended to delete the required snapshots before uploading to the remote object store.
This requirement is more important when uploading snapshots to the local and remote object stores in parallel. Consider the following:
- 1.A remote upload is in progress.
- 2.A snapshot is deleted.
- 3.Later a snapshot is uploaded to the local object store.
In this scenario, the local snapshot upload waits for the pending deletion of the snapshot, which occurs only once the remote snapshot upload is done.
Synchronous snapshots are point-in-time backups for filesystems. When taken, they consist only of the changes since the last snapshot. When you download and restore a synchronous snapshot to a live filesystem, the system reconstructs the filesystem on-the-fly with the changes since the previous snapshot.
This capability for filesystem snapshots potentially makes them more cost-effective because you do not have to update the entire filesystem with each snapshot. You update only the changes since the last snapshot.
It is recommended to download the synchronous snapshots in chronological order. Only snapshots uploaded from a 4.0 version or above can be downloaded as increments.
Deleting a snapshot from a filesystem that uploaded it removes all of its data from the local object store bucket. It does not remove any data from a remote object store bucket.
If the snapshot has been (or is) downloaded and used by a different filesystem, that filesystem stops functioning correctly, data can be unavailable, and errors can occur when accessing the data.
Before deleting the downloaded snapshot, it is recommended to either un-tier or migrate the filesystem to a different object store bucket.
Snap-To-Object and tiering use SSDs and object stores for the storage of data. The WEKA system uses the same paradigm for holding SSD and object store data for both Snap-To-Object and tiering to save storage and performance resources.
You can implement this paradigm for each filesystem using one of the following use cases:
- Data resides on the SSDs only, and the object store is used only for the various Snap-To-Object use cases, such as backup, archiving, and bursting: The allocated SSD capacity must be identical to the filesystem size (total capacity) for each filesystem. The drive retention period must be defined as the longest time possible (which is 60 months). The Tiering Cue must be defined using the same considerations based on IO patterns. In this case, the applications always work with a high-performance SSD storage system and use the object store only as a backup device.
- Snap-To-Object on filesystems is used with active tiering between the SSDs and the object store: Objects in the object store are used for tiering all data and backup using Snap-To-Object. If possible, the WEKA system uses the same object for both purposes, eliminating the unnecessary need to acquire additional storage and copy data.
When using Snap-To-Object to promote data from an object store, some of the metadata may still be in the object store until it is accessed for the first time.