S3 bucket migration between clusters
S3 bucket migration enables non-disruptive transfer of S3 buckets from one WEKA S3 cluster (source) to another (target) while maintaining continuous availability and transparent access for S3 clients.
Overview
WEKA’s S3 bucket migration feature enables seamless, non-disruptive transfer of S3 buckets between WEKA clusters, allowing continuous access by S3 clients without requiring any changes on the client side. This capability is essential for operations such as cluster upgrades, data center consolidation, performance rebalancing, capacity expansion, or geographic data relocation.
The migration mechanism operates at the individual bucket level, combining intelligent traffic redirection with background data migration. During migration, client requests are progressively routed through the target cluster using defined forwarding modes, while the actual data transfer occurs concurrently.
To support efficient and consistent migration, the system uses two coordinated components:
The target cluster’s native in-line migration path, which handles live S3 client traffic and request forwarding.
The WEKA S3 Migrator (s3migrate), a high-performance utility that runs out-of-band to migrate data from source to target buckets in the background. This tool performs full and differential copies, ensuring the target becomes fully authoritative with minimal disruption.
Key characteristics include:
Non-disruptive operation: Client applications continue to read/write throughout the migration.
Transparent to clients: No reconfiguration of S3 clients is required.
Per-bucket control: Migrations are configured and managed on a per-bucket basis.
Support for differently sized clusters and buckets.
Dry-run support: Enables safe testing of the forwarding configuration with selected clients.
Rollback support: Migration can be safely rolled back during early phases before commitment.
Strict 1:1 mapping: Each source bucket maps to one target bucket with the same name.
This approach ensures operational continuity, supports detailed validation before cutover, and minimizes migration risks through controlled state transitions and rollback-safe procedures.
Scope and considerations
To ensure a successful migration, it is important to understand the scope of the feature. The following points outline the responsibilities and boundaries of the migration process.
Migration scope: The feature is designed to migrate buckets between two WEKA clusters. It does not support migrating buckets from non-WEKA sources.
Migration unit: Migration operates on a full bucket basis. The process transfers the entire contents of a source bucket to its destination and does not support selective or partial bucket migration.
DNS configuration: Successful traffic redirection relies on manual DNS updates. You are responsible for configuring the necessary DNS records to point S3 clients to the target cluster.
Target bucket setup: The target bucket must be configured on the destination cluster before initiating the migration. The process does not automatically create or configure the target bucket.
Metadata transfer: The migration transfers object data and standard S3 metadata. POSIX-specific metadata and in-progress multipart upload (MPU) information are not migrated. It is recommended to complete or abort ongoing multipart uploads before the final cutover.
Security settings: Security principals, such as service accounts and STS configurations, are not migrated. You must recreate these security settings on the target cluster to maintain access control.
Migration performance: The duration of the migration process is primarily determined by two factors: the total quantity of objects in the source bucket and the available network bandwidth between the clusters. Plan for a longer synchronization period for buckets containing a high volume of objects.
Migration modes
The S3 bucket migration process is managed by a sequence of operational modes on the target cluster. These modes dictate how the target cluster handles client requests and coordinates the data transition from the source cluster.
The migration progresses through the following sequence of modes: Ready → Forward → Migrate → Ready (final).
This structured progression ensures a controlled migration, from initial setup and traffic redirection to final data transfer and cutover.
Ready mode: This is the initial and final state of the target cluster.
Before migration: The target cluster is a standard, standalone cluster, ready to be configured as a migration target.
After migration: The target cluster is fully operational, serving all client traffic directly after the migration is complete and the source is decommissioned.
Forward mode: In this mode, the target cluster actively forwards specific client requests to the source bucket. It typically handles read operations (like GET) by fetching data from the source, while write operations (like PUT or DELETE) may be restricted or handled locally, depending on the configuration. This mode allows you to redirect client traffic to the target's endpoint before the bulk data transfer begins. A rollback from Forward mode back to Ready is possible, which effectively cancels the migration before any significant data transfer occurs.
Migrate mode: This is the primary data transfer phase. The system copies data from the source bucket to the target bucket. During this process, the target cluster continues to serve client requests, ensuring service continuity.
Each mode transition is controlled by an administrator using the weka s3 bucket migrate update --mode command.
S3 bucket migration workflow overview
The WEKA S3 bucket migration workflow consists of four structured phases designed to ensure a seamless, secure, and controlled transition from a source cluster to a target cluster. Each phase includes validation and monitoring steps to ensure data consistency, minimize risk, and provide clear rollback or recovery points when applicable.
Key operational principles of s3migrate
The s3migrate tool is built on several core principles that ensure migrations are efficient, resilient, and manageable at scale. Understanding these principles helps you effectively control the migration process.
Resumable migrations
The tool ensures operational resilience by recording every handled object in a sorted migrate.log file. In the event of an interruption, such as a process failure or manual stop (CTRL-C), the migration can be resumed precisely from the point of failure.
To continue an interrupted job, retrieve the last object key from the migrate.log file and use it with the --start-after parameter in the next run. This avoids re-processing objects that have already been transferred successfully.
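For example, assuming migrate.log stores one object key per line in sorted order, the last handled key can be passed to the next run. All endpoint, bucket, credential, and path values below are placeholders; supply the same mandatory source and destination parameters described under WEKA S3 Migrator: s3migrate command.
# Read the key of the last object the interrupted run handled.
LAST_KEY=$(tail -n 1 /path/to/migrate.log)
# Resume from that point instead of re-processing objects that already transferred.
./s3migrate \
  --src-url https://<source-endpoint>:9000 --src-bucket <bucket> \
  --src-key <s3_ro_migrate-key> --src-secret <s3_ro_migrate-secret> --src-folder /path/to/src/lists \
  --dest-url https://<target-endpoint>:9000 --dest-bucket <bucket> \
  --dest-key <s3_rw_migrate-key> --dest-secret <s3_rw_migrate-secret> --dest-folder /path/to/dest/lists \
  --start-after "$LAST_KEY"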
Concurrency control
You can manage the resource impact of the migration by controlling the number of simultaneous object transfers. The number of worker threads, configured with the --threads parameter, determines the level of concurrency. Adjusting this value allows you to balance migration speed against the load on your systems and network.
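For illustration, the following sketch raises concurrency for a dedicated migration window; 64 is an arbitrary value, and the omitted source and destination parameters are the same placeholders as in the resumable example above.
# Higher thread counts speed up the transfer but add load on both clusters and the network.
./s3migrate <source and destination parameters as above> --threads 64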
Distributed migration
For very large datasets, the migration workload can be distributed across multiple hosts to run in parallel. This is achieved using two parameters:
--hash-count: Defines the total number of hosts participating in the migration.
--hash-value: Assigns a unique index to each host, which then processes its designated portion of the objects.
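For example, two hosts could split the workload as follows. The parameter values are placeholders, and whether host indexing starts at 0 or 1 should be confirmed in the README included with s3migrate; 0 and 1 here are only illustrative.
# On host 1: process its share of the hashed object key space.
./s3migrate <source and destination parameters as above> --hash-count 2 --hash-value 0
# On host 2: process the remaining share.
./s3migrate <source and destination parameters as above> --hash-count 2 --hash-value 1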
Targeted failure handling
The tool provides a streamlined process for retrying objects that fail due to transient issues, such as network interruptions or temporary disk space shortages.
All failed object transfers are recorded in a failed.log file. Using the --failed-to-folder parameter, the tool processes this log and organizes the keys of the failed objects into a new source folder. This prepares a targeted second run that attempts to migrate only the objects that failed previously.
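A possible retry flow, assuming failed.log sits in the working directory and that the folder produced by --failed-to-folder can then be supplied as the file-list source for a follow-up run (confirm the exact mechanics in the README included with s3migrate):
# Pass 1: turn the failure log into a new source folder containing only the failed keys.
./s3migrate <source and destination parameters as above> --failed-to-folder /path/to/retry/lists
# Pass 2: re-run the migration using only the regenerated key lists.
./s3migrate <source and destination parameters as above> --src-folder /path/to/retry/lists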
S3 bucket migration procedures
Phase 1: Preparation (ready mode)
Prepare the target environment:
Deploy a WEKA cluster that supports S3 bucket migration (version TBD).
Obtain and install the WEKA S3 Migrator (s3migrate) on either a server within the source cluster or on a standalone server with network connectivity to both the source and target clusters. For parallel execution, the WEKA S3 Migrator can be installed on multiple servers. Contact the Customer Success Team for access credentials and detailed installation instructions.
Create and validate the target bucket:
Create a new, empty S3 bucket on the target cluster.
The target bucket must use the same name as the source bucket to preserve seamless access for clients.
Capacity planning: Ensure the target cluster has sufficient capacity to store all objects from the source bucket, with at least 10% additional overhead, or more if growth is expected.
Performance validation: Confirm that the target bucket can sustain the peak load of the source bucket, including regular client access and migration throughput.
Disable ILM on the target bucket:
Ensure that S3 Lifecycle Management (ILM) is disabled on the target bucket during migration.
ILM settings are not migrated automatically and must be reconfigured after finalization if required.
Replicate users and IAM policies: Manually replicate all relevant S3 users, bucket policies, and IAM policies from the source to the target cluster.
Adjust DNS TTL for the source bucket: Reduce the TTL (Time-To-Live) value of the source bucket’s DNS record to a low value (for example, 30 seconds) to allow fast DNS propagation when traffic is later redirected. A TTL verification sketch follows this list.
Configure a persistent source bucket DNS record: Create a DNS host record that points to the IPs of the source bucket. This record must remain fixed and associated with the source cluster throughout the migration process.
Configure a persistent target bucket DNS record: Create a dedicated DNS host record that resolves to the target bucket. This record must remain constant and will serve as the long-term access point after migration is complete.
Provision S3 migration users:
Create two users on the source cluster and one on the target cluster to support both inline and background migration operations:
On the source cluster:
s3_ro_migrate: Read-only access for the WEKA S3 Migrator.
s3_rw_migrate: Read/write access used by the target cluster for inline forwarding operations.
On the target cluster:
s3_rw_migrate: Read/write access for the WEKA S3 Migrator to write data into the target bucket.
Ensure credentials for these users are securely configured on the systems that require them:
The target cluster requires s3_rw_migrate credentials for the source bucket.
The WEKA S3 Migrator requires:
s3_ro_migrate for the source bucket.
s3_rw_migrate for the target bucket.
Configure and validate TLS on the target cluster: Ensure that the target cluster presents a valid TLS certificate that covers both the source bucket hostname and the target bucket hostname. This is necessary to support seamless client redirection and avoid TLS validation errors during migration.
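To confirm the reduced TTL configured in the DNS step earlier in this phase, a standard DNS query shows the TTL currently served for the record; the hostname below is a placeholder.
# The second column of the answer line is the TTL (in seconds) for the source bucket record.
dig +noall +answer <source-bucket-hostname> A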
Phase 2: Forward mode configuration and validation
Attach the target bucket to the source bucket: Link the target bucket to the source bucket, allowing forwarding behavior to be initiated. The target and source buckets must exist and be accessible using the provided S3 credentials. Use the following command on the target cluster. For command details, see weka s3 bucket migrate attach.
weka s3 bucket migrate attach <target-bucket> <source-url> <tls-cert-path> [--source <source-bucket>] [--s3-key <s3-key>] [--s3-secret <s3-secret>]
Set the target bucket to forward mode: The forward mode redirects all S3 client requests arriving at the target cluster to the source bucket, ensuring no disruption during the DNS switchover. Use the following command to set the target bucket to forward mode. For command details, see weka s3 bucket migrate update.
weka s3 bucket migrate update --url <source-url> --mode forward
Update DNS or load balancer configuration:
Repoint the S3 bucket hostname to the target cluster’s IP addresses.
Ensure DNS TTL is low (for example, 30 seconds) to allow fast client redirection.
Monitor DNS propagation and client behavior:
Allow time for DNS changes to take effect.
Some clients may continue using cached DNS entries and still contact the source.
System behavior in forward mode:
All incoming requests (GET, PUT, DELETE, etc.) to the target cluster are forwarded to the source bucket.
Operations received directly at the source continue to be processed natively.
This guarantees that all modifications are centralized on the source bucket, ensuring full consistency.
Validate the forward mode functionality:
Point test clients to the target bucket hostname and verify functional access.
Confirm TLS certificate resolution by temporarily adding an entry in /etc/hosts mapping the source hostname to the target IP.
Perform functional checks (for example, read/write operations), then revert the /etc/hosts changes.
Notify internal users that the target bucket is available for early testing and validation.
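One way to run the functional checks above, assuming the AWS CLI is available on a test client; all hostnames, IPs, and bucket names are placeholders, and any S3 client can be used instead.
# Temporarily resolve the source bucket hostname to a target-cluster IP on the test client only (revert afterwards).
echo "<target-cluster-ip>  <source-bucket-hostname>" | sudo tee -a /etc/hosts
# Basic read/write/delete checks through the target endpoint while it forwards to the source.
aws s3 ls s3://<bucket-name> --endpoint-url https://<source-bucket-hostname>
aws s3 cp ./smoke-test.txt s3://<bucket-name>/smoke-test.txt --endpoint-url https://<source-bucket-hostname>
aws s3 rm s3://<bucket-name>/smoke-test.txt --endpoint-url https://<source-bucket-hostname>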
Phase 3: Migrate mode and access control configuration
This phase transitions the migration from traffic forwarding to active data transfer. The target cluster assumes responsibility for processing client operations while the WEKA S3 Migrator begins copying data from the source bucket.
Migrate source bucket hostname to the target cluster:
Update the TLS certificate on the target cluster to include the source bucket hostname in the Subject Alternative Name (SAN) field, along with the target bucket hostname.
Modify the DNS or load balancer record for the source bucket to point to the target cluster IPs.
Prepare to handle residual traffic from clients using cached IP addresses still pointing to the source.
Verify that the source bucket is not accessed by non-migration clients:
Monitor the source bucket and confirm that no S3 clients (other than the migration client or the target system) are actively accessing it.
Do not proceed until the source is fully idle, ensuring a clean cutover.
Restrict access to the source cluster: To prevent direct access to source data during the migration, apply one of the following access control approaches based on your operational preference:
This step is crucial and must be completed successfully before you proceed. Restricting access prevents clients from updating the source data. Any updates made to the source after this point might not be migrated to the target, which can lead to data inconsistencies.
Run the s3find script on both clusters: Use the script to locate and list the bucket objects on both the source and target systems. For details, see Find and filter S3 objects using the s3find script.
Run the WEKA S3 Migrator in dry-run: Before performing a full operation, confirm that the WEKA S3 Migrator can access and interpret data correctly from the source cluster. On the server where the WEKA S3 Migrator is installed, run it with the --dry-run and --first 10000 options to simulate processing the first 10,000 records, as shown below. For details, see WEKA S3 Migrator: s3migrate command.
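A sketch of the dry-run described in the previous step. All endpoint, bucket, credential, and path values are placeholders that match the roles described in Phase 1.
# Simulate processing the first 10,000 records without writing any objects to the target.
./s3migrate \
  --src-url https://<source-endpoint>:9000 --src-bucket <bucket> \
  --src-key <s3_ro_migrate-key> --src-secret <s3_ro_migrate-secret> --src-folder /path/to/src/lists \
  --dest-url https://<target-endpoint>:9000 --dest-bucket <bucket> \
  --dest-key <s3_rw_migrate-key> --dest-secret <s3_rw_migrate-secret> --dest-folder /path/to/dest/lists \
  --dry-run --first 10000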
After this step, the migration process becomes non-reversible. The target will begin to accept writes and serve as the system of record. Any rollback attempt may lead to data inconsistency.
Set the target bucket to migrate mode: Use the following command to change the migration mode to migrate. For command details, see weka s3 bucket migrate update.
weka s3 bucket migrate update --url <source-url> --mode migrate
Run the WEKA S3 Migrator to start data transfer: If the WEKA S3 Migrator is installed on multiple servers, you can run each instance in parallel. To use the WEKA S3 Migrator, see WEKA S3 Migrator: s3migrate command.
The WEKA S3 Migrator performs:
A full copy of all objects from source to target.
Iterative differential passes to identify and copy any remaining objects.
Conditional PUTs to avoid overwriting newer objects.
Optional resume support for interrupted jobs.
System behavior in migrate mode:
New writes (PUT/DELETE/MPU): Handled entirely by the target cluster and stored in the target bucket.
Reads (GET/HEAD):
Served from the target bucket if the object exists there.
Fallback to the source bucket if the object has not yet been migrated.
List operations: Present a unified, deduplicated view from both source and target.
Multi-Part Uploads (MPUs):
In-flight MPUs (initiated before the transition) complete on the source.
New MPUs are handled by the target.
Delete operations: Applied to both source and target for consistency.
Copy operations: Executed on the target if the object exists there; otherwise, the source is used.
Phase 4: Finalization (ready mode – final)
In this final phase, the migration is completed and the bucket operates independently on the target cluster. All data and operational control are fully transitioned, and the source bucket is no longer involved.
Verify completion of data migration:
Use the WEKA S3 Migrator to confirm that:
All objects have been successfully copied to the target bucket.
No data remains to be transferred.
All in-flight multi-part uploads (MPUs) on the source have been completed or aborted.
Perform application-level validation to confirm full consistency between expected and actual object states in the target bucket.
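One way to cross-check inventories, assuming the s3find output files contain one object key per line and both clusters were scanned into separate output directories (paths are placeholders):
# Keys present on the source but missing on the target indicate objects still to be migrated.
sort /path/to/s3find-source/* > /tmp/source-keys.txt
sort /path/to/s3find-target/* > /tmp/target-keys.txt
comm -23 /tmp/source-keys.txt /tmp/target-keys.txt > /tmp/missing-on-target.txt
wc -l /tmp/missing-on-target.txt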
Remove migration configuration: When validation is complete, detach the migration configuration using the following CLI command. This detachment finalizes the migration and decouples the target bucket from the source. For command details, see weka s3 bucket migrate detach.
weka s3 bucket migrate detach <target-bucket>
Re-enable and configure ILM on the target bucket: If S3 Lifecycle Management (ILM) was configured on the source bucket:
Manually copy the ILM policies and rules to the target bucket.
Re-enable ILM on the target once policy validation is complete.
(Optional) Decommission or repurpose the source bucket:
The source bucket is no longer serving traffic for the migrated bucket.
Perform any final backups, audits, or access removal steps as needed.
You may now decommission, archive, or repurpose the source bucket and its configuration.
System behavior in ready mode (final):
The bucket is now a standalone operational unit on the target WEKA S3 cluster.
All S3 client operations (reads, writes, deletes) are fully handled by the target bucket.
The source cluster is no longer involved in any operation related to this bucket.
Find and filter S3 objects using the s3find script
Use the s3find script to efficiently locate files within a directory of a mounted filesystem. The script searches for files, divides the results into chunks of 100,000 entries, and provides an option to filter for recently modified files before sorting each chunk in parallel.
Before you begin
Ensure you have POSIX access to the objects in the S3 bucket. To enable this, mount the corresponding filesystem on a WEKA client.
Procedure
Navigate to the directory containing the script.
Run the s3find command with the required parameters:
s3find <directory> <output_dir> [-later-than <N_days>]
Parameters
directory*: Specifies the source directory within the mounted filesystem to search for objects.
output_dir*: Specifies the output directory for the generated file lists. This directory is then used as the input for the --src-folder parameter in the s3migrate tool.
-later-than <N_days>: Filters the results to include only files modified within the last <N_days>.
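Example
For instance, to scan a mounted bucket directory and keep only files modified in the last 7 days (all paths are placeholders):
# File lists are written to the output directory in sorted chunks of up to 100,000 entries.
./s3find /mnt/weka/<fs-name>/<bucket-dir> /path/to/s3find-out -later-than 7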
Monitor and check health
Use the WEKA S3 Migrator to monitor detailed metrics throughout the migration process. This ensures optimal performance, maintains data consistency, and supports timely completion. Monitoring allows you to validate progress, proactively identify issues, and mitigate operational risks before they affect workloads.
Monitor S3 migration metrics
Track the following key metrics exposed by the WEKA S3 Migrator:
Throughput: Bytes transferred per second.
Object rate: Number of objects processed per second.
Remaining objects estimate: Estimated number of objects left to migrate.
Per-Iteration job summary:
Number of objects copied
Number of bytes copied
Execution time
Error count and error details
Latency in forwarding modes
Migration modes that use request forwarding inherently add latency to client operations. This is because each request involves an additional network round trip: the target cluster receives the client request and then forwards it to the source cluster before relaying the response.
As a general guideline, this extra step can approximately double the request latency compared to a direct operation. It is important to assess this performance impact in your specific environment by measuring request latency before and during the use of a forwarding mode.
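A rough before-and-after check is to time the same lightweight request in both states; the URL below is a placeholder, and this measures only raw HTTP round-trip time, so authenticated requests from your actual S3 client give a more representative picture.
# Repeat a small request several times and compare total times before and during forwarding.
for i in 1 2 3 4 5; do
  curl -sk -o /dev/null -w "%{time_total}\n" https://<bucket-hostname>/<bucket-name>/<small-object-key>
done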
For the specific command options, see WEKA S3 Migrator: s3migrate command.
Interpret the s3migrate progress output
The s3migrate tool provides a real-time, single-line progress indicator to monitor the status of the data transfer. Understand each field in the output to track the migration effectively.
Sample output
The following is an example of the progress indicator line:
obj=1234 :: ign=56 exist=89 work=123 [45% ~2h:34m:12s] :: fail=2 skip(dn=3 up=4) dnld=567 (12.3/s) dl.byt=1.2GB (45MB/s) upld=560 (12.1/s) ul.byt=1.1GB (43MB/s)
Field descriptions
The following describes each field that appears in the progress output string.
obj: The total number of objects discovered in the source bucket for processing.
ign: The number of objects ignored. This applies when using hash-based parallel processing (--hash-count and --hash-value).
exist: The number of objects skipped because they were found to already exist on the target bucket.
work: The number of objects currently being processed by active worker threads.
[%% ~time]: The completion percentage and the estimated time remaining for the migration.
fail: The total number of objects that failed to migrate. For details, check the error fields in the log files.
skip(dn=X up=Y): The number of objects skipped during the download (dn) or upload (up) phases.
dn: A download skip typically occurs when the object is not found on the source bucket.
up: An upload skip typically occurs when a conditional PUT operation fails on the target, often because the object already exists.
dnld: The total number of objects successfully downloaded from the source. The value in parentheses indicates the download rate in objects per second.
dl.byt: The total volume of data downloaded from the source. The value in parentheses indicates the current download speed (for example, in MB/s).
upld: The total number of objects successfully uploaded to the target. The value in parentheses indicates the upload rate in objects per second.
ul.byt: The total volume of data uploaded to the target. The value in parentheses indicates the current upload speed (for example, in MB/s).
WEKA S3 Migrator: s3migrate command
The WEKA S3 Migrator facilitates object migration between S3-compatible storage systems. It supports parallel execution, detailed reporting, and fine-grained control through various parameters.
Example usage
./s3migrate \
--src-url http://host-source:9000 \
--src-bucket default \
--src-key user1 \
--src-secret password123 \
--src-folder /path/to/file/lists \
--dest-url http://host-dest:9000 \
--dest-bucket dest \
--dest-key user1 \
--dest-secret password123 \
--dest-folder /path/to/dest/lists
All parameters used in the example are mandatory, as indicated by an asterisk (*) in the following table.
CLI parameters
The following table describes the parameters used in the example. For a comprehensive list of all available command-line options and usage examples, refer to the README file included with the s3migrate tool.
--src-url*: URL of the source S3 endpoint. Example: https://s3.amazonaws.com
--src-bucket*: Name of the source S3 bucket. Example: my-source-bucket
--src-key*: Access key for the source bucket. Example: AKIAIOSFODNN7EXAMPLE
--src-secret*: Secret key for the source bucket. Example: wJalrXUtnFEMI/K7MDENG/...
--src-folder*: Path to the directory containing file lists from the source (file-based listing). Example: /path/to/src/lists
--dest-url*: URL of the destination S3 endpoint. Example: https://s3.us-west-2.amazonaws.com
--dest-bucket*: Name of the destination S3 bucket. Example: my-dest-bucket
--dest-key*: Access key for the destination bucket. Example: AKIAIOSFODNN7EXAMPLE
--dest-secret*: Secret key for the destination bucket. Example: wJalrXUtnFEMI/K7MDENG/...
--dest-folder*: Path to the directory for destination-related file lists or outputs (file-based listing). Example: /path/to/dest/lists
Rollback procedures
The S3 bucket migration feature provides defined rollback paths to ensure you can safely abort the process with minimal disruption. The appropriate procedure depends on the current migration mode.
Rollback from forward mode to ready mode
This procedure reverts the target bucket to a standalone state from forward mode. It applies to situations where the migration needs to be canceled before any significant data transfer has occurred.
When to use
Initiate this rollback if:
You detect issues, such as application errors or misconfigurations, during forward mode.
You decide to postpone the migration before transitioning to migrate mode.
Prerequisites
The source bucket must still be the single source of truth.
No significant data has been written to the target bucket.
Rollback steps
Redirect traffic to source: Update your DNS or load balancer configuration to point the S3 bucket’s hostname back to the source cluster's IP addresses. After DNS propagation, all client traffic will be handled by the source cluster.
Reset target bucket to ready mode: Run the following CLI command to transition the target bucket out of forward mode. The target cluster stops forwarding requests.
weka s3 bucket migrate update --url <target-cluster-url> --mode ready
Example:
weka s3 bucket migrate update --url https://192.168.1.100:9000 --mode ready
Rollback from migrate mode to the source
This procedure is a critical recovery path for aborting the migration after it has entered migrate mode and data has been copied to the target. Use this as a last resort to make the source bucket primary again.
When to use
Initiate this rollback if you discover critical issues after the data migration has started and you must completely abandon the migration process.
Rollback steps
Block access to the target bucket: Immediately restrict all client access to the target S3 bucket to prevent further writes and ensure a stable state for data migration. During this process, the S3 service will be inaccessible to clients.
Migrate data from target to source: Use the WEKA S3 Migrator to copy any new or modified data from the target bucket back to the source bucket, as sketched after these steps. This step requires an S3 user with read-write permissions on the source cluster.
Reload source IAM policies: Once data migration is complete, reload the IAM policies on the source bucket to restore its original permissions.
Redirect traffic to source: Update your DNS or load balancer configuration to point the S3 bucket’s hostname back to the source cluster.
Reset target bucket to ready mode: After traffic is flowing to the source, reset the target bucket to a standalone ready state using the following command:
weka s3 bucket migrate update --url <target-cluster-url> --mode ready
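A sketch of the reverse copy in step 2, with the source and destination roles of the original run swapped; all values are placeholders, and a read-write migration user must already exist on the source cluster.
# Copy objects written to the target during the migration back to the source bucket.
./s3migrate \
  --src-url https://<target-endpoint>:9000 --src-bucket <bucket> \
  --src-key <target-read-key> --src-secret <target-read-secret> --src-folder /path/to/target/lists \
  --dest-url https://<source-endpoint>:9000 --dest-bucket <bucket> \
  --dest-key <s3_rw_migrate-key> --dest-secret <s3_rw_migrate-secret> --dest-folder /path/to/source/lists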
Migration CLI reference
WEKA provides commands to manage the lifecycle of S3 bucket migration.
weka s3 bucket migrate attach
Attach a target bucket to a source bucket to begin the migration process.
Usage
weka s3 bucket migrate attach <bucket> <url> <tls-cert> [--source source] [--s3-key s3-key] [--s3-secret s3-secret]
Parameters
bucket*: Name of the target bucket.
url*: URL of the source WEKA S3 cluster.
tls-cert*: Source TLS certificate file.
source: Explicit name of the source bucket if different from the target.
s3-key: Source access key for validating that the source bucket exists.
s3-secret: Source access secret for validating that the source bucket exists.
Example
weka s3 bucket migrate attach target-bucket https://192.168.1.100:9000 ~/cert.pem --s3-key S3_key --s3-secret S3_secret
weka s3 bucket migrate detach
Detach migration configuration from a target S3 bucket.
Usage
weka s3 bucket migrate detach <bucket>
Parameters
bucket*: Name of the target bucket.
Example
weka s3 bucket migrate detach target-bucket
weka s3 bucket migrate update
Manage the S3 bucket migration modes: ready, forward, and migrate.
Usage
weka s3 bucket migrate update [--mode mode] [--url url] [--bucket bucket] [--all]
Parameters
mode: Migration mode. Possible values: ready, forward, and migrate.
url: Update all buckets pointing to the specified S3 endpoint.
bucket: Update the bucket with the specified name.
all: Update all buckets.
Examples
weka s3 bucket migrate update --url https://192.168.1.100:9000 --mode forward
weka s3 bucket migrate update --bucket target-bucket --mode migrate
weka s3 bucket migrate update --all --mode forward
weka s3 bucket migrate show
Show the details of an S3 bucket migration configuration.
Usage
weka s3 bucket migrate show <bucket>
Parameters
bucket*: Name of the bucket.
Example
weka s3 bucket migrate show target-bucket
weka s3 bucket migrate list
Show all the S3 bucket migration configurations on the cluster.
Usage
weka s3 bucket migrate list