Configure S3 Performance Buckets

Learn how to configure global defaults and individual bucket settings for S3 Performance Buckets to optimize S3 operations.

S3 Performance Buckets optimize S3 operations for high-throughput and low-latency environments, such as analytics and AI/ML pipelines.

This optimization is achieved by modifying three S3 behaviors that can introduce performance overhead:

  • ETag algorithm: Replaces the computationally intensive MD5 hash with a simple unique ID (UID) for object ETags.

  • Integrity mode: Ignores client-side checksum validation requests to reduce processing overhead.

  • Sorting mode: Returns unsorted LIST results directly from the filesystem, which is significantly faster for large object trees.

You can manage these settings at two levels:

  • Global configuration: Sets the default behavior for all S3 Performance Buckets on the cluster.

  • Individual bucket configuration: Overrides the global defaults for a specific bucket.
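The relationship between the two levels can be pictured as a simple fallback: a bucket uses its own value when one is set, and the global default otherwise. The following is a conceptual sketch only, not WEKA's implementation; `effective_setting` is a hypothetical helper introduced for illustration.

```shell
# Conceptual model: resolve the effective value of one setting.
#   $1 = per-bucket override (empty string when the bucket has none)
#   $2 = cluster-wide global default
effective_setting() {
  printf '%s\n' "${1:-$2}"
}

effective_setting ""     unsorted   # no override: prints "unsorted"
effective_setting sorted unsorted   # override wins: prints "sorted"
```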

Before you begin

The configuration commands in this topic apply only to buckets of type PERFORMANCE.

To create an S3 Performance Bucket, run the following command:

weka s3 bucket add <bucket-name> --type PERFORMANCE

Manage global performance bucket defaults

Use these commands to manage the default performance settings inherited by all S3 Performance Buckets.

Set the global defaults

To set the default ETag, integrity, and sorting behavior for all S3 Performance Buckets, use the following command:

weka s3 cluster update performance-bucket --etag-alg <etag-alg> --integrity-mode <integrity-mode> --sorting <sorting>

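Changing these settings can significantly affect the WEKA cluster's performance and S3 compatibility. For example, S3 clients that expect MD5-based ETags, checksum validation, or lexicographically sorted LIST results may not behave as expected when the non-standard values are in effect.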

Parameters

Parameter
Description

--etag-alg

Sets the default ETag generation algorithm.

  • md5: Computes a standard MD5 hash.

  • uid: (Default) Generates a unique ID (UID). This is faster and avoids hashing.

--integrity-mode

Sets the default integrity handling mode.

  • client_defined: (Default) Respects client checksum validation requests (standard S3 behavior).

  • disabled: Ignores client checksum requests. This is faster and reduces overhead.

--sorting

Sets the default sorting behavior for LIST operations.

  • sorted: Returns objects in standard lexicographical order.

  • unsorted: (Default) Returns objects as they appear in the filesystem. This is significantly faster for large buckets.

Example

To set the global defaults for maximum performance:

weka s3 cluster update performance-bucket --etag-alg uid --integrity-mode disabled --sorting unsorted
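Because a mistyped value affects every S3 Performance Bucket that inherits the defaults, it can help to check the values before running the command. The wrapper below is a minimal sketch: `set_perf_defaults` and the `WEKA_RUN` variable are hypothetical names, and the accepted values are taken from the parameter list above. By default it only prints the command it would run.

```shell
# Hypothetical wrapper: validate the three values, then build the command.
# Dry run by default (prints the command); set WEKA_RUN=1 to execute it.
set_perf_defaults() {
  etag=$1 integrity=$2 sorting=$3
  case $etag in
    md5|uid) ;;
    *) echo "invalid --etag-alg: $etag" >&2; return 1 ;;
  esac
  case $integrity in
    client_defined|disabled) ;;
    *) echo "invalid --integrity-mode: $integrity" >&2; return 1 ;;
  esac
  case $sorting in
    sorted|unsorted) ;;
    *) echo "invalid --sorting: $sorting" >&2; return 1 ;;
  esac
  set -- weka s3 cluster update performance-bucket \
    --etag-alg "$etag" --integrity-mode "$integrity" --sorting "$sorting"
  if [ "${WEKA_RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

set_perf_defaults uid disabled unsorted
```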

View the global defaults

To display the current global settings for all S3 Performance Buckets, run:

weka s3 cluster performance-bucket

View details of all buckets

To display the parameter details of all buckets, run:

weka s3 bucket list -v

Reset all buckets to global defaults

To reset the performance settings for all existing S3 Performance Buckets, forcing them to inherit the cluster's global defaults, run the relevant command:

  • To reset the ETag algorithm for all buckets:

    weka s3 cluster etag-alg reset

  • To reset the integrity mode for all buckets:

    weka s3 cluster integrity-mode reset

  • To reset the sorting mode for all buckets:

    weka s3 cluster sorting reset
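To reset all three settings in one pass, the commands above can be run in a loop. This is a sketch under the assumption that the three reset commands are independent of each other; `reset_all_buckets` and `WEKA_RUN` are hypothetical names, and the function only prints the commands unless `WEKA_RUN=1` is set.

```shell
# Hypothetical helper: reset every performance setting on all buckets.
# Dry run by default (prints each command); set WEKA_RUN=1 to execute.
reset_all_buckets() {
  for setting in etag-alg integrity-mode sorting; do
    set -- weka s3 cluster "$setting" reset
    if [ "${WEKA_RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
  done
}

reset_all_buckets
```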

Configure an individual performance bucket

Use these commands to override the global defaults for a specific S3 Performance Bucket.

Set a specific bucket's behavior

To set a specific performance behavior for an individual bucket, use the relevant command.

Changing these settings can significantly affect the WEKA cluster's performance and S3 compatibility.

  • ETag algorithm:

    weka s3 bucket etag-alg set <bucket-name> <md5 | uid>

  • Integrity mode:

    weka s3 bucket integrity-mode set <bucket-name> <client_defined | disabled>

  • Sorting mode:

    weka s3 bucket sorting set <bucket-name> <sorted | unsorted>

For the parameter descriptions, see Set the global defaults.

Example

To configure my-analytics-bucket for unsorted LIST operations while its other settings continue to inherit the global defaults:

weka s3 bucket sorting set my-analytics-bucket unsorted
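When several buckets need the same override, the per-bucket command can be repeated in a loop. The sketch below assumes nothing beyond the `set` syntax shown above; `set_on_buckets`, `WEKA_RUN`, and the bucket name my-training-bucket are hypothetical. It prints the commands by default instead of running them.

```shell
# Hypothetical helper: apply one setting/value pair to several buckets.
# Usage: set_on_buckets <etag-alg|integrity-mode|sorting> <value> <bucket>...
# Dry run by default (prints each command); set WEKA_RUN=1 to execute.
set_on_buckets() {
  setting=$1 value=$2
  shift 2
  for bucket in "$@"; do
    if [ "${WEKA_RUN:-0}" = "1" ]; then
      weka s3 bucket "$setting" set "$bucket" "$value"
    else
      printf '%s\n' "weka s3 bucket $setting set $bucket $value"
    fi
  done
}

set_on_buckets sorting unsorted my-analytics-bucket my-training-bucket
```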

Reset a specific bucket to global defaults

To remove an individual bucket's custom settings and make it inherit the global performance defaults, run the relevant command:

  • To reset the bucket's ETag algorithm:

    weka s3 bucket etag-alg reset <bucket-name>

  • To reset the bucket's integrity mode:

    weka s3 bucket integrity-mode reset <bucket-name>

  • To reset the bucket's sorting mode:

    weka s3 bucket sorting reset <bucket-name>
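To return a bucket fully to the global defaults, all three reset commands can be issued together. The following is a minimal sketch; `reset_bucket` and `WEKA_RUN` are hypothetical names, and the function prints the commands unless `WEKA_RUN=1` is set.

```shell
# Hypothetical helper: clear every per-bucket override on one bucket.
# Usage: reset_bucket <bucket-name>
# Dry run by default (prints each command); set WEKA_RUN=1 to execute.
reset_bucket() {
  bucket=$1
  for setting in etag-alg integrity-mode sorting; do
    if [ "${WEKA_RUN:-0}" = "1" ]; then
      weka s3 bucket "$setting" reset "$bucket"
    else
      printf '%s\n' "weka s3 bucket $setting reset $bucket"
    fi
  done
}

reset_bucket my-analytics-bucket
```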
