Documentation revision history

WEKA version
Description of changes

4.4.10

This release contains LTS-scoped proactive updates for gaps and defects discovered in the field, including:

  • Graceful continuation of large IOs during client hot upgrades.

  • Fixes for NFS floating IP health check failures.

  • Dramatically faster failover mechanism for WEKA HA in switch failure scenarios with physically diverse paths.

  • Support for the NVIDIA GB200 and Linux kernel 6.14.

  • weka local ps now works better for large clusters, and can intervene with processes that need help.

  • Fixes for memory leaks in the WEKA S3 service that led to container restarts.

  • Improved SMB-W domain controller selection and handling of LDAP server unavailability.

  • Support for RHEL 9.6 on backends and clients.

4.4.9

This release contains LTS-scoped proactive updates for gaps and defects discovered in the field, including:

  • Upgrade improvements, including hot-upgrade of clients.

  • An assortment of performance enhancements, from client shutdown time to WEKA S3 listObject.

  • Improved statistics for customers operating at scale, including error counters for the S3 API.

  • Enablement of S3 bucket notifications for 4.4 LTS and associated alerts, events, and statistics related to notifications themselves (such as the health of Kafka targets).

  • Support for Rocky Linux 9.6 on backends and clients.

4.4.8

  • Response code updates for some types of invalid objects in WEKA S3.

  • Reduction of the severity of some transient failure alerts.

  • Remediation of IOMMU incompatibilities with Broadcom P2100/P2200 NICs.

  • Resolution of erroneous service unavailability responses delivered by WEKA S3.

  • Added support for i8ge models in the Backend EC2 instances section for AWS deployments.

4.4.7

  • Support for RDMA when operating in mixed networks (Ethernet and InfiniBand).

  • Support for NetVSC, the recommended PMD for use with DPDK in Azure VMs.

  • Substantial performance improvements and reliability enhancements when operating WEKA S3 at scale.

  • Completion of the removal of the deprecated WEKA SMB service (in favor of WEKA SMB-W as announced earlier in the year).

  • Added WEKA client support for Amazon Linux 2023 with ARM kernel distribution.

Kubernetes: New topic: Deploy the WEKA client on an Amazon EKS cluster. >>>

Mount filesystem: Added a section for monitoring active mounts on a specific container at /proc/wekafs/<container-name>/interface. >>>

4.4.6

This release begins the long-term support (LTS) phase of the 4.4 series, which focuses on stability, scale, currency, and security.

General enhancements

  • Added support for the ARM platform for WEKA clients, with native support for 64kB memory pages.

  • Introduced a new fs_stats category to improve observability at scale.

  • Implemented CLI command synonyms for actions (add/create/new) and entities (process/processes/node/nodes). >>>

  • Enhanced weka security policy with read-only parameters for backend enforcement of client behavior.

  • Added the capability to allocate more cores to WEKA containers without deactivation by using -o remount with a core count or core IDs. Core configurations can be controlled locally with weka local resources or remotely with weka cluster container.

  • Introduced a new command and API to simulate container deactivation to assess whether the operation can be performed safely without impacting the cluster.

    • CLI: weka cluster container deactivation-check <Container-ids>

    • API: POST /containers/deactivation-check

  • The OBS_DETACH background task is now named OBS_DETACH2 and supports the abort action.

S3 enhancements

  • Added two new events for SLB container state monitoring: SLBContainerStatusInactiveEvent and SLBContainerStatusActiveEvent.

  • Optimized large-scale S3 deployments with up to 20% improvement in small object performance.

SMB-W enhancements

  • Added support for Active Directory forest transitive trust relationships between sibling domains.

Platform support

  • Validated WEKA client support for new Linux distributions on x86:

    • SLES15 SP6

    • Rocky 9.5

    • Ubuntu 22.04.5

    • Ubuntu 24.04.1

    • Debian 12 with Linux kernel 6.6

  • Validated WEKA client and backend support for new versions of WEKA Linux.

Deprecations

  • The /fileSystems/{inode_context}/resolve API will be deprecated in an upcoming release.

  • Support for Intel E810 network adapters will be deprecated in an upcoming 5.0 release (will remain supported in 4.4 LTS).

WEKA Operator day-2 operations: Added a topic: WekaContainer lifecycle management. >>>

Local WEKA Home 3.2.14: Added a topic: Check Local WEKA Home health. >>>

Manage the S3 protocol: Added a topic: Configure and use AWS CLI with WEKA S3 storage. >>>

Kubernetes: Added a topic: Composable clusters for multi-tenancy in Kubernetes, detailing how Kubernetes-driven deployment ensures full resource isolation and optimal performance. >>>

Local WEKA Home 3.2.13:

  • Dual-stack networking: Added IPv4/IPv6 support. >>>

  • Cluster overview: Added alert start time and severity.

  • Event dashboard: Added event description column.

AWS solutions: Added a topic: AWS ParallelCluster and WEKA Integration. >>>

Local WEKA Home 3.2.12: Added GitHub SSO integration to centralize and secure organizational access through GitHub Single Sign-On. >>>

4.4.4

This release consists of minor bug fixes and is in preparation for the final enhancement release of the 4.4 series, 4.4.5.

  • Azure solutions: Introduced an Azure solutions section, starting with the Azure CycleCloud for SLURM and WEKA Integration. This section provides guidance on integrating Azure CycleCloud with the WEKA Data Platform and SLURM scheduler. The integration enables streamlined HPC cluster management and delivers high-performance, scalable solutions for AI, ML, and analytics workloads. >>>

  • AWS solutions: Added a topic: Add WEKA to an existing SageMaker HyperPod cluster. >>>

4.4.3

This release introduces several updates and enhancements to clients, protocols, security, usability, and system upgrade processes. Key highlights include simplified client operations for cloud users, expanded cluster limits, a new GUI for snapshot policies, and client driver updates for improved resilience and control.

Improvements

  • Snapshot management: Snapshots no longer require the supplemental SnapTool. Snapshot management is now fully integrated with WEKA and accessible through the CLI, GUI, and API, offering flexible and robust policy options. >>>

  • Simplified client operations for cloud users: Network settings are no longer mandatory for stateless client mounts or when creating WEKA containers. By default, the system attempts to allocate and use a network device when possible.

    • For UDP mounts, the explicit -o net=udp option is now required.

    • Network device specification remains mandatory for SCMC deployments.

  • Maximum processes increased: The maximum numbers of backend processes, drive processes, management processes, and total processes have all been increased.

  • Enhanced security: Strengthened cipher support has been implemented for the WEKA HTTP server.

  • High Availability (HA): Improved link recovery for large HA clusters.

  • Client driver enhancements: Updates include:

    • Reduced load average per frontend.

    • Increased resilience to network fluctuations and container state changes.

    • Reduced operational wait times.

    • Improved backward compatibility.

NFS-W enhancements

  • Enhanced NFS statistics: NFS statistics, previously available on a per-client basis, are now accessible per client per export using the CLI.

S3 enhancements

  • Expanded metrics: New metrics for monitoring include:

    • Response codes.

    • Read/write counters.

    • Response times (e.g., average time to first and last byte).

    • Filesystem operations, including resultant read/write bytes and operation counts with associated response codes.

SMB-W enhancements

  • Improved configuration handling: The "Share [+ Create]" button is now disabled while the SMB cluster is joining the directory service and completing its configuration. This ensures the service is functional before additional actions can be performed.

Additional enhancements

  • Extended client support: WEKA client support now includes:

    • Proxmox 8.1.4 and 8.2.

  • IAM permission requirement for AWS clients: The ec2:CreateTags IAM permission is required for each WEKA client instance in AWS deployments. This is because WEKA automatically tags network interfaces it creates during mount operations to facilitate resource management.
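
As an illustration of the IAM requirement above, the following Python sketch prints a minimal IAM policy document that grants ec2:CreateTags. The helper name create_tags_policy and the default Resource value "*" are assumptions made for this example, not values published by WEKA; scope the resource according to your own security requirements.

```python
import json


def create_tags_policy(resource="*"):
    """Return an IAM policy document that allows ec2:CreateTags.

    The resource scope is an assumption for illustration; "*" is the
    broadest possible choice and can be narrowed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ec2:CreateTags",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to attach to the client instance role.
    print(json.dumps(create_tags_policy(), indent=2))
```

The resulting JSON can be attached to the instance role used by each WEKA client instance.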

Kubernetes: The following updates are introduced:

  • WEKA Operator deployment: A detailed guide that explains the deployment, scaling, and management of the WEKA Data Platform on Kubernetes. The guide focuses on enabling high-performance storage for compute-intensive workloads, such as AI and HPC applications. >>>

  • WEKA Operator day-2 operations: A guide that covers ongoing system management tasks, including hardware management, cluster scaling, and resource optimization to maintain system stability and ensure optimal performance. >>>

4.4.2

This release introduces security and usability enhancements and new features for SMB-W parity with legacy SMB. Key updates include support for tagged VLANs for customers using network segregation in tenancy models, enhanced role-based access controls for the Kubernetes CSI driver, and automatic client removal timeouts for short-lifecycle clients. Additionally, users can choose their preferred numeric display format (Base-2 or Base-10) in the GUI.

Improvements

  • New CSI operator role: A new csi operator role has been introduced, allowing essential tasks with elevated privileges, such as creating filesystems, monitoring cluster status, and managing snapshots. This role provides necessary functionality without granting full ClusterAdmin or OrgAdmin access. >>>

  • Tagged VLAN IDs: Tagged VLANs enable per-NIC VLAN assignments for containers, allowing advanced network configurations and integration with diverse setups when mounting filesystems. >>>

Deprecated features

  • Discontinuation of Legacy SMB support

    As announced in the release notes of V4.2.8, support for legacy SMB is being discontinued. It will be removed in V4.4.7. If legacy SMB is still enabled, the upgrade to V4.4.7 fails.

S3 enhancements

  • Updated status values: WEKA S3 clusters now use updated status labels:

    • Down is now Offline.

    • Not Ready is now Faulty.

    • Ready is now Online.

    • A new status, Saturated, indicates temporary service disruption caused by an overload of S3 requests. The Faulty status reflects more severe issues than Saturated.

SMB-W enhancements

  • Improved Active Directory integration: The SMB-W service now supports the --server and --create-computer options for joining SMB-W clusters to Active Directory. These updates also facilitate the migration of legacy SMB configurations, ensuring improved compatibility. >>>

Additional enhancements

  • Dynamic client lifecycle support: The weka local setup container command now supports --auto-remove-timeout and --client options, streamlining management for short-lifecycle clients.

  • Customizable numeric display format: A new GUI option allows users to switch between Base-2 (binary) and Base-10 (decimal) numeric display formats. This provides flexibility in viewing capacities and metrics based on user preferences. >>>

  • New security APIs: Added REST APIs corresponding to the CIDR-based security policy CLI commands introduced in a previous release. >>>

  • Amazon Linux 2023: Added WEKA backend and client support for Amazon Linux 2023 with x86_64 kernel distribution.

  • AWS solutions: Introduced a new AWS solutions section, starting with the Integrate SageMaker HyperPod with WEKA using Slurm guide, which covers architecture and deployment workflow for the integration. >>>

  • Synchronous Snap: Updated the note to specify that only snapshots uploaded from version 4.3 or later can be downloaded using Synchronous Snap. Previously, the note indicated version 4.0 or later.
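
To make the difference between the Base-2 and Base-10 display formats mentioned above concrete, here is a small Python sketch that renders the same byte count both ways. The function name format_capacity and the unit labels (KiB/MiB versus KB/MB) are assumptions for illustration; the GUI may label or round values differently.

```python
def format_capacity(num_bytes, base=2):
    """Format a byte count with Base-2 (1024) or Base-10 (1000) units."""
    if base == 2:
        step, units = 1024, ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    elif base == 10:
        step, units = 1000, ["B", "KB", "MB", "GB", "TB", "PB"]
    else:
        raise ValueError("base must be 2 or 10")
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below one step,
        # or at the largest unit available.
        if value < step or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= step


if __name__ == "__main__":
    capacity = 10**12  # one trillion bytes
    print(format_capacity(capacity, base=10))
    print(format_capacity(capacity, base=2))
```

The same capacity reads as a smaller number in Base-2 units, which is why the two formats should not be mixed when comparing capacities.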

4.4.1

This release includes a variety of optimizations to benefit customers operating the WEKA Data Platform at scale.

Improvements

  • Added weka driver commands: Introduced a set of tools for managing drivers, including capabilities for building and signing drivers, import/export functionality for easier administration, and readiness checks to enhance reliability. >>>

  • Added snapshot size estimations: Snapshot listings now include size estimations, helping customers understand the capacity occupied by chains of snapshots. >>>

  • Expanded CIDR-based security policies: In the previous release, CIDR-based security policies were introduced to manage access to WEKA clusters by client IP address ranges, enhancing security and simplifying administration for Organizations users. This release extends the security policy functionality to include filesystems. >>>

  • Added support for HashiCorp Vault AppRole: WEKA now supports HashiCorp Vault's security best practice, AppRole. Unlike the traditional token system with long-lived tokens requiring manual refresh, AppRole uses a RoleID and SecretID to retrieve short-lived tokens for accessing secrets. This enhances the security posture of the KMS configuration in WEKA. >>>

  • Added support for unique KMS Configuration per filesystem: WEKA now supports a separate KMS configuration for each filesystem, replacing the previous cluster-wide configuration. This feature enables customers to use distinct key hierarchies for individual filesystems. >>>
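
The AppRole flow described above (a RoleID and SecretID exchanged for a short-lived token) can be sketched from a generic Vault client's point of view. This is not WEKA's internal implementation. The sketch assumes the AppRole auth method is mounted at Vault's default path, approle, and the function names and example address are hypothetical. It only builds the login request and parses a response; it does not contact a server.

```python
import json
import urllib.request


def build_approle_login(vault_addr, role_id, secret_id):
    """Build (but do not send) a Vault AppRole login request.

    Assumes the AppRole auth method is enabled at the default mount path.
    """
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_json):
    """Return the short-lived client token and its lifetime in seconds."""
    auth = response_json["auth"]
    return auth["client_token"], auth["lease_duration"]


if __name__ == "__main__":
    # Hypothetical address and credentials, for illustration only.
    req = build_approle_login("https://vault.example.com:8200", "my-role-id", "my-secret-id")
    print(req.get_method(), req.full_url)
```

Because the returned token expires after its lease duration, a client repeats the login with the RoleID and SecretID rather than relying on a long-lived token.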

NFS-W enhancements

  • Added support for NFSv4 ACLs in NFS-W: NFS-W now supports NFSv4 access control lists (ACLs) up to the extended attribute limit of inodes on WEKA clusters. The mode of NFS-related ACLs can be viewed with weka nfs global-config show and enabled with weka nfs global-config set --acl on. >>>

S3 enhancements

  • Enhanced WEKA S3 response statistics: WEKA S3 response statistics now include byte counts, response code distribution per minute, and the average time to first byte, providing deeper insights into S3 performance.

  • Added WEKA S3 events for problem and recovery scenarios: New events for WEKA S3 now cover problem and recovery scenarios. The S3ContainerStatusSaturatedEvent indicates that capacity thresholds have been met, while the S3ContainerStatusOnlineEvent signals that the container is online and available.

  • Updated weka s3 cluster -v command output: The weka s3 cluster -v command now includes SLB request output, highlighting the limits used in the new saturation event for improved visibility.

SMB-W enhancements

  • Enhanced UI for ACLs: Access control models that were previously available only via CLI are now visible and selectable in the UI during the Add SMB Share process. >>>

  • Added direct object store sync option in add SMB share: The new obs-direct option in the add smb share command allows customers to bypass time-based file retention policies. When files are created or written to a share with this option enabled, they are prioritized for immediate release. >>>

Additional enhancements

  • Graceful behavior by default for weka local commands: The weka local stop, restart, and apply resources commands now execute gracefully by default, eliminating the need to use the --graceful argument. >>>

  • GCP update: Added gVNIC support in DPDK mode, in addition to UDP. >>>

  • Added support for GCP regions asia-southeast2 and europe-central2 in Terraform configuration.

  • CDM Local version 1.2 updates: Supports automated Terraform deployment, removes the Windows installation package, updates the launch process, and enhances information gathering options. >>>

  • WEKA CSI Plugin version 2.5.0 updates: Provides NFS transport support designed for non-performance-critical scenarios or environments where installing the WEKA client is not feasible. >>>

  • Certified object stores: Added Dell PowerScale S3 (version 9.8.0.0 and higher) to the certified object stores. >>>

4.4.0

This release includes a variety of optimizations to benefit customers operating the WEKA Data Platform at scale.

Improvements

  • weka stats performance enhancements reduce the latency of metric reporting, especially on large clusters.

  • To limit access to POSIX filesystems in Organizations, define access lists by network and role using the weka security policy command hierarchy. Attach, detach, and test policies using weka org security policy.

  • Customers using clients with multiple clusters can now store profile tokens and reference them by name for convenience. The profile token is stored at $HOME/.weka/auth-token-<profile>.json and is selected with the new profile parameter, for example weka user login [--profile <profile-name>]. >>>
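
The profile token location given above follows a simple naming pattern, which the following Python sketch reproduces. The helper name profile_token_path and the example profile name are hypothetical; only the path pattern comes from the release note.

```python
from pathlib import Path


def profile_token_path(profile, home=None):
    """Return $HOME/.weka/auth-token-<profile>.json for a profile name.

    If home is not given, the current user's home directory is used.
    """
    base = Path(home) if home is not None else Path.home()
    return base / ".weka" / f"auth-token-{profile}.json"


if __name__ == "__main__":
    # "prod" is a hypothetical profile name.
    path = profile_token_path("prod")
    print(path, "exists" if path.exists() else "not found")
```

A script that works with several clusters could use such a helper to check which profiles already have a stored token before prompting for a login.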

SMB-W enhancements

  • WEKA now supports configuring shares with user permissions before the user is validated with directory services.

  • The share add and share update commands now support allow-guest-access.

  • Added the ACLs feature to enable or disable Windows Access-Control Lists for the share, offering options for POSIX, Windows, or Hybrid (default: POSIX) and allowing interoperability by prioritizing the most recent permission based on timestamps.

S3 enhancement

  • Introduced an updated health-check URL, /wekas3api/health/ready, for load balancers to use in assessing the health of S3 servers, improving monitoring and load balancing capabilities.
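
As a rough sketch of how a load balancer style probe might use the health-check URL above, the following Python code builds the URL and checks whether it responds. The host, port, and scheme are placeholders, and treating HTTP 200 as "ready" is an assumption for this example, since the release note does not specify the response format.

```python
import urllib.error
import urllib.request

HEALTH_PATH = "/wekas3api/health/ready"  # path taken from the release note


def health_url(host, port, scheme="https"):
    """Build the health-check URL for one S3 server."""
    return f"{scheme}://{host}:{port}{HEALTH_PATH}"


def is_ready(url, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200.

    Treating 200 as "ready" is an assumption; any error, timeout, or
    non-200 status is reported as not ready.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    # Placeholder host and port, for illustration only.
    url = health_url("s3.example.com", 9000)
    print(url, "ready" if is_ready(url) else "not ready")
```

A probe like this would typically run at a short interval against every S3 server, with servers that report not ready removed from rotation.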

Additional enhancements

  • WEKA client support is extended to: Debian 10, Rocky 8.6, Rocky 8.7, Rocky 8.8, Oracle Linux 9, and SLES 15 SP5.

  • Added a verification step for LLQ and WC in the upgrade workflow. To ensure proper LLQ functionality after upgrades, verify that Write Combining (WC) is enabled in the igb_uio driver. >>>

  • Added a CLI reference guide, which is generated from the output of running the weka command with the help option. It provides detailed descriptions of available commands, arguments, and options.
