
WEKA installation on OCI

WEKA Data Platform deployment on Oracle Cloud Infrastructure (OCI) follows bare-metal installation with cloud-specific considerations.


Last updated 4 days ago

Overview

The WEKA Data Platform deployment on OCI follows a process similar to bare-metal installation, with adaptations for cloud-specific architecture. This implementation allows you to leverage WEKA's high-performance storage capabilities within Oracle's cloud environment.

OCI provides the necessary infrastructure components for WEKA deployment, including bare-metal compute shapes, virtual networking, and storage options. However, certain limitations exist compared to on-premises deployments, particularly regarding network configuration flexibility.

Workflow

The deployment process includes the following main phases:

  1. Prepare OCI bare metal infrastructure for WEKA.

  2. Install add-ons using templates for OCI HPC Images.

  3. Install WEKA on the OCI bare metal infrastructure.

  4. Configure the WEKA cluster.

  5. Add clients or use converged mode.

WEKA strongly recommends that you coordinate and obtain approval from OCI personnel before deploying any WEKA systems on OCI. This coordination ensures your deployment will be compatible with OCI's architecture and comply with cloud resource management policies.

1. Prepare OCI bare metal infrastructure for WEKA

Establish the foundational infrastructure required before installing the WEKA Data Platform software.

Procedure

  1. Verify resource compartment access:

    1. Search for and select compartments from the Services section.

    2. Locate and click your designated resource compartment link.

If you see the "Nothing here? Possible reasons..." message, you lack the required access permissions. Contact your cloud team for access before proceeding.

  2. Verify IAM policy statements:

    • allow group <identity group> to manage compute-management-family in compartment <resource compartment>

    • allow group <identity group> to manage virtual-network-family in compartment <resource compartment>

    • allow group <identity group> to manage instance-family in compartment <resource compartment>

    • allow group <identity group> to manage volume-family in compartment <resource compartment>

    • allow group <identity group> to manage object-family in compartment <resource compartment>

    Replace terms in angle brackets with your company's specific names.
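The statements above differ only in the resource family, so they can be generated from two inputs. The following is a minimal sketch; `render_weka_policies`, the group name, and the compartment name are hypothetical and used only for illustration. Review the output before pasting it into an OCI policy.

```shell
#!/usr/bin/env bash
# Render the five IAM policy statements for one identity group and compartment.
# render_weka_policies is a hypothetical helper name, not an OCI or WEKA tool.
render_weka_policies() {
  local group="$1" compartment="$2" family
  for family in compute-management-family virtual-network-family \
                instance-family volume-family object-family; do
    printf 'allow group %s to manage %s in compartment %s\n' \
      "$group" "$family" "$compartment"
  done
}

# Example values only; replace them with your own group and compartment names.
render_weka_policies "weka-admins" "weka-prod"
```

The script only prints text; it does not call OCI or change any policy.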

  3. Create cloud network:

    1. Search for and select VCN from the Services section.

    2. Ensure the designated VCN has:

      • Subnet with sufficient addresses for admin/management access to each server.

      • Subnet with sufficient addresses for high-performance access to each server.

      • Subnet with sufficient addresses for high-performance clients mounting WEKA.

VCN capacity planning must account for both the WEKA Data Platform servers and the high-performance clients, because client mount connections cannot traverse firewalls or NAT gateways.
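As a rough aid for the capacity planning described above, the following sketch estimates the smallest subnet prefix that fits a given number of hosts. It assumes that OCI reserves three addresses in every subnet (the first two and the last) and that each server or client needs one address per subnet; adjust the counts if your servers need more. The function name and the server and client counts are hypothetical.

```shell
#!/usr/bin/env bash
# Smallest IPv4 prefix length (between /30 and /16) that can hold the hosts.
# Assumption: 3 addresses per subnet are reserved by OCI.
min_prefix_for_hosts() {
  local hosts="$1" reserved=3 prefix
  for ((prefix = 30; prefix >= 16; prefix--)); do
    if (( (1 << (32 - prefix)) - reserved >= hosts )); then
      echo "$prefix"
      return 0
    fi
  done
  return 1  # the requested host count does not fit within a /16
}

backends=8    # hypothetical number of WEKA bare-metal servers
clients=100   # hypothetical number of high-performance clients

echo "management subnet:       /$(min_prefix_for_hosts "$backends")"
echo "high-performance subnet: /$(min_prefix_for_hosts "$backends")"
echo "client subnet:           /$(min_prefix_for_hosts "$clients")"
```

Treat the result as a lower bound and leave headroom for cluster expansion and additional clients.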

  4. Deploy bare-metal servers:

    1. Search for and select Instances from the Services section.

    2. Select the computer image:

      • Select a matching image from the OCI instance image gallery.

    3. Select the appropriate server shape:

      • Select the Bare Metal option.

      • Select one of the Bare Metal shapes, such as:

        • BM.Optimized3.36

        • BM.DenseIO.E5.128

        • BM.HPC.E5.144

        • BM.GPU.H100, BM.GPU.H200, and BM.GPU.A100

    4. Configure the boot volume:

      • Access the Size and Performance settings panel for the boot volume.

      • Switch to Custom Configuration mode.

      • Using the performance slider, set the VPUs/GB ratio to a minimum of 40. Consider increasing this value beyond 40 VPUs/GB during periods of elevated cluster activity, because performance traces are stored on this boot volume.

    5. Configure network interfaces:

      • Create a primary NIC on the management subnet.

      • Create a secondary NIC on the high-performance subnet.

  5. Install OFED drivers:

    1. Install drivers compatible with your NIC and OS combination.

    2. Select the appropriate OS version link, then download and install the RPM/DEB package on each bare-metal server running WEKA Data Platform.

2. Install add-ons using templates for OCI HPC Images

The oci-hpc-images repository provides a set of Packer and Ansible-based templates designed to automate the creation of high-performance computing (HPC) images on Oracle Cloud Infrastructure (OCI). These templates support multiple operating systems and are optimized for OCI environments, enabling users to efficiently deploy consistent and reproducible HPC-ready images.

Supported platforms

The templates include specific installation instructions for the following Linux distributions:

  • Oracle Linux 8

  • Ubuntu 22.04

  • Ubuntu 24.04

Each distribution requires the installation of necessary dependencies such as Packer, Python, and Ansible, and the configuration of a Python virtual environment to isolate and manage dependencies.

Procedure

  1. Install required tools: Install packer, tmux, python, and supporting packages specified in the repository. Commands vary by OS version and are provided explicitly for each supported platform.

  2. Configure Python environment:

    1. Create and activate a Python virtual environment (packer_env).

    2. Upgrade pip and setuptools.

    3. Install a specific version of ansible-core.

    4. Use ansible-galaxy to install required roles as specified in requirements.yml.

  3. Configure environment variables:

    1. Copy the defaults.pkr.hcl.example file to defaults.pkr.hcl.

    2. Edit the file to specify required variables. For Ubuntu 24.04 or later, explicitly set: OpenSSH9 = true

  4. Customize the image:

    1. Navigate to the required OS-specific directory under images/.

    2. Modify the image .pkr.hcl file to include the appropriate image OCID for your region and select the necessary software modules.

  5. Build the image:

    1. Because the build can take a long time, run it in a tmux session so the process continues if the terminal disconnects:

      tmux new

    2. Initialize and build the image using the following commands, replacing <image-name> with the file name that matches your configuration and target OS. The commands shown are an example for Ubuntu-22:

      packer init images/Ubuntu-22/<image-name>.pkr.hcl
      packer build -var-file="defaults.pkr.hcl" images/Ubuntu-22/<image-name>.pkr.hcl
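The procedure above can be sketched as one script. This is an illustrative dry run, not the repository's own tooling: `run` and `build_hpc_image` are hypothetical helpers, the ansible-core version is left as a placeholder because the required version is defined by the repository, and the script assumes it is started from the repository root. With `DRY_RUN=1` (the default) it only prints the commands.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the image build procedure.
# With DRY_RUN=1 each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

build_hpc_image() {
  local os_dir="$1" image_name="$2" ansible_core_version="$3"
  local template="images/${os_dir}/${image_name}.pkr.hcl"

  # Python virtual environment and Ansible dependencies
  run python3 -m venv packer_env
  run source packer_env/bin/activate
  run pip install --upgrade pip setuptools
  run pip install "ansible-core==${ansible_core_version}"
  run ansible-galaxy install -r requirements.yml

  # Create defaults.pkr.hcl only if it does not exist yet, so edits are kept.
  # Edit the file (and the image template) before running a real build.
  [ -f defaults.pkr.hcl ] || run cp defaults.pkr.hcl.example defaults.pkr.hcl

  run packer init "$template"
  run packer build -var-file="defaults.pkr.hcl" "$template"
}

# Placeholders are intentional; substitute your own values.
build_hpc_image "Ubuntu-22" "<image-name>" "<ansible-core-version>"
```

For a real build, start the script inside the tmux session and set `DRY_RUN=0` only after defaults.pkr.hcl and the image template have been edited.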

3. Install WEKA on the OCI bare metal infrastructure

  1. Download the WEKA software. See Obtain the WEKA installation packages.

  2. Install the WEKA software.

    Once completed, the WEKA software is installed on all the allocated servers and runs in stem mode (no cluster is attached).
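The two installation steps can be sketched per server as follows. This is a dry-run illustration under stated assumptions: the tarball is assumed to unpack into a directory named after the file, with install.sh at its top level, and `run` and `install_weka` are hypothetical helpers. The authoritative commands are the ones shown with the download.

```shell
#!/usr/bin/env bash
# Dry-run sketch: unpack the WEKA tarball and run its installer on one server.
# With DRY_RUN=1 each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

install_weka() {
  local tarball="$1" dir
  # Assumption: the tarball unpacks into a directory with the same base name.
  dir="$(basename "$tarball" .tar)"
  run tar -xf "$tarball"
  run cd "$dir"
  run ./install.sh
}

# The version is a placeholder; use the file name of the tarball you downloaded.
install_weka "weka-<version>.tar"
```

Repeat the same steps on every server allocated to the cluster.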

4. Configure the WEKA cluster

  1. Use the resources generator to create configuration files (drives0.json, compute0.json, and frontend0.json) in the /tmp directory of each server.

  2. Create containers using these configuration files on all cluster servers.

  3. Complete essential post-configuration:

    • Apply your license.

    • Activate the IO service.

    • Verify your configuration.

    • Consider enabling event notifications if needed.

Refer to the related topics for detailed instructions on each step.
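Before creating containers, it can help to confirm that the resources generator produced all three files on a server. The check below is a minimal sketch; `check_resource_files` is a hypothetical helper and not part of the WEKA CLI. It only tests that each file exists and is not empty, and it does not validate the file contents.

```shell
#!/usr/bin/env bash
# Check that the resources generator output exists before creating containers.
check_resource_files() {
  local dir="${1:-/tmp}" name missing=0
  for name in drives0.json compute0.json frontend0.json; do
    if [ -s "${dir}/${name}" ]; then
      echo "ok:      ${dir}/${name}"
    else
      echo "missing: ${dir}/${name}"
      missing=1
    fi
  done
  return "$missing"
}

check_resource_files /tmp || echo "Run the resources generator before creating containers."
```

The function returns a non-zero status when any file is missing, so it can gate a later step in a script.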

Related topics

Manually configure the WEKA cluster using the resources generator

Perform post-configuration procedures

5. Add clients or use converged mode

Depending on your deployment mode, you can choose one of the following options to access the WEKA filesystem:

  • Client-server mode: In this configuration, client functionality is deployed on dedicated client servers, similar to a bare-metal WEKA cluster. This setup separates client and backend functionality. For detailed instructions, refer to Add clients to an on-premises WEKA cluster.

  • Converged mode: In this configuration, client functionality is integrated with the backend servers. You can create a filesystem and mount it directly on each of the WEKA backend servers.
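For converged mode, mounting on a backend server can be sketched as follows. This is a dry-run illustration that assumes a filesystem already exists; the filesystem name `default` and the mount point /mnt/weka are example values, and `run` and `mount_weka_fs` are hypothetical helpers. See Mount filesystems for the supported mount options.

```shell
#!/usr/bin/env bash
# Dry-run sketch: mount a WEKA filesystem on a backend server (converged mode).
# With DRY_RUN=1 each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

mount_weka_fs() {
  local fs_name="$1" mount_point="$2"
  run mkdir -p "$mount_point"
  run mount -t wekafs "$fs_name" "$mount_point"
}

# Example values only; use your own filesystem name and mount point.
mount_weka_fs "default" "/mnt/weka"
```

Run the same mount on each backend server that needs access to the filesystem.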

What to do next

Proceed to Getting Started with WEKA, which serves as your entry point for using the WEKA system. Start by familiarizing yourself with the graphical user interface (GUI) and command-line interface (CLI). Once you are comfortable, you can perform your first I/O operations using the WEKA filesystem. This includes creating a filesystem and mounting it on the appropriate client or backend servers, depending on your chosen deployment mode.

Sign in to https://cloud.oracle.com.

Navigate to https://cloud.oracle.com/identity/domains/policies and verify your login has these permissions:

Find your preferred OS version on Operating system.

For production with WEKA 4.4.x, use: https://linux.mellanox.com/public/repo/mlnx_ofed/5.9-0.5.6.0/

For widest compatibility across all WEKA releases: https://linux.mellanox.com/public/repo/mlnx_ofed/5.4-3.4.0.0/

Access the repository: OCI HPC Images Repository.

OCIDs for various regions can be found at the Oracle Cloud Infrastructure Image Documentation.

Once the WEKA software tarball is downloaded from get.weka.io, run the untar command.

Run the install.sh command on each server, following the instructions in the Install tab of get.weka.io.

[Figure: WEKA cluster on OCI deployment]