WEKA installation on OCI

WEKA Data Platform deployment on Oracle Cloud Infrastructure (OCI) follows the bare-metal installation process, with cloud-specific considerations.

Overview

The WEKA Data Platform deployment on OCI follows a process similar to bare-metal installation, with adaptations for cloud-specific architecture. This implementation allows you to leverage WEKA's high-performance storage capabilities within Oracle's cloud environment.

OCI provides the necessary infrastructure components for WEKA deployment, including bare-metal compute shapes, virtual networking, and storage options. However, certain limitations exist compared to on-premises deployments, particularly regarding network configuration flexibility.

WEKA cluster on OCI deployment

Workflow

The deployment process includes the following main phases:

  1. Prepare OCI bare metal infrastructure for WEKA.

  2. Install add-ons using templates for OCI HPC Images.

  3. Install WEKA on the OCI bare metal infrastructure.

  4. Configure the WEKA cluster.

  5. Add clients or use converged mode.

1. Prepare OCI bare metal infrastructure for WEKA

Establish the foundational infrastructure required before installing the WEKA Data Platform software.

Procedure

  1. Verify resource compartment access:

    1. Search for and select compartments from the Services section.

    2. Locate and click your designated resource compartment link.

If you see the "Nothing here? Possible reasons..." message, you do not have the required access permissions. Contact your cloud team for access before proceeding.

  2. Verify IAM policy statements:

    Navigate to https://cloud.oracle.com/identity/domains/policies and verify your login has these permissions:

    • allow group <identity group> to manage compute-management-family in compartment <resource compartment>

    • allow group <identity group> to manage virtual-network-family in compartment <resource compartment>

    • allow group <identity group> to manage instance-family in compartment <resource compartment>

    • allow group <identity group> to manage volume-family in compartment <resource compartment>

    • allow group <identity group> to manage object-family in compartment <resource compartment>

    Replace terms in angle brackets with your company's specific names.
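The five statements above differ only in the resource family, so they can be generated from two values and compared with what the console shows. The following is a minimal sketch: the group and compartment names (weka-admins, weka-prod) are hypothetical, and the script only prints text without calling OCI.

```shell
#!/bin/sh
# Print the five IAM policy statements listed above for one group and
# one compartment. Both arguments are placeholders for your own names.
policy_statements() (
  group="$1"
  compartment="$2"
  for family in compute-management-family virtual-network-family \
                instance-family volume-family object-family; do
    printf 'allow group %s to manage %s in compartment %s\n' \
      "$group" "$family" "$compartment"
  done
)

# Example with hypothetical names.
policy_statements weka-admins weka-prod
```

Compare the printed lines with the statements shown on the policies page for your identity group.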

  3. Create cloud network:

    1. Search for and select VCN from the Services section.

    2. Ensure the designated VCN has:

      • Subnet with sufficient addresses for admin/management access to each server.

      • Subnet with sufficient addresses for high-performance access to each server.

      • Subnet with sufficient addresses for high-performance clients mounting WEKA.

VCN capacity planning must account for both WEKA Data Platform and high-performance clients, as client mount connections cannot traverse firewalls or NAT-gateways.
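Subnet sizing can be checked with simple arithmetic before creating the VCN. OCI reserves three addresses in every subnet (the first two and the last), so a subnet with prefix length p offers 2^(32-p) - 3 usable addresses. The sketch below applies that formula; the server and client counts are hypothetical and only illustrate the check.

```shell
#!/bin/sh
# Usable addresses in an OCI subnet with the given prefix length.
# OCI reserves three addresses per subnet: the first two and the last.
usable_addresses() {
  echo $(( (1 << (32 - $1)) - 3 ))
}

# Succeeds when a subnet with the given prefix length can hold the
# requested number of addresses.
subnet_fits() {
  [ "$(usable_addresses "$1")" -ge "$2" ]
}

# Print a one-line verdict: report <name> <prefix length> <addresses needed>
report() {
  if subnet_fits "$2" "$3"; then verdict="fits"; else verdict="too small"; fi
  echo "$1 subnet /$2: $(usable_addresses "$2") usable, need $3: $verdict"
}

# Hypothetical plan: 8 WEKA servers and 40 clients.
report management 28 8
report high-performance 28 8
report clients 26 40
```

Leave headroom beyond the current count if you expect to add servers or clients later, because all of them must share these subnets.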

  4. Deploy bare-metal servers:

    1. Search for and select Instances from the Services section.

    2. Select the compute image:

      • Find your preferred OS version in the Operating system topic.

      • Select a matching image from the OCI instance image gallery.

    3. Select the appropriate server shape:

      • Select the Bare Metal option.

      • Select one of the Bare Metal shapes, such as:

        • BM.Optimized3.36

        • BM.DenseIO.E5.128

        • BM.HPC.E5.144

        • BM.GPU.H100, BM.GPU.H200, and BM.GPU.A100

    4. Configure the boot volume:

      • Access the Size and Performance settings panel for the boot volume.

      • Switch to Custom Configuration mode.

      • Using the performance slider, set the VPUs/GB ratio to a minimum of 40. Consider increasing this value beyond 40 VPUs/GB during periods of elevated cluster activity, because performance traces are stored on this boot volume.

    5. Configure network interfaces:

      • Create a primary NIC on the management subnet.

      • Create a secondary NIC on the high-performance subnet.

  5. Install OFED drivers:

    1. Locate the OFED drivers compatible with your NIC and OS combination.

    2. Select the appropriate OS version link, then download and install the RPM/DEB package on each bare-metal server running WEKA Data Platform.
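After the driver package is installed, it can help to confirm the result on each server before moving on. The sketch below assumes the package is NVIDIA/Mellanox MLNX_OFED, which ships the ofed_info utility; if you installed a different driver stack, this check does not apply.

```shell
#!/bin/sh
# Report the installed OFED version, or fail if the utility is missing.
# Assumes MLNX_OFED, whose ofed_info utility prints the version with -s.
ofed_status() {
  if command -v ofed_info >/dev/null 2>&1; then
    echo "OFED installed: $(ofed_info -s)"
  else
    echo "OFED not found" >&2
    return 1
  fi
}

ofed_status || echo "Install the OFED package on this server before continuing."
```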

2. Install add-ons using templates for OCI HPC Images

The oci-hpc-images repository provides a set of Packer and Ansible-based templates designed to automate the creation of high-performance computing (HPC) images on Oracle Cloud Infrastructure (OCI). These templates support multiple operating systems and are optimized for OCI environments, enabling users to efficiently deploy consistent and reproducible HPC-ready images.

Supported platforms

The templates include specific installation instructions for the following Linux distributions:

  • Oracle Linux 8

  • Ubuntu 22.04

  • Ubuntu 24.04

Each distribution requires the installation of necessary dependencies such as Packer, Python, and Ansible, and the configuration of a Python virtual environment to isolate and manage dependencies.

Procedure

  1. Access the repository: OCI HPC Images Repository.

  2. Install required tools: Install packer, tmux, python, and supporting packages specified in the repository. Commands vary by OS version and are provided explicitly for each supported platform.

  3. Configure Python environment:

    1. Create and activate a Python virtual environment (packer_env).

    2. Upgrade pip and setuptools.

    3. Install a specific version of ansible-core.

    4. Use ansible-galaxy to install required roles as specified in requirements.yml.
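The Python environment steps above can be sketched as a single function. This is an illustration rather than the repository's own script: the ansible-core version is not stated here, so it is passed as an argument that you take from the repository instructions, and the function prints the commands by default (set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: setup_packer_env <ansible-core version from the repository>
setup_packer_env() {
  ansible_core_version="$1"
  run python3 -m venv packer_env                      # create the environment
  run . packer_env/bin/activate                       # activate it in this shell
  run pip install --upgrade pip setuptools            # upgrade packaging tools
  run pip install "ansible-core==${ansible_core_version}"
  run ansible-galaxy install -r requirements.yml      # roles from requirements.yml
}

# Review the plan first; the version string is a placeholder.
setup_packer_env "<version>"
```

Run the function from the root of the cloned repository so that requirements.yml resolves.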

  4. Configure environment variables:

    1. Copy the defaults.pkr.hcl.example file to defaults.pkr.hcl.

    2. Edit the file to specify required variables. For Ubuntu 24.04 or later, explicitly set: OpenSSH9 = true
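A quick check before building can catch a missing OpenSSH9 setting. The sketch below copies the example file if defaults.pkr.hcl does not exist yet and then looks for the line OpenSSH9 = true; the helper name has_openssh9 is hypothetical, and the check is a plain text match, not an HCL parser.

```shell
#!/bin/sh
# Succeeds when the variables file contains a line that sets OpenSSH9 to true.
has_openssh9() {
  grep -Eq '^[[:space:]]*OpenSSH9[[:space:]]*=[[:space:]]*true([[:space:]]|$)' "$1"
}

# Create the variables file from the example if it does not exist yet.
if [ -f defaults.pkr.hcl.example ] && [ ! -f defaults.pkr.hcl ]; then
  cp defaults.pkr.hcl.example defaults.pkr.hcl
fi

# Report whether the setting is present.
if [ -f defaults.pkr.hcl ]; then
  if has_openssh9 defaults.pkr.hcl; then
    echo "OpenSSH9 = true is set"
  else
    echo "OpenSSH9 = true is not set (required for Ubuntu 24.04 or later)"
  fi
fi
```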

  5. Customize the image:

    1. Navigate to the required OS-specific directory under images/.

    2. Modify the image .pkr.hcl file to include the appropriate image OCID for your region and select the necessary software modules.

      OCIDs for various regions can be found at the Oracle Cloud Infrastructure Image Documentation.

  6. Build the image:

    1. The build can take a long time, so run it in a tmux session to keep the process running if the terminal disconnects:

      tmux new

    2. Initialize and build the image using the following commands. Replace <image-name> with the file name matching your configuration and target OS. The following commands are an example for Ubuntu 22.04:

packer init images/Ubuntu-22/<image-name>.pkr.hcl
packer build -var-file="defaults.pkr.hcl" images/Ubuntu-22/<image-name>.pkr.hcl
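The tmux and packer steps can also be combined so that the build starts in a detached session and keeps a log. This is a sketch under assumptions: the session name packer-build and the log file packer-build.log are hypothetical, and a packer validate step is added between init and build as an optional early check.

```shell
#!/bin/sh
# Compose the init, validate, and build pipeline for one template.
# Output is also written to packer-build.log, because the tmux session
# closes when the command finishes.
build_command() {
  printf "{ packer init '%s' && packer validate -var-file=defaults.pkr.hcl '%s' && packer build -var-file=defaults.pkr.hcl '%s'; } 2>&1 | tee packer-build.log\n" "$1" "$1" "$1"
}

# Start the build in a detached tmux session named packer-build.
start_build() {
  tmux new-session -d -s packer-build "$(build_command "$1")"
}

# Print the command for review; call start_build with a real template
# path to launch it.
build_command "images/Ubuntu-22/<image-name>.pkr.hcl"
```

To watch a running build, attach with tmux attach -t packer-build; after the session ends, read packer-build.log.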

3. Install WEKA on the OCI bare metal infrastructure

  1. Download the WEKA software. See Obtain the WEKA installation packages.

  2. Install the WEKA software.

    • After downloading the WEKA software tarball from get.weka.io, extract it with the untar command.

    • Run the install.sh command on each server, following the instructions in the Install tab of get.weka.io.

    Once completed, the WEKA software is installed on all the allocated servers and runs in stem mode (no cluster is attached).
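Because the same two steps run on every server, it can be convenient to generate the per-server commands from a host list. The sketch below only prints the commands for review. It assumes the tarball is named weka-<version>.tar, has already been copied to the login directory of each server, and that the account can use sudo; the host names are hypothetical. The Install tab of get.weka.io remains the authoritative source for the exact commands.

```shell
#!/bin/sh
# Print one ssh command per server that extracts the tarball and runs
# install.sh. Usage: install_plan <version> <host>...
install_plan() {
  version="$1"
  shift
  for host in "$@"; do
    echo "ssh $host 'tar xf weka-$version.tar && cd weka-$version && sudo ./install.sh'"
  done
}

# Hypothetical host names; the version string is a placeholder.
install_plan "<version>" weka-1 weka-2 weka-3
```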

4. Configure the WEKA cluster

  1. Use the resources generator to create configuration files (drives0.json, compute0.json, and frontend0.json) in the /tmp directory of each server.

  2. Create containers using these configuration files on all cluster servers.

  3. Complete essential post-configuration:

    • Apply your license.

    • Activate the IO service.

    • Verify your configuration.

    • Consider enabling event notifications if needed.

Refer to the related topics for detailed instructions on each step.
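Before creating containers, it is worth confirming that the resources generator produced all three files on each server. The following sketch checks a directory for drives0.json, compute0.json, and frontend0.json; the helper name missing_configs is hypothetical, and the check does not validate the file contents.

```shell
#!/bin/sh
# Print the name of each expected resources file that is missing from
# the given directory. Returns non-zero if any file is missing.
missing_configs() {
  dir="$1"
  status=0
  for name in drives0.json compute0.json frontend0.json; do
    if [ ! -f "$dir/$name" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

if missing_configs /tmp >/dev/null; then
  echo "All resources files are present in /tmp"
else
  echo "Missing in /tmp:" $(missing_configs /tmp)
fi
```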

Related topics

Manually configure the WEKA cluster using the resources generator

Perform post-configuration procedures.

5. Add clients or use converged mode

Depending on your deployment mode, you can choose one of the following options to access the WEKA filesystem:

  • Client-server mode: In this configuration, client functionality is deployed on dedicated client servers, similar to a bare-metal WEKA cluster. This setup separates client and backend functionality. For detailed instructions, refer to Add clients to an on-premises WEKA cluster.

  • Converged mode: In this configuration, client functionality is integrated with the backend servers. You can create a filesystem and mount it directly on each of the WEKA backend servers.

What to do next

Proceed to Getting Started with WEKA, which serves as your entry point for using the WEKA system. Start by familiarizing yourself with the graphical user interface (GUI) and command-line interface (CLI). Once you are comfortable, you can perform your first I/O operations using the WEKA filesystem. This includes creating a filesystem and mounting it on the appropriate client or backend servers, depending on your chosen deployment mode.
