
Set up a Data Services container for background tasks

Efficiently manage resource-intensive tasks with at least one Data Services container for improved performance and reliability.


Last updated 2 months ago

The Data Services container runs tasks in the background, particularly those that can be resource-intensive. At present, it runs the quota coloring task. In upcoming releases, it will handle additional tasks that consume significant resources.

Running these tasks in the background ensures your CLI remains accessible and responsive without consuming compute resources. This strategy enhances performance, efficiency, and scalability when managing quotas. If a task is interrupted, it automatically resumes, providing reliability.

If the Data Services container is not operational, the quota coloring task reverts to the previous implementation and runs in a single process, which can cause the CLI to hang for an extended period. To prevent this, ensure the Data Services container is running.

To improve data services performance, you can set up multiple Data Services containers, one per WEKA server.

After setting up the Data Services container, you can manage it like any other container within the cluster. To adjust its resources, use the weka cluster container resources or weka local resources commands. For more details, see Expand specific resources of a container.

Before you begin

  1. Ensure the server where you’re adding this container has sufficient memory available:

    • 3.5 GB if no dedicated core is specified.

    • 5.5 GB if a dedicated core is specified.

  2. The Data Services containers require a persistent 22 GB filesystem for intermediate global configuration data. Do one of the following:

    • If a configuration filesystem for the protocol containers exists (typically named .config_fs), use it and expand its size by 22 GB. See Dedicated filesystem requirement for cluster-wide persistent protocol configurations.

    • If a configuration filesystem does not exist, create a dedicated 22 GB configuration filesystem for the Data Services containers.

  3. Set the Data Service global configuration. Run the following command:

weka dataservice global-config set --config-fs <configuration filesystem name>

Example:

weka dataservice global-config set --config-fs .config_fs

By default, the Data Services containers share the core of the Management process. However, if the server has enough resources, you can assign a separate core to each Data Services container.
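The memory prerequisite above can be checked before running the setup command. The following is a minimal sketch, assuming a Linux server where /proc/meminfo reports MemAvailable in kB. The function names required_kb and has_enough_memory are hypothetical helpers, not part of the WEKA CLI, and the conversion of 1 GB to 1024 × 1024 kB is an assumption.

```shell
#!/bin/sh
# Sketch only: required_kb and has_enough_memory are hypothetical helpers,
# not WEKA commands. Thresholds (3.5 GB / 5.5 GB) come from the prerequisites.

# required_kb <yes|no>: print the minimum memory in kB.
# "yes" means a dedicated core is specified for the container.
required_kb() {
    if [ "$1" = "yes" ]; then
        echo $((55 * 1024 * 1024 / 10))   # 5.5 GB expressed in kB (assumed 1 GB = 1024*1024 kB)
    else
        echo $((35 * 1024 * 1024 / 10))   # 3.5 GB expressed in kB
    fi
}

# has_enough_memory <meminfo_file> <yes|no>: succeed if MemAvailable
# in the given meminfo-format file meets the documented minimum.
has_enough_memory() {
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' "$1")
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$(required_kb "$2")" ]
}

# Usage: check the local server for the shared-core case.
if [ -r /proc/meminfo ]; then
    if has_enough_memory /proc/meminfo no; then
        echo "memory OK for a Data Services container"
    else
        echo "insufficient memory for a Data Services container"
    fi
fi
```

Pass yes as the second argument when you plan to assign a dedicated core, which raises the threshold to 5.5 GB.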

Procedure

  1. Set up the Data Services container: Run the following command:

weka local setup container --name <container_name> --base-port <base-port> --join-ips <management-IP> --only-dataserv-cores --memory 1.5GB --allow-mix-setting

Parameters:

| Parameter | Description |
|---|---|
| name* | The Data Services container name. Set it to dataserv0 to avoid confusion. |
| only-dataserv-cores* | Creates a Data Services container. This parameter is mandatory. |
| base-port | If base-port is not specified, the Data Services container attempts to allocate an available port range and may still initialize successfully. However, for optimal operation, it is recommended to specify the base port explicitly. |
| join-ips* | Specify the management IP of one of the servers in the cluster to join. |
| management-ips | This is optional. If not provided, it automatically takes the management IP of the server. |
| memory | Configure the container memory to be allocated for huge pages. It is recommended to set it to 1.5 GB. |

Example
$ weka local setup container --name dataserv0 --base-port 14400 --join-ips 10.108.234.164  --only-dataserv-cores --memory 1.5GB --allow-mix-setting
Version 4.3.2 is already downloaded.
Created Weka container named dataserv0
Preparing version 4.3.2 of container dataserv0
No net parameter specified, configuring in UDP mode
Successfully set up container dataserv0
Starting container
Waiting for container to start up
Container "dataserv0" is ready (pid = 66904)
  2. Verify the Data Services container is up: Run weka local ps.

Example
$ weka local ps
CONTAINER  STATE    DISABLED  UPTIME    MONITORING  PERSISTENT  PORT   PID    STATUS  VERSION  LAST FAILURE
compute0   Running  False     1:21:58h  True        True        14300  44600  Ready   4.3.2
dataserv0  Running  False     44.59s    True        True        14400  66904  Ready   4.3.2
drives0    Running  False     1:22:39h  True        True        14000  43448  Ready   4.3.2
frontend0  Running  False     1:21:15h  True        True        14200  45680  Ready   4.3.2
  3. Verify the Data Services container is visible in the cluster: Run weka cluster container.

Example

See dataserv0 in the last row (CONTAINER ID 15).

$ weka cluster container
CONTAINER ID  HOSTNAME        CONTAINER  IPS             STATUS  RELEASE  FAILURE DOMAIN  CORES  MEMORY   UPTIME    LAST FAILURE
0             DataSphere-0    drives0    10.108.249.241  UP      4.3.2    DOM-000         1      1.54 GB  1:29:38h
1             DataSphere-1    drives0    10.108.211.190  UP      4.3.2    DOM-001         1      1.54 GB  1:29:39h
2             DataSphere-2    drives0    10.108.47.134   UP      4.3.2    DOM-002         1      1.54 GB  1:29:39h
3             DataSphere-3    drives0    10.108.234.164  UP      4.3.2    DOM-003         1      1.54 GB  1:29:39h
4             DataSphere-4    drives0    10.108.166.243  UP      4.3.2    DOM-004         1      1.54 GB  1:29:38h
5             DataSphere-0    compute0   10.108.249.241  UP      4.3.2    DOM-000         1      1.50 GB  1:28:56h
6             DataSphere-1    compute0   10.108.211.190  UP      4.3.2    DOM-001         1      1.50 GB  1:28:57h
7             DataSphere-2    compute0   10.108.47.134   UP      4.3.2    DOM-002         1      1.50 GB  1:28:57h
8             DataSphere-3    compute0   10.108.234.164  UP      4.3.2    DOM-003         1      1.50 GB  1:28:57h
9             DataSphere-4    compute0   10.108.166.243  UP      4.3.2    DOM-004         1      1.50 GB  1:28:58h
10            DataSphere-0    frontend0  10.108.249.241  UP      4.3.2    DOM-000         1      1.47 GB  1:28:13h
11            DataSphere-1    frontend0  10.108.211.190  UP      4.3.2    DOM-001         1      1.47 GB  1:28:13h
12            DataSphere-2    frontend0  10.108.47.134   UP      4.3.2    DOM-002         1      1.47 GB  1:28:13h
13            DataSphere-3    frontend0  10.108.234.164  UP      4.3.2    DOM-003         1      1.47 GB  1:28:14h
14            DataSphere-4    frontend0  10.108.166.243  UP      4.3.2    DOM-004         1      1.47 GB  1:28:14h
15            DataSphere-0    dataserv0  10.108.249.241  UP      4.3.2                    1      1.47 GB  0:07:41h
  4. Verify the data services and management processes have joined the cluster: Run weka cluster process.

Example

See PROCESS IDs 300 and 301.

$ weka cluster process
PROCESS ID  HOSTNAME      CONTAINER  IPS             STATUS  RELEASE  ROLES       NETWORK  CPU  MEMORY   UPTIME    LAST FAILURE
0           DataSphere-0  drives0    10.108.249.241  UP      4.3.2    MANAGEMENT  UDP           N/A      1:22:26h  Host joined a new cluster (1 hour ago)
1           DataSphere-0  drives0    10.108.6.1      UP      4.3.2    DRIVES      DPDK     2    1.54 GB  1:22:24h
20          DataSphere-1  drives0    10.108.211.190  UP      4.3.2    MANAGEMENT  UDP           N/A      1:22:28h  Host joined a new cluster (1 hour ago)
21          DataSphere-1  drives0    10.108.18.211   UP      4.3.2    DRIVES      DPDK     4    1.54 GB  1:22:24h
40          DataSphere-2  drives0    10.108.47.134   UP      4.3.2    MANAGEMENT  UDP           N/A      1:22:27h  Host joined a new cluster (1 hour ago)
41          DataSphere-2  drives0    10.108.0.189    UP      4.3.2    DRIVES      DPDK     4    1.54 GB  1:22:24h
60          DataSphere-3  drives0    10.108.234.164  UP      4.3.2    MANAGEMENT  UDP           N/A      1:22:29h
61          DataSphere-3  drives0    10.108.181.42   UP      4.3.2    DRIVES      DPDK     6    1.54 GB  1:22:24h
80          DataSphere-4  drives0    10.108.166.243  UP      4.3.2    MANAGEMENT  UDP           N/A      1:22:26h  Host joined a new cluster (1 hour ago)
81          DataSphere-4  drives0    10.108.32.208   UP      4.3.2    DRIVES      DPDK     2    1.54 GB  1:22:24h
100         DataSphere-0  compute0   10.108.249.241  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:52h  Configuration snapshot pulled (1 hour ago)
101         DataSphere-0  compute0   10.108.150.39   UP      4.3.2    COMPUTE     DPDK     6    1.50 GB  1:21:50h
120         DataSphere-1  compute0   10.108.211.190  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:52h  Configuration snapshot pulled (1 hour ago)
121         DataSphere-1  compute0   10.108.162.229  UP      4.3.2    COMPUTE     DPDK     2    1.50 GB  1:21:50h
140         DataSphere-2  compute0   10.108.47.134   UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:46h  Removed from cluster: Not reachable by the cluster (1 hour ago)
141         DataSphere-2  compute0   10.108.38.178   UP      4.3.2    COMPUTE     DPDK     2    1.50 GB  1:21:50h
160         DataSphere-3  compute0   10.108.234.164  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:52h  Configuration snapshot pulled (1 hour ago)
161         DataSphere-3  compute0   10.108.254.134  UP      4.3.2    COMPUTE     DPDK     4    1.50 GB  1:21:50h
180         DataSphere-4  compute0   10.108.166.243  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:46h  Removed from cluster: Not reachable by the cluster (1 hour ago)
181         DataSphere-4  compute0   10.108.0.100    UP      4.3.2    COMPUTE     DPDK     4    1.50 GB  1:21:50h
200         DataSphere-0  frontend0  10.108.249.241  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:01h  Removed from cluster: Not reachable by the cluster (1 hour ago)
201         DataSphere-0  frontend0  10.108.10.152   UP      4.3.2    FRONTEND    DPDK     4    1.47 GB  1:21:05h
220         DataSphere-1  frontend0  10.108.211.190  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:01h  Removed from cluster: Not reachable by the cluster (1 hour ago)
221         DataSphere-1  frontend0  10.108.201.178  UP      4.3.2    FRONTEND    DPDK     6    1.47 GB  1:21:05h
240         DataSphere-2  frontend0  10.108.47.134   UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:01h  Removed from cluster: Not reachable by the cluster (1 hour ago)
241         DataSphere-2  frontend0  10.108.172.186  UP      4.3.2    FRONTEND    DPDK     6    1.47 GB  1:21:05h
260         DataSphere-3  frontend0  10.108.234.164  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:08h  Configuration snapshot pulled (1 hour ago)
261         DataSphere-3  frontend0  10.108.145.253  UP      4.3.2    FRONTEND    DPDK     2    1.47 GB  1:21:05h
280         DataSphere-4  frontend0  10.108.166.243  UP      4.3.2    MANAGEMENT  UDP           N/A      1:21:08h  Configuration snapshot pulled (1 hour ago)
281         DataSphere-4  frontend0  10.108.219.191  UP      4.3.2    FRONTEND    DPDK     6    1.47 GB  1:21:05h
300         DataSphere-0  dataserv0  10.108.249.241  UP      4.3.2    MANAGEMENT  UDP           N/A      33.05s    Configuration snapshot pulled (40 seconds ago)
301         DataSphere-0  dataserv0  10.108.249.241  UP      4.3.2    DATASERV    UDP      1    1.47 GB  14.55s
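The verification steps above can also be scripted rather than checked by eye. The following is a minimal sketch that filters the output of weka local ps and weka cluster process with awk. The function names container_ready and dataserv_process_up are hypothetical helpers, and the column positions are an assumption based on the example output shown above; they may differ in other versions.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not WEKA commands.
# Column positions are assumed from the example output in this procedure.

# container_ready <name>: reads `weka local ps` output on stdin and succeeds
# if the named container is in STATE Running (column 2) with STATUS Ready (column 9).
container_ready() {
    awk -v name="$1" '
        $1 == name && $2 == "Running" && $9 == "Ready" { found = 1 }
        END { exit !found }'
}

# dataserv_process_up <container>: reads `weka cluster process` output on stdin
# and succeeds if the container (column 3) has a process with STATUS UP (column 5)
# and the DATASERV role (column 7).
dataserv_process_up() {
    awk -v name="$1" '
        $3 == name && $5 == "UP" && $7 == "DATASERV" { found = 1 }
        END { exit !found }'
}

# Usage: runs only where the weka CLI is installed.
if command -v weka >/dev/null 2>&1; then
    weka local ps | container_ready dataserv0 \
        && echo "dataserv0 is running and ready"
    weka cluster process | dataserv_process_up dataserv0 \
        && echo "dataserv0 DATASERV process has joined the cluster"
fi
```

Because both functions read from standard input, they can also be run against saved command output when troubleshooting.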