Set up WEKAmon for external monitoring

Configure WEKAmon to integrate Grafana and Prometheus for centralized monitoring of WEKA cluster metrics, logs, alerts, and statistics.

WEKAmon is an external monitoring package that integrates the following components with Grafana and Prometheus to provide a unified dashboard for metrics, logs, alerts, and statistics:

  • Exporter: Collects data from the WEKA cluster and sends it to Prometheus.

  • Quota Export: Manages storage quotas and exports quota data to Prometheus.

  • Alert Manager: Sends alerts via SMTP when users approach soft quota limits.

WEKAmon operates independently of the built-in monitoring of the WEKA GUI.

WEKAmon setup

You can deploy WEKAmon in one of two ways:

  • Full WEKAmon stack deployment.

  • Exporter-only integration into an existing Grafana and Prometheus environment to visualize all monitoring data on a unified dashboard.

WEKA monitoring data on the Grafana dashboard example
Note: If you have deployed the WMS, follow the procedure in Deploy monitoring tools using the WEKA Management Station (WMS). Otherwise, continue with this workflow.

Deploy full WEKAmon stack (Docker Compose)

Use this option when you do not already operate Grafana and Prometheus.

Before you begin

Setting up a dedicated physical server (or VM) for the installation is recommended.

Server minimum requirements:

  • 4 CPU cores

  • 16 GB RAM

  • 50 GB for the root filesystem (/)

  • 50 GB for /opt

  • 1 Gbps network

  • Docker CE

  • Docker Compose (or docker-compose-plugin)

For instructions on the Docker installation, see the Docker website.

Workflow: Install the WEKAmon package

  1. Obtain WEKAmon package

  2. Configure authentication

  3. Run install script

  4. Configure export.yml

  5. Configure quota-export.yml (optional)

  6. Start Docker containers

  7. Validate deployment

1. Obtain WEKAmon package

Install under /opt (recommended):

Alternatively, download the latest release from GitHub and extract it into /opt.

2. Configure authentication

Authentication requires a WEKA cluster user and token.

On a WEKA cluster server

Perform the following steps on an existing host with access to the WEKA CLI, for example, on a WEKA backend server.

  1. Create a dedicated user with the ClusterAdmin or OrgAdmin role. The username appears in the event logs, which makes identifying and troubleshooting issues easier. For example:

  2. Generate an authentication token for the user:

  3. Transfer the wekamon-authtoken.json file to the WEKAmon server.

  4. Remove the token locally:

On a WEKAmon server

  1. Ensure the user running the WEKAmon container can read the authentication token file (/weka/.weka/auth-token.json). If the container operates with restricted permissions, adjust the file permissions accordingly. Typically, you can determine the container's user using docker inspect.

  2. Create a directory for the authentication token:

  3. Move the authentication token into the new directory:

  4. Ensure appropriate ownership and permissions are set:
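The following is a minimal sketch, not part of the WEKAmon package, for checking the token file after you place it. It assumes the token is a JSON document (as the .json extension suggests) and treats world-accessible permissions as a warning. The function name check_token_file is hypothetical, and the path is passed on the command line because the location on your server depends on the directory you created.

```python
import json
import os
import stat
import sys
from pathlib import Path


def check_token_file(path):
    """Return a list of human-readable problems found with the token file."""
    token = Path(path)
    if not token.is_file():
        return [f"{token} does not exist or is not a regular file"]

    problems = []
    if not os.access(token, os.R_OK):
        problems.append(f"{token} is not readable by the current user")
    else:
        try:
            # Assumption: the token file is JSON; only syntax is checked here.
            json.loads(token.read_text())
        except ValueError:
            problems.append(f"{token} does not contain valid JSON")

    # Group access may be needed for the container user, so only
    # permissions granted to all users are flagged.
    mode = stat.S_IMODE(token.stat().st_mode)
    if mode & stat.S_IRWXO:
        problems.append(f"{token} is accessible to all users (mode {mode:o})")
    return problems


if __name__ == "__main__":
    # Usage: python check_token.py <path-to-auth-token.json>
    for token_path in sys.argv[1:]:
        for problem in check_token_file(token_path):
            print("WARNING:", problem)
```

Run the check as the same user that runs the container, because readability is evaluated for the current user.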

Related topics

Create a local user

Obtain authentication tokens

3. Run installation script

The script creates the required directories and sets their permissions.

4. Configure export.yml

The WEKAmon and exporter configurations are defined in the export.yml file.

  1. Change directory to /opt/weka-mon and open the export.yml file.

  2. In the cluster section under the hosts list, replace the hostnames with the actual hostnames or IP addresses of the WEKA containers (up to three). Ensure the hostnames are mapped to the IP addresses in /etc/hosts.

  3. Optional. In the exporter section, customize the values according to your preferences. For details, see the Exporter configuration options topic below.

  4. Optional. Add custom panels to Grafana containing other metrics.

All other settings in the export.yml file have pre-defined defaults that do not need modification to work with WEKAmon. All the configurable items are defined but commented out with a hash character (#).

To add custom Grafana panels for other cluster metrics, uncomment the required metrics by removing the # character.

Example: In the following snippet of the export.yml, to enable getting the FILEATOMICOPEN_OPS statistic, remove the # character at the beginning of the line.

If the statistic you want to get is in a Category that is commented out, also uncomment the Category line (the first line in the example). Conversely, insert the # character at the beginning of the line to stop getting a statistic.
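If you prefer to script this edit, the sketch below shows one way to uncomment a statistic and its category line. The layout of SAMPLE is an assumption for illustration only: the category name example_category and the statistic OTHER_STAT are hypothetical, so check the real export.yml before relying on it. The helper uncomment_stat is also hypothetical and only removes the first # on matching lines, as described above.

```python
import re

# Hypothetical excerpt; the real export.yml layout may differ.
SAMPLE = """\
stats:
#  'example_category':
#    'FILEATOMICOPEN_OPS': 'ops'
#    'OTHER_STAT': 'ops'
"""


def uncomment_stat(config_text, stat_name, category=None):
    """Remove the leading # from the lines naming the statistic and its category."""
    names = [name for name in (category, stat_name) if name]
    patterns = [re.compile(rf"\b{re.escape(name)}\b") for name in names]
    result = []
    for line in config_text.splitlines():
        is_comment = line.lstrip().startswith("#")
        if is_comment and any(p.search(line) for p in patterns):
            # Drop only the first # so the remaining indentation is preserved.
            line = line.replace("#", "", 1)
        result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    print(uncomment_stat(SAMPLE, "FILEATOMICOPEN_OPS", category="example_category"))
```

Statistics that are not named, such as OTHER_STAT in the sample, stay commented out.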

5. Configure quota-export.yml (optional)

This step is required only if you monitor filesystem quotas.

  1. Edit the quota-export.yml file.

  2. Ensure the hosts list matches the one in export.yml.

Note: The configuration of the Alert Manager is defined in the alertmanager.yml file found in the etc_alertmanager directory. It contains details about the SMTP server, user email addresses, quotas, and alert rules. To set this file, contact the Customer Success Team.

6. Start the docker-compose containers

  1. Run the following command:

  • For older Docker versions:

  2. Verify containers:

Expected containers:

  • grafana

  • prometheus

  • loki

  • export

  • quota-export (optional)

  • alertmanager

Example output:

If the status of the containers is not up, check the logs and troubleshoot accordingly. To check the logs, run the following command:

7. Validate deployment

Access Grafana:

<server-ip> is the IP address of the server running the Docker containers.

Default credentials: admin/admin.

Integrate exporter with existing Grafana and Prometheus

Use this option if Grafana and Prometheus are already deployed in your environment.

Only the exporter (and optionally quota-export) must be deployed.

Before you begin

Ensure:

  • Prometheus is operational

  • Grafana is operational

  • Exporter host can reach:

    • WEKA cluster API

    • Prometheus server

  • Port 8001 (exporter) is available

  • An authentication token exists in ~/.weka/
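Two of the prerequisites above can be checked with a short script run on the exporter host. This is a sketch under stated assumptions: port 8001 comes from the list above, while the helper names port_is_free and can_connect are hypothetical. The WEKA cluster API and Prometheus ports are not stated here, so pass your own host and port values to can_connect.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Exporter port 8001 is free:", port_is_free(8001))
    # Call can_connect() with the host and port of your WEKA cluster API
    # and your Prometheus server to check reachability.
```

A successful TCP connection only shows that the port is reachable; it does not validate credentials or API responses.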

Procedure

  1. Obtain dashboard files from the WEKAmon package: weka-mon/var_lib_grafana/dashboards.

  2. Import dashboards into Grafana:

    1. Open Grafana UI.

    2. Navigate to Dashboards > Import.

    3. Upload JSON files.

    4. Select the existing Prometheus data source. (For details, see Import Dashboard in the Grafana documentation.)

  3. Configure exporter files: Edit the export.yml file and, if monitoring filesystem quotas, the quota-export.yml file. Set the following:

    • Cluster API endpoints

    • Authentication file location

    • Organization details

    • Optional performance tuning

  4. Deploy exporter: Based on your environment requirements, you can deploy using:

    • Option 1: Docker container

    • Option 2: Compiled binary

    • Option 3: Python script

Option 1: Docker (recommended)

  1. Pull container:

  2. Run exporter:

  3. If monitoring filesystem quotas:

Note: --network=host works only on Linux. On macOS or Windows, use -p to publish ports instead.

Option 2: Compiled binary (if Docker is not available)

  1. Download the latest release (export):

  2. If monitoring filesystem quotas (quota-export):

Option 3: Run as a Python script

  1. Clone the export files and run the Python modules:

  2. If monitoring filesystem quotas (quota-export):

Configure Prometheus

  1. Add exporter target to prometheus.yml:

  2. If using quota-export, also add:
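As an illustration of the shape of a scrape job, the sketch below renders a static target entry to place under scrape_configs in prometheus.yml. The job name weka-exporter and the host name wekamon-host are placeholders, and port 8001 is the exporter port listed in the prerequisites. The quota-export port is not stated here, so take it from your quota-export configuration and call the same hypothetical helper, scrape_job, with that target.

```python
def scrape_job(job_name, targets):
    """Render one Prometheus scrape job with static targets as YAML text."""
    lines = [
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        "      - targets:",
    ]
    lines.extend(f"          - '{target}'" for target in targets)
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder job and host names; replace them with your own values.
    print("scrape_configs:")
    print(scrape_job("weka-exporter", ["wekamon-host:8001"]))
```

If prometheus.yml already has a scrape_configs section, add only the job entry to it rather than a second scrape_configs key.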

Exporter configuration reference

Exporter configuration options in the export.yml file

The exporter section defines the program behavior.

Exporter and loki parameters

  • listen_port: Do not change the Prometheus listening port unless you also update the Prometheus configuration.

  • timeout: The maximum wait time, in seconds, for an API response. The default value is usually adequate.

  • backends_only: Run exclusively on WEKA backend servers.

  • max_procs and max_threads_per_proc: Control how the exporter scales. If the total number of hosts (servers and clients) exceeds max_threads_per_proc, the exporter starts additional processes as needed. Example: In a cluster with 80 WEKA servers and 200 compute nodes (280 hosts in total) and the default max_threads_per_proc of 100, the exporter runs 3 processes, because 280 / 100 = 2.8, which rounds to 3. Recommendation: Allocate at least 1 core per process. For this example, ensure at least 4 cores are available on the hosting server or virtual machine.

  • host (in the loki section): When using the WEKAmon setup, keep the hostname unchanged. To disable sending events to Loki, leave the field blank.

  • port (in the loki section): Do not change the port when using the WEKAmon setup.
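The scaling arithmetic for max_threads_per_proc can be sketched as follows. This assumes the exporter rounds the host-to-thread ratio up to the next whole process, which matches the 280-host example but is not confirmed by the exporter source. The function name exporter_process_count is hypothetical.

```python
import math


def exporter_process_count(total_hosts, max_threads_per_proc=100):
    """Estimate the number of exporter processes for a given host count.

    Assumption: one thread per host, with a new process started whenever
    the threads of the existing processes are used up.
    """
    if total_hosts < 0 or max_threads_per_proc <= 0:
        raise ValueError("host count must be >= 0 and threads per process > 0")
    return max(1, math.ceil(total_hosts / max_threads_per_proc))


if __name__ == "__main__":
    servers, clients = 80, 200
    print(exporter_process_count(servers + clients))  # 280 hosts -> 3 processes
```

Use the result to size the hosting server with at least one core per process.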

Note: In a large cluster, for example one with 1000 servers, the exporter tries to allocate one server per thread while keeping the number of processes within the max_procs parameter. If that is not possible, it assigns two or three servers to each thread.
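One way to picture this behavior is the sketch below. It is a model built on assumptions, not the exporter's actual algorithm: the process count is capped at max_procs, and the servers per thread grow by whole numbers until every server is covered. The max_procs values in the usage lines are illustrative, not documented defaults, and plan_threads is a hypothetical name.

```python
import math


def plan_threads(total_hosts, max_procs, max_threads_per_proc=100):
    """Return (processes, hosts_per_thread) for a cluster of total_hosts.

    Assumption: prefer one host per thread; once max_procs is reached,
    give each thread more hosts instead of starting more processes.
    """
    wanted = math.ceil(total_hosts / max_threads_per_proc)
    procs = max(1, min(wanted, max_procs))
    capacity = procs * max_threads_per_proc  # threads available in total
    hosts_per_thread = max(1, math.ceil(total_hosts / capacity))
    return procs, hosts_per_thread


if __name__ == "__main__":
    print(plan_threads(280, max_procs=8))   # fits: one host per thread
    print(plan_threads(1000, max_procs=8))  # capped: two hosts per thread
    print(plan_threads(1000, max_procs=4))  # capped: three hosts per thread
```

Under this model, raising max_procs (and providing the matching cores) is what returns a large cluster to one server per thread.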
