Set up WEKAmon for external monitoring
Configure WEKAmon to integrate Grafana and Prometheus for centralized monitoring of WEKA cluster metrics, logs, alerts, and statistics.
WEKAmon is an external monitoring package built on Grafana and Prometheus. It integrates the following components to provide a unified dashboard for metrics, logs, alerts, and statistics:
Exporter: Collects data from the WEKA cluster and sends it to Prometheus.
Quota Export: Manages storage quotas and exports quota data to Prometheus.
Alert Manager: Sends alerts via SMTP when users approach soft quota limits.
WEKAmon operates independently of the built-in monitoring of the WEKA GUI.

You can deploy WEKAmon in one of two ways:
Full WEKAmon stack deployment.
Exporter-only integration into an existing Grafana and Prometheus environment to visualize all monitoring data on a unified dashboard.

If you have deployed the WMS, follow the procedure in Deploy monitoring tools using the WEKA Management Station (WMS). Otherwise, continue with this workflow.
Deploy full WEKAmon stack (Docker Compose)
Use this option when you do not already operate Grafana and Prometheus.
Before you begin
Setting up a dedicated physical server (or VM) for the installation is recommended.
Server minimum requirements:
4 CPU cores
16 GB RAM
50 GB in the / partition
50 GB in the /opt partition
1 Gbps network
Docker CE
Docker Compose (or docker-compose-plugin)
For instructions on the Docker installation, see the Docker website.
Workflow: Install the WEKAmon package
Obtain WEKAmon package
Configure authentication
Run install script
Configure export.yml
Configure quota-export.yml (optional)
Start Docker containers
Validate deployment
1. Obtain WEKAmon package
Install under /opt (recommended):
Alternatively, download the latest release from GitHub and extract into /opt.
2. Configure authentication
Authentication requires a WEKA cluster user and token.
On a WEKA cluster server
Perform the following steps on an existing host with access to the WEKA CLI, for example, on a WEKA backend server.
Create a dedicated user with ClusterAdmin or OrgAdmin role. This username is displayed in the event logs, making the identification and troubleshooting of issues easier. For example:
Generate an authentication token for the user:
Transfer the wekamon-authtoken.json file to the WEKAmon server.
Remove the token locally:
On a WEKAmon server
Ensure the user running the WEKAmon container can read the authentication token file (/weka/.weka/auth-token.json). If the container operates with restricted permissions, adjust the file permissions accordingly. Typically, you can determine the container's user using docker inspect.
Create a directory for the authentication token:
Move the authentication token into the new directory:
Ensure appropriate ownership and permissions are set:
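One way to verify the result is a short Python check run as the user that the container runs as. This is a sketch, not part of the WEKAmon package: the function name token_file_report is hypothetical, and treating group or other access as something to flag is an assumption about your security policy rather than a WEKAmon requirement.

```python
import os
import stat


def token_file_report(path):
    """Summarize ownership and permissions of the auth token file at `path`."""
    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)
    return {
        "owner_uid": info.st_uid,
        "mode": oct(mode),
        # True when the current process can read the file.
        "readable_by_me": os.access(path, os.R_OK),
        # True when group or other have any permission bit set (assumed undesirable).
        "open_to_group_or_other": bool(mode & 0o077),
    }
```

Call it with the path of the token file on the WEKAmon server. If readable_by_me is False for the container's user, the exporter cannot authenticate.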
3. Run installation script
This script creates required directories and permissions.
4. Configure export.yml
The WEKAmon and exporter configuration are defined in the export.yml file.
Change directory to /opt/weka-mon and open the export.yml file.
In the cluster section under the hosts list, replace the hostnames with the actual hostnames/IP addresses of the WEKA containers (up to three). Ensure the hostnames are mapped to the IP addresses in /etc/hosts.
Optional. In the exporter section, customize the values according to your preferences. For details, see the Exporter configuration options topic below.
Optional. Add custom panels to Grafana containing other metrics.
All other settings in the export.yml file have pre-defined defaults that do not need modification to work with WEKAmon. All the configurable items are defined but commented out with a hash character (#).
To add custom panels to Grafana containing other metrics from the cluster, remove the # character from the required metrics (uncomment).
Example: In the following snippet of the export.yml, to enable getting the FILEATOMICOPEN_OPS statistic, remove the # character at the beginning of the line.
If the statistic you want to get is in a Category that is commented out, also uncomment the Category line (the first line in the example). Conversely, insert the # character at the beginning of the line to stop getting a statistic.
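The comment toggling described above can also be scripted. The following Python sketch works on the file text line by line. The SAMPLE layout, the unit values, and the function name set_stat_enabled are illustrative assumptions, not the exact contents of export.yml, so compare against your own file before using it.

```python
import re

# Hypothetical excerpt; the real export.yml layout and units may differ.
SAMPLE = (
    "#  'ops':\n"
    "#    FILEATOMICOPEN_LATENCY: microsecs\n"
    "#    FILEATOMICOPEN_OPS: ops\n"
)


def set_stat_enabled(yaml_text, name, enabled):
    """Uncomment (enabled=True) or comment out (enabled=False) the line for `name`."""
    pattern = re.compile(r"^(\s*)#?(\s*)(" + re.escape(name) + r"\s*:.*)$")
    out = []
    for line in yaml_text.splitlines():
        m = pattern.match(line)
        if m:
            # Keep the original indentation so the YAML nesting is unchanged.
            indent = m.group(1) + m.group(2)
            line = indent + m.group(3) if enabled else "#" + indent + m.group(3)
        out.append(line)
    return "\n".join(out)
```

For example, set_stat_enabled(SAMPLE, "FILEATOMICOPEN_OPS", True) uncomments only that statistic. Calling it again with "'ops'" uncomments the category line as well.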
5. Configure quota-export.yml (optional)
This step is required only if you monitor filesystem quotas.
Edit the quota-export.yml file and ensure its hosts match those in export.yml.
The configuration of the Alert Manager is defined in the alertmanager.yml file found in the etc_alertmanager directory. It contains details about the SMTP server, user email addresses, quotas, and alert rules. To set this file, contact the Customer Success Team.
6. Start the docker-compose containers
Run the following command:
For older Docker versions:
Verify containers:
Expected containers:
grafana
prometheus
loki
export
quota-export (optional)
alertmanager
Example output:
If the status of the containers is not up, check the logs and troubleshoot accordingly. To check the logs, run the following command:
7. Validate deployment
Access Grafana:
<server-ip>: the IP address of the server running the Docker containers.
Default credentials: admin/admin.
Integrate exporter with existing Grafana and Prometheus
Use this option if Grafana and Prometheus are already deployed in your environment.
Only the exporter (and optionally quota-export) must be deployed.
Before you begin
Ensure:
Prometheus is operational
Grafana is operational
Exporter host can reach:
WEKA cluster API
Prometheus server
Port 8001 (exporter) is available
Authentication token exists (in ~/.weka/)
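The network prerequisites in this list can be checked from the exporter host with a few lines of Python. This is a minimal sketch using only the standard library. The helper names can_connect and port_is_free are hypothetical, and the ports of the WEKA cluster API and the Prometheus server depend on your environment, so pass the values you actually use.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def port_is_free(port, host="0.0.0.0"):
    """Return True if nothing is bound to host:port, so the exporter could listen on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False
```

For example, port_is_free(8001) should return True before the exporter starts, and can_connect with the address and port of a WEKA backend or the Prometheus server should return True.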
Procedure
Obtain dashboard files from the WEKAmon package: weka-mon/var_lib_grafana/dashboards.
Import dashboards into Grafana:
Open Grafana UI.
Navigate to Dashboards > Import.
Upload JSON files.
Select the existing Prometheus data source. (For details, see Import Dashboard in the Grafana documentation.)
Configure exporter files: Edit the export.yml file and the quota-export.yml file (if monitoring filesystem quotas) to set:
Cluster API endpoints
Authentication file location
Organization details
Optional performance tuning
Deploy exporter: Based on your environment requirements, you can deploy using:
Option 1: Docker container
Option 2: Compiled binary
Option 3: Python script
Option 1: Docker (recommended)
Pull container:
Run exporter:
If monitoring filesystem quotas:
--network=host works only on Linux.
On macOS/Windows, use -p to publish ports.
Option 2: Compiled binary (if Docker is not available)
Download latest release (export):
If monitoring filesystem quotas (quota-export):
Option 3: Run as a Python script
Clone the export files and run the Python modules:
If monitoring filesystem quotas (quota-export):
Configure Prometheus
Add exporter target to prometheus.yml:
If using quota-export, also add:
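If you manage prometheus.yml programmatically, a scrape entry for each exporter can be rendered as text. The Python sketch below produces a standard Prometheus scrape_configs section with static targets. The job names, the host name exporter-host, and the helper names are placeholders chosen for illustration. Port 8001 is the exporter port mentioned above. The quota-export port is not stated here, so it appears only as a placeholder.

```python
def scrape_job(job_name, targets):
    """Render one Prometheus scrape_configs entry as YAML text."""
    lines = [
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        "      - targets:",
    ]
    lines += [f"          - '{t}'" for t in targets]
    return "\n".join(lines)


def scrape_configs(jobs):
    """Render a scrape_configs section from a {job_name: [targets]} mapping."""
    return "scrape_configs:\n" + "\n".join(
        scrape_job(name, targets) for name, targets in jobs.items()
    )
```

For example, scrape_configs({"weka": ["exporter-host:8001"], "weka-quota": ["exporter-host:<quota-port>"]}) yields two jobs. Merge the result into the existing scrape_configs section rather than replacing it.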
Exporter configuration reference
Exporter configuration options in the export.yml file
The exporter section defines the program behavior.
Exporter and loki parameters
listen_port
Do not change the Prometheus listening port unless the Prometheus configuration is updated.
timeout
Specify the maximum wait time in seconds for an API response. The default value is usually adequate.
backends_only
Run exclusively on WEKA backend servers.
max_procs and max_threads_per_proc
Scaling behavior:
The scaling behavior ensures that if the total number of hosts (servers and clients) exceeds the max_threads_per_proc, the system initiates additional processes as needed.
Example:
In a cluster configuration with 80 WEKA servers and 200 compute nodes, totaling 280 hosts, and using the default max_threads_per_proc of 100, the exporter operates with 3 processes, since 280 / 100 = 2.8, which rounds up to 3.
Recommendation:
For optimal performance, allocate at least 1 core per process. Therefore, for the given example, ensure there are at least 4 available cores on the hosting server or virtual machine.
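The process count in the example above follows from a rounded-up division. The Python sketch below reproduces that arithmetic. The function name exporter_processes is hypothetical, and the default max_procs of 8 is an assumption, so check the value in your export.yml.

```python
import math


def exporter_processes(total_hosts, max_threads_per_proc=100, max_procs=8):
    """Estimate how many exporter processes are started for `total_hosts` hosts."""
    # One thread per host: round up to whole processes.
    needed = math.ceil(total_hosts / max_threads_per_proc)
    # Always at least one process, never more than max_procs.
    return max(1, min(needed, max_procs))
```

With 280 hosts, exporter_processes(280) returns 3, matching the example.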
loki: host
When using the WEKAmon setup, keep the hostname unchanged. If you wish to disable sending events to Loki, leave the field blank.
loki: port
Don't change the port when using the WEKAmon setup.
In a cluster with 1000 servers, the exporter attempts to allocate one server per thread, ensuring the number of processes does not exceed the max_procs parameter. If necessary, it assigns multiple servers to a single thread by doubling or tripling them.
Scenario: In a cluster of 3000 hosts with max_procs = 8 and max_threads_per_proc = 100, the exporter runs 8 processes with 100 threads each. Because 800 threads cannot cover 3000 hosts at one host per thread, each thread handles nearly 4 hosts (3000 / 800 = 3.75).
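The per-thread load once max_procs is reached can be estimated the same way. This Python sketch assumes that hosts are spread evenly across threads, which is a simplification of how the exporter assigns them. The function name thread_load is hypothetical.

```python
import math


def thread_load(total_hosts, max_procs=8, max_threads_per_proc=100):
    """Return (processes, threads, hosts_per_thread) for a capped exporter."""
    # Processes needed at one host per thread, capped at max_procs.
    processes = min(max_procs, max(1, math.ceil(total_hosts / max_threads_per_proc)))
    threads = processes * max_threads_per_proc
    # With an even spread, the busiest threads take the rounded-up share.
    hosts_per_thread = max(1, math.ceil(total_hosts / threads))
    return processes, threads, hosts_per_thread
```

For the scenario above, thread_load(3000, 8, 100) returns (8, 800, 4). For 1000 hosts it returns (8, 800, 2), meaning some threads handle two hosts.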