Deploy Local WEKA Home on K8s
Manage the deployment, upgrade, and maintenance of Local WEKA Home (LWH) on a Kubernetes (K8s) cluster. This deployment method provides a scalable, on-premises observability solution for WEKA clusters.
Overview
The LWH deployment provides an on-premises observability and monitoring solution for WEKA clusters. Organizations use this model to operate within their own infrastructure instead of relying on the WEKA-hosted cloud service. Running on a K8s cluster offers enhanced scalability, resilience, and control over system resources and data.
Deploying LWH on K8s supports scale-out environments and large cluster configurations. This architecture leverages Kubernetes orchestration capabilities for high availability, automated recovery, and simplified lifecycle management of the LWH components.
Deployment on K8s is supported for LWH version 4.x and above.
The deployment is managed through a configuration file to ensure a consistent, reproducible, and upgradeable installation process.
Solution architecture
The following diagram illustrates the solution architecture and the interaction between core components within the K8s environment:

Architecture components
The LWH v4.x solution ingests data from registered WEKA clusters and processes it through the following layers:
Data ingestion layer: WEKA clusters send metrics, events, and alerts to LWH API endpoints.
API and ingress layer: Handles HTTP ingestion and routing. It supports multiple ingress controllers (ALB, Traefik, or Nginx) and can use an Envoy-based gateway service. API endpoints receive data and forward it to the persistent queue layer.
Processing layer: Uses NATS (persistent queues) for durable message storage and buffering. Worker services consume messages from these queues to process statistical data, events, and alerts.
Storage layer: Consists of specialized databases, including a Postgres database for metadata and a VictoriaMetrics cluster for raw time-series metrics. A secondary VictoriaMetrics instance is used for internal application monitoring.
User interface layer: Provides a Grafana Dashboard for visualization and an LWH UI for managing rules and configurations.
Data flow
WEKA clusters send statistics, events, and alerts to API endpoints.
API components authenticate and validate incoming data.
Data is ingested into NATS persistent queues for reliable buffering.
Worker services consume messages from queues and process them.
Processed data is written to the appropriate databases:
Metrics are stored in the Victoria Metrics Cluster.
Events, alerts, and cluster metadata are stored in the Postgres Database.
The rules engine evaluates conditions and triggers configured integrations.
Grafana queries the databases to provide visual health and performance data.
Sizing and scaling guidelines
Explore the sizing guidelines and scaling behavior for a Local WEKA Home deployment.
A good starting point for estimation is one CPU core for every 1,000 WEKA processes (including management processes).
The default LWH installation supports approximately 40,000 WEKA processes (cores, backends, and clients). The actual number may vary based on the backend size and the backend-to-client ratio. Each WEKA process generates an average of 2,000 time series.
The primary performance metrics are the number of statistics messages and time series processed. For simplicity, you can base your sizing on the total number of WEKA processes, as other metrics tend to scale proportionally. However, some tuning may be necessary depending on your specific setup.
Component scaling behavior
API and Worker components: These use default autoscaling settings that support up to 100,000 processes and scale based on the current load.
VM Cluster (VictoriaMetrics): The cluster is configured by default to handle up to 80,000 processes. For higher loads, you may need to adjust CPU, memory, or the stateful set size.
NATS: This is configured by default to manage up to 100,000 processes.
Postgres: The Postgres database typically has low utilization and is not deployed redundantly by default. This design relies on:
Infrequent upgrades.
A strong consistency model.
Quick failover, assuming fast CSI reattachment (which is very fast with the WEKA filesystem).
Key tuning parameters
While the defaults handle common loads, you may need to tune the following parameters for very large or small deployments:
VMCluster: Adjust the CPU, memory, shard count, or capacity. You can often reduce these resources for smaller deployments.
Stats workers: The default memory setting is 1GiB, which might be insufficient for very high loads. As a guideline, processing stats for ~40,000 processes requires approximately 40 cores (hyperthreads).
Worker autoscaling: To prevent the Horizontal Pod Autoscaler (HPA) from resetting during redeployments, set `workers.stats.autoscaling.minReplicas` to match your baseline usage.
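For example, a minimal `values.yaml` fragment (the replica count shown is illustrative):

```yaml
workers:
  stats:
    autoscaling:
      minReplicas: 4   # illustrative value; match your steady-state baseline
```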
Prerequisites
Before installing the Local WEKA Home, ensure the environment meets the following requirements.
Storage
A CSI (Container Storage Interface) driver is required. The storage class must support sharing or moving volumes between nodes, such as the WEKA CSI driver or Amazon EBS.
VictoriaMetrics Operator
The VictoriaMetrics Operator needs to be installed separately before installing the Local WEKA Home chart.
This separate installation prevents issues during uninstallation, such as Custom Resource (CR) objects becoming stuck, which can occur if the operator is auto-installed as a chart dependency.
The required installation method is Helm. The Operator Lifecycle Manager (OLM) method is not supported (see the "Setup chart repository" note in the VictoriaMetrics Operator documentation).
Procedure
Run the following Helm command to install the operator. This command:
Installs version `0.39.1`.
Creates and uses the `victoria-metrics` namespace.
Names the release `vmo`.
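A sketch of the installation, assuming the standard public VictoriaMetrics chart repository:

```bash
# Add the VictoriaMetrics Helm chart repository
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update

# Install operator version 0.39.1 as release "vmo" in the
# victoria-metrics namespace, creating the namespace if needed
helm install vmo vm/victoria-metrics-operator \
  --version 0.39.1 \
  --namespace victoria-metrics \
  --create-namespace
```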
Related information
VictoriaMetrics Operator official documentation
Deployment workflow
Configure Helm values: Create a `values.yaml` file to customize your WEKA Home deployment.
Install the LWH: Follow one of the methods for deploying LWH on a Kubernetes environment: standard Helm installation or ArgoCD integration.
Configure networking and access: Set up ingress or gateway service access.
Configure Helm values
Create a values.yaml file to customize your WEKA Home deployment. This file overrides the chart's default settings.
The following example highlights common adjustments, particularly for specifying a WEKA storage class for persistent volumes and using nodeSelector to schedule pods onto specific nodes (such as those running WEKA clients).
Refer to the complete values.yaml file in the WEKA Home Helm Chart repository for a full list of available parameters.
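The following fragment is an illustrative sketch; the key names for the storage class and node selector are assumptions, so verify them against the chart's default `values.yaml`:

```yaml
# Illustrative only: confirm key names against the chart defaults.
global:
  storageClassName: weka        # assumption: a WEKA CSI storage class for persistent volumes

nodeSelector:
  weka.io/client: "true"        # hypothetical label targeting nodes that run WEKA clients
```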
Configure gateway TLS (Optional)
If you enable TLS for the gateway (`gateway.tls: true`), you must manually create a Kubernetes secret containing your certificate and private key before installing the chart. The `gateway.secretName` value in your `values.yaml` must match the name of this secret.
Example TLS secret manifest
Ensure the `cert.pem` and `key.pem` data fields contain your Base64-encoded certificate and key content.
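A minimal manifest sketch, assuming the secret is named `gateway-tls` and LWH is installed in a namespace called `home-weka-io`:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gateway-tls         # must match gateway.secretName in values.yaml
  namespace: home-weka-io   # assumption: the namespace where LWH is installed
type: Opaque
data:
  cert.pem: <Base64-encoded certificate>
  key.pem: <Base64-encoded private key>
```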
Install the LWH
You can deploy LWH on a Kubernetes environment using two primary methods: standard Helm installation or ArgoCD integration. Each method differs in setup complexity, ingress handling, and lifecycle management.
| Aspect | Standard Helm installation | ArgoCD integration |
|---|---|---|
| Method | Direct installation using Helm commands. | Integration with an ArgoCD application. |
| Requirements | Standard Helm CLI. | LWH v4.1.0-b40 or higher. |
| Configuration | Straightforward deployment. | Requires special handling for Helm hooks, secrets, and job lifecycle. |
| Secrets | Auto-generated during deployment. | Requires manual pre-creation of secrets. |
| Recommendation | Recommended for most standard deployments. | Suitable for environments managing applications using GitOps with ArgoCD. |
Install the LWH using Helm
Use this procedure for a standard deployment of LWH using Helm commands.
The LWH Helm chart is publicly available on GitHub. The documentation on GitHub reflects the latest build. For a specific version, download the required values.yaml file directly.
Procedure
Add the WEKA Home Helm repository:
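A sketch, assuming the repository alias `wekahome`; use the repository URL published in the WEKA Home Helm Chart repository on GitHub:

```bash
# The URL below is an assumption; confirm it in the chart repository's README.
helm repo add wekahome https://weka.github.io/gohome
helm repo update
```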
Run the Helm `upgrade` command to install or update the chart. Specify your namespace, the chart version, and the path to your customized `values.yaml` file.
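For example (the chart reference and namespace are illustrative; the release name `wekahome` matches the one used in the upgrade procedure below):

```bash
helm upgrade wekahome wekahome/wekahome \
  --install \
  --namespace home-weka-io \
  --create-namespace \
  --version <chart-version> \
  --values ./values.yaml
```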
Integrate the LWH with ArgoCD
Use this procedure to deploy LWH using ArgoCD.
ArgoCD integration requires version v4.1.0-b40 or higher. This method requires specific configuration adjustments because ArgoCD handles Helm charts differently than a standard Helm installation.
Helm hooks and jobs: ArgoCD uses alternative hook annotations. Job TTL (Time-To-Live) requires special handling to avoid conflicts.
Secrets: ArgoCD does not support the Helm `lookup` function. You must manually create all required secrets before deployment.
Ingress: Ingress updates in ArgoCD can be slow. If you use a gateway service instead of ingress, disable the ingress resource to improve update speeds.
Dashboards: LWH dashboards (starting from v4.1.0-b40) include an annotation (`argocd.argoproj.io/sync-options: Replace=true`) to manage ConfigMap size limits.
Procedure
Configure Helm values for ArgoCD: In your `values.yaml` file, set the following parameters (see the sketch after this list):
Set `generateSecrets: false` at the top level.
To prevent conflicts with ArgoCD's job management, set the TTL for migration jobs.
(Optional) If you use a gateway service and not ingress, disable ingress creation.
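A combined sketch of these settings; `generateSecrets` is a documented top-level value, while the migration-job TTL and ingress keys shown here are assumptions to confirm against the chart's default `values.yaml`:

```yaml
generateSecrets: false           # required: ArgoCD cannot run the Helm lookup function

migration:
  ttlSecondsAfterFinished: 0     # hypothetical key; prevents conflicts with ArgoCD job management

ingress:
  enabled: false                 # optional: only when exposing LWH through the gateway service
```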
Pre-create required secrets: Because ArgoCD does not support the Helm `lookup` function, you must create the secrets manually. You can use the following script as a template. Update the `NAMESPACE` and `ARGO_APP_NAME` variables to match your environment.
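A template sketch; the secret names and payloads below are hypothetical, so derive the real list from the chart's secret templates:

```bash
#!/usr/bin/env bash
set -euo pipefail

NAMESPACE="home-weka-io"     # update to match your environment
ARGO_APP_NAME="wekahome"     # update to match your ArgoCD application name

create_secret() {
  local name="$1"
  # Create or update the secret with a random value, then label it so
  # ArgoCD associates it with the application.
  kubectl -n "$NAMESPACE" create secret generic "$name" \
    --from-literal=password="$(openssl rand -hex 16)" \
    --dry-run=client -o yaml | kubectl apply -f -
  kubectl -n "$NAMESPACE" label secret "$name" --overwrite \
    app.kubernetes.io/instance="$ARGO_APP_NAME"
}

# Hypothetical secret names, for illustration only.
create_secret wekahome-postgres-credentials
create_secret wekahome-grafana-credentials
```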
Deploy the application: Deploy the LWH Helm chart using your standard ArgoCD application definition. Ensure it references the `values.yaml` file you configured and uses the pre-created secrets.
Configure networking and access
Review the recommended methods for configuring network access to the LWH.
While WEKA Home supports various ingress controllers (such as ALB, Nginx, and Traefik), the simplest approaches are:
Use an Ingress Controller: Wrap the `gateway` service with your cluster's standard ingress configuration, such as a VirtualService if you use Istio.
Use a NodePort: Configure the service type as `NodePort`, as shown in the sketch below. This method is ideal for dedicated nodes that do not require an external load balancer.
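A sketch of the NodePort option; the exact values keys are assumptions to verify against the chart defaults:

```yaml
gateway:
  service:
    type: NodePort   # expose the gateway directly on each node
    nodePort: 30080  # hypothetical port within the cluster's NodePort range
```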
Upgrade Local WEKA Home
Use this procedure to upgrade an existing LWH deployment to a new version using Helm.
Before you begin
Ensure you have the path to your customized `values.yaml` file.
Identify the new chart version you want to upgrade to.
Procedure
Update your local Helm repository to fetch the latest chart versions:
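For example:

```bash
helm repo update
```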
Run the `helm upgrade` command:
This command uses `--install` to upgrade the existing `wekahome` release.
Replace `<new-version>` with the specific chart version you are upgrading to.
Ensure the `--namespace` and `--values` flags point to your existing deployment's configuration.
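A sketch, assuming the release name, chart reference, and namespace used at installation:

```bash
helm upgrade wekahome wekahome/wekahome \
  --install \
  --version <new-version> \
  --namespace home-weka-io \
  --values ./values.yaml
```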