List of Alerts
This page lists all the alerts generated by the Weka system, along with possible actions to take.
Name | Description | Actions |
AdminDefaultPassword | The admin password is still set to the factory default. | Change the admin user password to ensure only authorized users can access the cluster. |
AgentNotRunning | The Weka local control agent is not running on a host. | Restart the agent with |
ApproachingClientsUnavailability | Approaching the maximum number of clients that can connect with the current cluster resources. | Make sure all backend servers are up or expand the cluster with more backend servers. |
AutoRemoveTimeoutTooLow | Stateless Client auto-remove timeout too low. | Remount the host with a higher auto-remove timeout value. |
BackendNumaBalancingEnabled | A host has automatic NUMA balancing enabled which can negatively impact performance. | To disable, run |
BackendVersionsMismatch | There are mismatching versions of backend servers in the cluster. | Upgrade all the backend servers to match the cluster's version. |
BondInterfaceCompromised | The host is configured to work with a highly available network, but has lost its connectivity redundancy. A single network failure can disconnect the host from the cluster, which results in data being unavailable to the host (for a client host) or in reduced data protection redundancy (for a backend host). | Check the network configuration, cables, and NICs to resolve the issue. |
BucketHasNoQuorum | Too many compute nodes are down, causing the bucket compute resource to be unavailable. | Check that the compute nodes and their hosts are up and running and fully connected; Contact the Weka Support Team if the issue is not resolved. |
BucketUnresponsive | A compute resource has failed, causing system unavailability. | Check that the compute nodes and their hosts are up and running and fully connected; Contact the Weka Support Team if the issue is not resolved. |
ChokingDetected | High congestion level detected in the cluster. | For more information, refer to System Congestion. |
ClientNumaBalancingEnabled | A host has automatic NUMA balancing enabled which can negatively impact performance. | To disable, run |
ClientVersionsMismatch | There are clients with a version that does not match the cluster version. Some features may not be available until all the clients are upgraded. | Upgrade clients to be in the same version as the cluster by locally running |
ClockSkew | The clock of a host is skewed in relation to the cluster leader, with a time difference greater than the permitted maximum of 30 seconds. | Make sure NTP is configured correctly on the hosts and that their dates are synchronized. |
CloudHealth | A host cannot upload events to the Weka cloud. | Check the host has Internet connectivity and is connected to the Weka cloud as explained in the Weka Support Cloud section. |
CloudStatsError | Statistics upload to Weka cloud failed. | Check the host has Internet connectivity and is connected to the Weka cloud as explained in the Weka Support Cloud section. |
ClusterInitializationError | The cluster has encountered an error while initializing. | Fix the underlying problem causing the error to successfully start IO operations. |
ClusterIsUpgrading | Cluster is upgrading. | If the upgrade doesn't finish normally, contact the Weka Support Team for assistance. |
CPUFrequentStarvation | Frequent CPU starvation detected in the last minute. | Check the relevant hosts' logs for potential hardware problems or core allocation issues. |
CPUStarvation | Weka processes are experiencing long CPU stalls. | Check the relevant hosts' logs for potential hardware problems. |
DataIntegrity | Data integrity issue found. | Contact the Weka Support Team. |
DataProtection | Some of the system's data is not fully redundant. | Check which node/host/drive is down and act accordingly. |
DedicatedWatchdog | A dedicated Weka host requires the installation of a watchdog driver. | Make sure a watchdog is available at /dev/watchdog. For more information, search the Weka knowledgebase in the Weka support portal. |
DriveDown | A drive is not responding. | Contact the Weka Support Team to check if the drive should be replaced. |
DriveEndurancePercentageUsed | Drive exceeding its life expectancy. | It is recommended to replace the drive before it fails. |
DriveEnduranceSparesRemaining | Drive internal spares running too low. | It is recommended to replace the drive before it fails. |
DriveNeedsPhaseout | A drive has too many errors. | Phase out the drive; it will most likely need to be replaced. |
FilesystemHasTooManyFiles | The filesystem's configured capacity for file and directory entries is exceeded (or about to be exceeded). | Increase the max-files for the filesystem. |
FilesystemsThinProvisioningReserveReached | The requested reserved capacity (for filesystem creation/expansion) is available. | The reserved capacity can now be used for filesystem creation/expansion. |
HangingIOs | Some IOs are hanging on the node acting as a driver/NFS/backend. | Check that the compute nodes and their hosts are up and running, and fully connected. Also check that if a backend object store is configured, it is connected and responsive. Contact the Weka Support Team if the issue is not resolved. |
HighDrivesCapacity | The average capacity of the SSDs is too high. | Free up space on the SSDs or add more SSDs to the cluster. To add SSDs, see Expansion of specific resources. |
HighLevelOfUnreclaimedCapacityInObjectStore | High level of unreclaimed space in object store. | Check object store connectivity and deletion operations' progress. Validate authorization of deletion operations on the object store. Run |
JumboConnectivity | A host cannot send jumbo frames to any of its cluster peers. | Check the host network settings and the switch to which it is connected. Even if Weka seems to be functional, resolving this improves performance. |
KmsError | KMS Error | Review the KMS credentials, permissions, and configuration, as suggested in KMS management. |
LegacyObsData | Legacy v3.4 data found in object-store. | This legacy data format is deprecated, and will not be supported in v3.15. The filesystem must be detached or removed before upgrading to v3.15. |
LicenseError | A license conflict exists. | Make sure the cluster is using a correct license, the license has not expired, and the cluster allocated space does not exceed the license. |
LowDiskSpace | The host has low disk space (for | Free up space on the host, or contact the Weka Support Team. |
ManualOverridesActive | Manual overrides are active. | Please contact the Weka Support Team. |
MismatchedDriveFailureDomain | The drive failure domain does not match the failure domain of its attached host. | Either connect the mismatched drive to a host with a matching failure domain, or re-provision the drive to erase its failure domain. |
NegativeUnprovisionedCapacity | Weka capacity usage changes detected due to cluster upgrade. | One or more of the filesystems need to be resized in order to reclaim capacity. Contact the Weka Support Team. |
NetworkInterfaceLinkDown | A network interface has a link down status. | Check the connectivity to the down interface and see if there is anything blocking it. |
NoClusterLicense | No license is assigned to the cluster. | Obtain and install a license from get.weka.io. |
NodeBlacklisted | There is a blacklisted node in the cluster. | Use |
NodeDisconnected | A node is disconnected from the cluster. | Check network connectivity to make sure the node can communicate with the cluster. |
NodeNetworkUnstable | A node seems to have an unstable network. As a consequence, it has been fenced by the system and does not contribute resources to the Weka cluster. | Make sure there is no network connectivity issue in the cluster. Contact the Weka Support Team if the issue is not resolved. |
NodeRDMANotActive | RDMA is supported on the host but it is inactive. | Make sure Mellanox OFED version 4.6 or higher is properly installed on the host and there is at least one RDMA capable device. |
NodeTieringConnectivity | A node cannot connect to an object-store. | Check connectivity with the object store and make sure the node can communicate with it. |
NotEnoughActiveDrives | Reduced data protection. | Check connectivity, host status, replace problematic drives, and/or expand the cluster with new failure domains. |
OfedVersions | A host Mellanox OFED version ID does not match the one used by the Weka container. | Install a supported OFED. If the current version needs to be retained or the alert continues after a supported version is installed, contact the Weka Support Team. |
PartialConnectivityTrackingDisabled | The cluster's partial connectivity tracking mechanism is disabled, affecting the cluster's self-healing capabilities. | Contact the Weka Support Team. |
PartiallyConnectedNode | A node seems to be only partially connected. | Make sure there is no network connectivity issue. Contact the Weka Support Team if the issue is not resolved. |
PassedClientsAvailabilityThreshold | Reached Clients Limit | Add more backend servers to the cluster, check whether backends are down, or disconnect some clients. |
PerformanceDegradedLowRAM | The host is running low on RAM. Additional metadata entries are swapped to the SSD. This might impact performance. | Make sure all the compute hosts and processes are up, add more hosts to the Weka cluster, or increase the configured RAM of the cluster backend hosts. |
QuotasHardLimitReached | There are directory quotas that have reached their hard limit. | Run |
QuotasSoftLimitReached | There are directory quotas that have reached their soft limit. | Run |
ResourcesNotApplied | There are changes to host resources that are not applied in the Weka cluster. | To apply changes run |
SSDCapacityDiscrepancy | Used SSD capacity does not match the expected range. | Monitor the stability of the compute processes and contact the Weka Support Team. |
SystemDefinedTLS | The Weka cluster uses an auto-generated self-signed certificate. | Run |
TLSCertificateExpired | TLS Certificate has expired. | Replace the current certificate using |
TLSCertificateExpiresSoon | TLS Certificate is about to expire. | Replace the current certificate using |
TieredFilesystemOverfillingSSD | Tiered filesystems' SSD Capacity overfilling. | Resolve tiering connectivity issues or increase the upload bandwidth. |
TraceDumperDown | Trace dumper is down. | Contact the Weka Support Team to restart the trace dumper. |
TracesDisabled | Traces are disabled. | To turn them back on, contact the Weka Support Team. |
TracesFreezePeriodActive | A trace freeze period is active. | Some traces can be protected from rotating for a period of time to debug the system. This is done by the Weka Support Team when needed. If the issue persists after the case has been resolved, contact the Weka Support Team. |
UdpModePerformanceWarning | The backend host is configured in UDP mode. | If this is a misconfiguration use |
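Several of the alerts above describe host-level conditions that can be inspected with ordinary operating system facilities before contacting support. The following is a minimal Python sketch, not part of the Weka tooling, that mirrors three of them: the 30-second limit from the ClockSkew row, the automatic NUMA balancing state from the BackendNumaBalancingEnabled and ClientNumaBalancingEnabled rows, and the /dev/watchdog device from the DedicatedWatchdog row. All function names are hypothetical, and the sysctl path /proc/sys/kernel/numa_balancing is the standard Linux location, assumed here rather than taken from this page.

```python
import os
from datetime import datetime, timedelta

# Maximum permitted difference, taken from the ClockSkew row above.
MAX_CLOCK_SKEW = timedelta(seconds=30)

# Standard Linux locations (assumptions; not Weka-specific settings).
NUMA_BALANCING_PATH = "/proc/sys/kernel/numa_balancing"
WATCHDOG_PATH = "/dev/watchdog"


def clock_skew_exceeded(host_time, leader_time, max_skew=MAX_CLOCK_SKEW):
    """Return True if the two clocks differ by more than max_skew."""
    return abs(host_time - leader_time) > max_skew


def numa_balancing_enabled(sysctl_text):
    """Interpret the sysctl file contents; 0 means balancing is disabled."""
    return int(sysctl_text.strip()) != 0


def read_numa_balancing(path=NUMA_BALANCING_PATH):
    """Return True/False, or None if the setting cannot be read or parsed."""
    try:
        with open(path) as f:
            return numa_balancing_enabled(f.read())
    except (OSError, ValueError):
        return None


def watchdog_present(path=WATCHDOG_PATH):
    """Return True if a watchdog device node exists at the given path."""
    return os.path.exists(path)


if __name__ == "__main__":
    print("NUMA balancing enabled:", read_numa_balancing())
    print("Watchdog device present:", watchdog_present())
```

The checks only read state and never change it, so the sketch is safe to run on a client or backend host. The corrective actions remain those listed in the table.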