Manually prepare the system for WEKA configuration
If the system is not prepared using the WMS, perform this procedure to set the networking and other tasks before configuring the WEKA cluster.
Once the hardware and software prerequisites are met, prepare the backend servers and clients for the WEKA system configuration.
This preparation consists of the following steps:
Install NIC drivers
Enable SR-IOV (when required)
Set up ConnectX cards
Configure the networking
Configure the HA networking
Verify the network configuration
Configure the clock synchronization
Disable the NUMA balancing
Enable kdump and set kernel panic reboot timer
Disable swap (if any)
Validate the system preparation
Related topics
Prerequisites and compatibility
1. Install NIC drivers
To install Mellanox OFED, see NVIDIA Documentation - Installing Mellanox OFED.
To install the Broadcom driver, see Broadcom adapter setup for WEKA system.
To install the Intel driver, see Latest Drivers & Software downloads.
2. Enable SR-IOV
Single Root I/O Virtualization (SR-IOV) enablement is mandatory in the following cases:
The servers are equipped with Intel NICs.
When working with client VMs, a physical NIC's virtual functions (VFs) must be exposed to the virtual NICs.
3. Set up ConnectX cards
Configure firmware parameters: All ConnectX ports used directly with WEKA servers and clients require specific firmware settings for optimal performance. Set the following non-default parameters:
ADVANCED_PCI_SETTINGS=1
PCI_WR_ORDERING=1
Use the following command to apply these settings to all MLX devices:
mst start && for MLXDEV in /dev/mst/* ; do mlxconfig -d ${MLXDEV} -y set ADVANCED_PCI_SETTINGS=1 PCI_WR_ORDERING=1; done
Set link type: Certain ConnectX VPI cards require modification of the link type, to specifically set the port to use InfiniBand or Ethernet networking. If applicable, set the port mode with the following command, where 1=InfiniBand and 2=Ethernet:
mlxconfig -y -d /dev/mst/<dev> set LINK_TYPE_P<1,2>=<1,2>
For example, the following command sets port 2 to InfiniBand:
mlxconfig -y -d /dev/mst/<dev> set LINK_TYPE_P2=1
Reboot the system: A reboot is required after applying the firmware settings to ensure the changes take effect.
Related information
For additional details, refer to the NVIDIA ConnectX documentation.
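After the reboot, it can be useful to confirm that both firmware parameters took effect. The following is a minimal sketch, not an official WEKA or NVIDIA procedure. The function name check_connectx_fw is hypothetical, and the parsing assumes that mlxconfig query prints each parameter on its own line as a name followed by a value ending in the numeric setting in parentheses, such as True(1).

```shell
#!/bin/sh
# Hypothetical helper: reads `mlxconfig -d <dev> query` output on stdin and
# succeeds only if both parameters from this section are set to 1.
# Assumption: each parameter appears as "NAME   Value(N)" on its own line.
check_connectx_fw() {
    awk '
        $1 == "ADVANCED_PCI_SETTINGS" && $NF ~ /\(1\)$/ { adv = 1 }
        $1 == "PCI_WR_ORDERING"       && $NF ~ /\(1\)$/ { ord = 1 }
        END { exit !(adv && ord) }
    '
}

# Usage (requires mst and a ConnectX device, so it is left as a comment):
#   mlxconfig -d /dev/mst/<dev> query | check_connectx_fw \
#       && echo "firmware settings OK" \
#       || echo "firmware settings missing"
```

If the query output on your system uses a different layout, adjust the awk patterns accordingly.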
4. Configure the networking
Ethernet configuration
The following ifcfg script example is a reference for configuring the Ethernet interface.
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="none"
DEFROUTE="no"
IPV4_FAILURE_FATAL="no"
IPV6INIT="no"
IPV6_AUTOCONF="no"
IPV6_DEFROUTE="no"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="enp24s0"
DEVICE="enp24s0"
ONBOOT="yes"
NM_CONTROLLED=no
IPADDR=192.168.1.1
NETMASK=255.255.0.0
MTU=9000
MTU 9000 (jumbo frame) is recommended for the best performance. Refer to your switch vendor documentation for jumbo frame configuration.
Bring the interface up using the following command:
# ifup enp24s0
InfiniBand configuration
InfiniBand network configuration normally includes Subnet Manager (SM), but the procedure involved is beyond the scope of this document. However, it is important to be aware of the specifics of your SM configuration, such as partitioning and MTU, because they can affect the configuration of the endpoint ports in Linux. For best performance, MTU of 4092 is recommended.
Refer to the following ifcfg script when the IB network only has the default partition, i.e., "no pkey":
TYPE=Infiniband
ONBOOT=yes
BOOTPROTO=static
STARTMODE=auto
USERCTL=no
NM_CONTROLLED=no
DEVICE=ib1
IPADDR=192.168.1.1
NETMASK=255.255.0.0
MTU=4092
Bring the interface up using the following command:
# ifup ib1
Verify that the “default partition” connection is up, with all the attributes set:
# ip a s ib1
4: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP group default qlen 256
link/infiniband 00:00:03:72:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a8:09:48
brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
inet 10.0.20.84/24 brd 10.0.20.255 scope global noprefixroute ib1
valid_lft forever preferred_lft forever
Define the NICs with ignore-carrier
ignore-carrier is a NetworkManager configuration option. When set, it keeps the network interface up even if the physical link is down. It’s useful when services need to bind to the interface address at boot.
Open the /etc/NetworkManager/NetworkManager.conf file to edit it.
Under the [main] section, add one of the following lines depending on the operating system:
For some versions of Rocky Linux, RHEL, and CentOS:
ignore-carrier=*
For some other versions:
ignore-carrier=<device-name1>,<device-name2>
Replace <device-name1>,<device-name2> with the actual device names you want to apply this setting to.
Example for Rocky Linux and RHEL 8.7:
[main]
ignore-carrier=*
Example for some other versions:
[main]
ignore-carrier=ib0,ib1
Restart the NetworkManager service for the changes to take effect.
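Before restarting the service, a quick check that the option really sits under the [main] section can save a debugging round. The sketch below is an illustration only: get_ignore_carrier is a hypothetical helper, and it assumes the section header is written exactly as [main] with no trailing whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print the ignore-carrier value found in the [main]
# section of a NetworkManager.conf-style file (path passed as $1).
get_ignore_carrier() {
    awk -F= '
        /^\[/ { in_main = ($0 == "[main]"); next }   # track the current section
        in_main {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)                    # tolerate spaces around "="
            gsub(/[ \t]/, "", val)
            if (key == "ignore-carrier") print val
        }
    ' "$1"
}

# Usage (run as root on the target server, so it is left as a comment):
#   get_ignore_carrier /etc/NetworkManager/NetworkManager.conf
#   systemctl restart NetworkManager
```

An empty result means the option is missing from [main], even if it appears in another section of the file.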
5. Configure dual-network links with policy-based routing
The following steps provide guidance for configuring dual-network links with policy-based routing on Linux systems. Adjust IP addresses and interface names according to your environment.
General Settings in /etc/sysctl.conf
Open the /etc/sysctl.conf file using a text editor.
Add the following lines at the end of the file to set minimal configurations per InfiniBand (IB) or Ethernet (Eth) interface:
# Minimal configuration, set per IB/Eth interface
net.ipv4.conf.ib0.arp_announce = 2
net.ipv4.conf.ib1.arp_announce = 2
net.ipv4.conf.ib0.arp_filter = 1
net.ipv4.conf.ib1.arp_filter = 1
net.ipv4.conf.ib0.arp_ignore = 0
net.ipv4.conf.ib1.arp_ignore = 0
# As an alternative set for all interfaces by default
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.default.arp_ignore = 0
Save the file.
Apply the new settings by running:
sysctl -p /etc/sysctl.conf
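When the interface names differ from ib0 and ib1, the per-interface lines can be generated rather than typed by hand. This is a minimal sketch under that assumption. The function name gen_arp_sysctl is hypothetical, and the values are the ones listed in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the per-interface ARP settings from this section
# for every interface name given as an argument.
gen_arp_sysctl() {
    for ifname in "$@"; do
        printf 'net.ipv4.conf.%s.arp_announce = 2\n' "$ifname"
        printf 'net.ipv4.conf.%s.arp_filter = 1\n' "$ifname"
        printf 'net.ipv4.conf.%s.arp_ignore = 0\n' "$ifname"
    done
}

# Usage (appends to the system file, so it is left as a comment):
#   gen_arp_sysctl ib0 ib1 >> /etc/sysctl.conf
#   sysctl -p /etc/sysctl.conf
```

Review the generated lines before appending them, because the helper does not check for existing entries.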
RHEL/Rocky/CentOS routing configuration using the network scripts
Navigate to /etc/sysconfig/network-scripts/.
Create the file /etc/sysconfig/network-scripts/route-mlnx0 with the following content:
10.90.0.0/16 dev mlnx0 src 10.90.0.1 table weka1
default via 10.90.2.1 dev mlnx0 table weka1
Create the file /etc/sysconfig/network-scripts/route-mlnx1 with the following content:
10.90.0.0/16 dev mlnx1 src 10.90.1.1 table weka2
default via 10.90.2.1 dev mlnx1 table weka2
Create the files /etc/sysconfig/network-scripts/rule-mlnx0 and /etc/sysconfig/network-scripts/rule-mlnx1 with the following content:
table weka1 from 10.90.0.1
table weka2 from 10.90.1.1
Open /etc/iproute2/rt_tables and add the following lines:
100 weka1
101 weka2
Save the changes.
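The route and rule files refer to the tables by name, so they only work if weka1 and weka2 are defined in rt_tables. The following sketch checks for that. The helper name has_rt_table is hypothetical, and it assumes the usual rt_tables layout of a numeric ID followed by a name, with # starting a comment.

```shell
#!/bin/sh
# Hypothetical helper: succeed if routing table name $2 is defined in the
# rt_tables-style file $1 (format: "<id> <name>", "#" starts a comment).
has_rt_table() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }          # skip comment lines
        $2 == name { found = 1 }
        END { exit !found }
    ' "$1"
}

# Usage (reads system state, so it is left as a comment):
#   for t in weka1 weka2; do
#       has_rt_table /etc/iproute2/rt_tables "$t" || echo "missing table: $t"
#   done
#   ip rule show                 # lists the source-based rules once the interfaces are up
#   ip route show table weka1    # lists the routes installed in the weka1 table
```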
RHEL/Rocky 8+ routing configuration using NetworkManager
For Ethernet (ETH): To set up routing for Ethernet connections, use the following commands:
nmcli connection modify eth1 ipv4.routes "10.10.10.0/24 src=10.10.10.1 table=100" ipv4.routing-rules "priority 101 from 10.10.10.1 table 100"
nmcli connection modify eth2 ipv4.routes "10.10.10.0/24 src=10.10.10.101 table=200" ipv4.routing-rules "priority 102 from 10.10.10.101 table 200"
The first address in each route is the subnet to which the NIC is connected. The address in each routing rule is the IP address of the NIC being configured. In this example, eth1 is set to 10.10.10.1.
For InfiniBand (IB): To configure routing for InfiniBand connections, use the following commands:
nmcli connection modify ib0 ipv4.route-metric 100
nmcli connection modify ib1 ipv4.route-metric 101
nmcli connection modify ib0 ipv4.routes "10.10.10.0/24 src=10.10.10.1 table=100"
nmcli connection modify ib0 ipv4.routing-rules "priority 101 from 10.10.10.1 table 100"
nmcli connection modify ib1 ipv4.routes "10.10.10.0/24 src=10.10.10.101 table=200"
nmcli connection modify ib1 ipv4.routing-rules "priority 102 from 10.10.10.101 table 200"
The first address in each route is the subnet associated with the respective NIC. The address in each routing rule is the IP address of the NIC being configured. In this example, ib0 is set to 10.10.10.1.
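Because the route and the rule for one link must agree on the NIC address and table ID, it may help to derive both commands from a single set of values. The sketch below only prints the commands for review and does not run them. The function name nm_policy_route_cmds and its argument order are assumptions made for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the two nmcli commands for one link.
# Arguments: connection name, subnet (CIDR), NIC IP, table ID, rule priority.
nm_policy_route_cmds() {
    conn=$1 subnet=$2 ip=$3 table=$4 prio=$5
    printf 'nmcli connection modify %s ipv4.routes "%s src=%s table=%s"\n' \
        "$conn" "$subnet" "$ip" "$table"
    printf 'nmcli connection modify %s ipv4.routing-rules "priority %s from %s table %s"\n' \
        "$conn" "$prio" "$ip" "$table"
}

# Usage: print the commands for both links, review them, then run them.
#   nm_policy_route_cmds ib0 10.10.10.0/24 10.10.10.1   100 101
#   nm_policy_route_cmds ib1 10.10.10.0/24 10.10.10.101 200 102
# Modified connection settings generally apply once the connection is
# reactivated, for example with: nmcli connection up ib0
```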
Ubuntu Netplan configuration
Open the Netplan configuration file /etc/netplan/01-netcfg.yaml and adjust it:
network:
  version: 2
  renderer: networkd
  ethernets:
    enp2s0:
      dhcp4: true
      nameservers:
        addresses: [8.8.8.8]
    ib1:
      addresses: [10.222.0.10/24]
      routes:
        - to: 10.222.0.0/24
          via: 10.222.0.10
          table: 100
      routing-policy:
        - from: 10.222.0.10
          table: 100
          priority: 32764
      ignore-carrier: true
    ib2:
      addresses: [10.222.0.20/24]
      routes:
        - to: 10.222.0.0/24
          via: 10.222.0.20
          table: 101
      routing-policy:
        - from: 10.222.0.20
          table: 101
          priority: 32765
      ignore-carrier: true
After adjusting the Netplan configuration file, run the following commands:
ip route add 10.222.0.0/24 via 10.222.0.10 dev ib1 table 100
ip route add 10.222.0.0/24 via 10.222.0.20 dev ib2 table 101
SLES/SUSE configuration
Create /etc/sysconfig/network/ifrule-eth2 with:
ipv4 from 192.168.11.21 table 100
Create /etc/sysconfig/network/ifrule-eth4 with:
ipv4 from 192.168.11.31 table 101
Create /etc/sysconfig/network/scripts/ifup-route.eth2 with:
ip route add 192.168.11.0/24 dev eth2 src 192.168.11.21 table weka1
Create /etc/sysconfig/network/scripts/ifup-route.eth4 with:
ip route add 192.168.11.0/24 dev eth4 src 192.168.11.31 table weka2
Add the weka lines to /etc/iproute2/rt_tables:
100 weka1
101 weka2
Restart the interfaces or reboot the machine:
ifdown eth2; ifdown eth4; ifup eth2; ifup eth4
6. Verify the network configuration
Use a large-size ICMP ping to check the basic TCP/IP connectivity between the interfaces of the servers:
# ping -M do -s 8972 -c 3 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 8972(9000) bytes of data.
8980 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.063 ms
8980 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.087 ms
8980 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.075 ms
--- 192.168.1.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.063/0.075/0.087/0.009 ms
The -M do flag prohibits packet fragmentation, which allows verification of correct MTU configuration between the two endpoints.
-s 8972 is the maximum ICMP payload size that can be transferred with MTU 9000: the IPv4 header (20 bytes) and the ICMP header (8 bytes) take up the remaining 28 bytes.
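The same arithmetic applies to any MTU, which is handy when a link uses a value other than 9000. The sketch below assumes IPv4 without header options. The function name icmp_payload_for_mtu is hypothetical.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header without options) minus 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Usage (needs a reachable peer, so it is left as a comment):
#   ping -M do -s "$(icmp_payload_for_mtu 9000)" -c 3 192.168.1.2
```

For MTU 9000 the function prints 8972, and for MTU 4092 it prints 4064.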
7. Configure the clock synchronization
Time synchronization across servers and networks is good practice and is essential for the stability of the WEKA system. Aligned timestamps in packets and logs also help resolve issues quickly and efficiently.
Configure the clock synchronization software on the backends and clients according to the specific vendor instructions (see your OS documentation), before installing the WEKA software.
8. Disable the NUMA balancing
The WEKA system autonomously manages NUMA balancing, making optimal decisions. Therefore, turning off the Linux kernel’s NUMA balancing feature is a mandatory requirement to prevent extra latencies in operations. It’s crucial that the disabled NUMA balancing remains consistent and isn’t altered by a server reboot.
To persistently disable NUMA balancing, follow these steps:
Open the file located at:
/etc/sysctl.conf
Append the following line:
kernel.numa_balancing=0
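To confirm the setting after loading it, and again after a reboot, the current state can be read from procfs, where 0 means NUMA balancing is disabled. The following is a minimal sketch. The function name numa_balancing_disabled is hypothetical, and the optional path argument exists only to make the check easy to try against a test file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file holding the NUMA balancing state
# contains 0 (disabled). The path defaults to the procfs entry.
numa_balancing_disabled() {
    state_file=${1:-/proc/sys/kernel/numa_balancing}
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "0" ]
}

# Usage (changes and reads system state, so it is left as a comment):
#   sysctl -p /etc/sysctl.conf
#   numa_balancing_disabled || echo "NUMA balancing is still enabled"
```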
9. Enable kdump and set kernel panic reboot timer
Enabling kdump and configuring the kernel panic reboot timer ensures that system crashes leave log files for analysis and automates the system reboot after a kernel panic to minimize downtime.
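The exact kdump setup is distribution-specific and is not covered here, but two generic checks apply on most Linux systems: kdump needs memory reserved through the crashkernel= kernel parameter, and the kernel.panic sysctl holds the number of seconds the kernel waits before rebooting after a panic (0 means no automatic reboot). The sketch below covers the first check. The function name has_crashkernel is hypothetical, and the kdump service name varies by distribution.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line in file $1
# (default /proc/cmdline) reserves memory for a crash kernel.
has_crashkernel() {
    grep -Eq '(^|[[:space:]])crashkernel=' "${1:-/proc/cmdline}"
}

# Usage (reads system state, so it is left as a comment):
#   has_crashkernel || echo "crashkernel= is not set on the kernel command line"
#   sysctl kernel.panic    # seconds before reboot after a panic
```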
10. Disable swap (if any)
WEKA highly recommends that any servers used as backends have no swap configured. This is distribution-dependent but is often a case of commenting out any swap entries in /etc/fstab and rebooting.
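As an illustration of the fstab approach, the sketch below comments out active swap entries in fstab-style text. It is a hedged example, not a WEKA tool: comment_out_swap is a hypothetical helper that works on standard input so the result can be reviewed before replacing the real file.

```shell
#!/bin/sh
# Hypothetical helper: read fstab-style text on stdin and comment out every
# active entry whose filesystem type (third field) is "swap".
comment_out_swap() {
    awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }'
}

# Usage (touches system files, so it is left as a comment):
#   comment_out_swap < /etc/fstab > /tmp/fstab.new   # review before replacing /etc/fstab
#   swapoff -a                                       # turn off active swap devices
#   swapon --show                                    # prints nothing when no swap is active
```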
11. Validate the system preparation
The wekachecker is a tool that validates the readiness of the servers in the cluster before installing the WEKA software.
The wekachecker performs the following validations:
Dataplane IP, jumbo frames, and routing
ssh connection to all servers
Timesync
OS release
Sufficient capacity in /opt/weka
Available RAM
Internet connection availability
NTP
DNS configuration
Firewall rules
WEKA required packages
OFED required packages
Recommended packages
HT/AMT is disabled
The kernel is supported
CPU has a supported AES, and it is enabled
NUMA balancing is disabled
RAM state
XFS FS type installed
Mellanox OFED is installed
IOMMU mode for SSD drives is disabled
rpcbind utility is enabled
SquashFS is enabled
noexec mount option on /tmp
Procedure
Download the wekachecker tarball from https://github.com/weka/tools/blob/master/install/wekachecker and extract it.
From the install directory, run ./wekachecker <hostnames/IPs>
Where: The hostnames/IPs is a space-separated list of all the cluster hostnames or IP addresses connected to the high-speed networking.
Example: ./wekachecker 10.1.1.11 10.1.1.12 10.1.1.4 10.1.1.5 10.1.1.6 10.1.1.7 10.1.1.8
Review the output. If failures or warnings are reported, investigate them and correct them as necessary. Repeat the validation until no important issues are reported. The wekachecker writes any failures or warnings to the file: test_results.txt.
Once the report has no failures or warnings that must be fixed, you can install the WEKA software.
What to do next?
If you can use the WEKA Configurator, go to:
Configure the WEKA cluster using the WEKA Configurator
Otherwise, go to:
Manually configure the WEKA cluster using the resources generator