How to install Mellanox driver

Environment

Kernel version: 4.18.14, Ubuntu 18.04.1 LTS (The 4.18 kernel is not officially supported. The driver needs to be modified to match the kernel changes.)

Kernel version: 4.15, Ubuntu 18.04.1 LTS

MLNX driver: MLNX_OFED_LINUX-4.4-2.0.7.0-ubuntu18.04-x86_64

Prerequisites

Refer to: https://docs.mellanox.com/display/MLNXOFEDv451010/Supported+Platforms+and+Operating+Systems

On Ubuntu 18.04:

sudo apt-get install perl dpkg autotools-dev autoconf libtool make1.10 automake m4 dkms debhelper tcl tcl8.5 chrpath swig graphviz tcl-dev tcl8.5-dev tk-dev tk8.5-dev bison flex dpatch zlib1g-dev curl libcurl4-gnutls-dev python-libxml2 libvirt-bin libvirt0 libnl-3-dev libglib2.0-dev libgfortran3

Kernel Compile

  1. Compile the kernel in the root of its source tree so that drivers can later be built against it with their Makefiles.
     1. sudo make menuconfig
     2. sudo make -j8
     3. sudo make modules_install
     4. sudo make install
  2. Reboot into the new kernel.
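
After the reboot it is worth confirming that the machine actually came up on the newly built kernel before installing any driver. The following is a minimal sketch, not part of the original procedure: the helper name kernel_matches is hypothetical, and it only compares the output of uname -r against the version you expect (for example 4.18.14).

```shell
# Hypothetical helper: succeeds when the running kernel release equals the
# expected version, or extends it with a "-suffix" or ".sublevel".
# Usage: kernel_matches <expected> [running]   (running defaults to uname -r)
kernel_matches() {
  expected="$1"
  running="${2:-$(uname -r)}"
  case "$running" in
    "$expected"|"$expected"-*|"$expected".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (uncomment to use):
# kernel_matches 4.18.14 && echo "running the new kernel" || echo "still on $(uname -r)"
```

The match is deliberately stricter than a plain prefix test, so an expected version of 4.1 does not accept a running 4.18.14 kernel.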

Install MLNX_OFED driver

  1. Download the driver from http://mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers
  2. Install it as shown below (for Ubuntu).

./mlnxofedinstall --without-dkms --add-kernel-support --kernel `uname -r` --without-fw-update --force
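
Because --add-kernel-support rebuilds the driver packages against the kernel given with --kernel, the build tree of that kernel has to be available. The sketch below is an assumption about a useful preflight check, not something the installer requires you to run: has_kernel_build_dir is a hypothetical helper that tests for the build entry which make modules_install normally creates under /lib/modules/<release>.

```shell
# Hypothetical preflight check: succeeds when the kernel build tree exists.
# Usage: has_kernel_build_dir [release] [modules_root]
#   release      defaults to the running kernel (uname -r)
#   modules_root defaults to /lib/modules
has_kernel_build_dir() {
  release="${1:-$(uname -r)}"
  modules_root="${2:-/lib/modules}"
  # "build" is usually a symlink to the kernel source tree; -d follows it.
  [ -d "$modules_root/$release/build" ]
}

# Example (uncomment to use):
# has_kernel_build_dir || echo "no build tree for $(uname -r); install or build the kernel first"
```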

Other References

Troubleshooting (using kernel 4.18)

Prob) An error occurs during the kernel “make install” step (screen capture not included).

Sol) Remove MLNX_OFED, then update the kernel (kernel install), then reinstall MLNX_OFED.

Prob) The driver build fails with the following compile error:

/tmp/MLNX_OFED_LINUX-4.4-2.0.7.0-4.18.14/mlnx_iso.3007/mlnx-ofed-kernel/mlnx-ofed-kernel-4.4/drivers/scsi/scsi_transport_srp.c: In function 'srp_timed_out':
/tmp/MLNX_OFED_LINUX-4.4-2.0.7.0-4.18.14/mlnx_iso.3007/mlnx-ofed-kernel/mlnx-ofed-kernel-4.4/drivers/scsi/scsi_transport_srp.c:640:24: error: 'BLK_EH_NOT_HANDLED' undeclared (first use in this function); did you mean 'BLK_EH_DONE'?
  BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
                       ^~~~~~~~~~~~~~~~~~
                       BLK_EH_DONE
/tmp/MLNX_OFED_LINUX-4.4-2.0.7.0-4.18.14/mlnx_iso.3007/mlnx-ofed-kernel/mlnx-ofed-kernel-4.4/drivers/scsi/scsi_transport_srp.c:640:24: note: each undeclared identifier is reported only once for each function it appears in
/tmp/MLNX_OFED_LINUX-4.4-2.0.7.0-4.18.14/mlnx_iso.3007/mlnx-ofed-kernel/mlnx-ofed-kernel-4.4/drivers/scsi/scsi_transport_srp.c:641:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^

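Sol) According to https://patchwork.kernel.org/patch/10421063/, BLK_EH_NOT_HANDLED has been replaced by BLK_EH_DONE in the kernel. Modify the driver source code accordingly, replacing BLK_EH_NOT_HANDLED with BLK_EH_DONE in drivers/scsi/scsi_transport_srp.c.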
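
The source change can be sketched as a one-line substitution. This is only an illustration under assumptions: patch_blk_eh is a hypothetical helper, and the location of the unpacked driver source on your machine will differ from the temporary path shown in the error log, so pass the actual path to scsi_transport_srp.c yourself.

```shell
# Hypothetical helper: replace the removed constant with BLK_EH_DONE in one file.
# Usage: patch_blk_eh <path-to-scsi_transport_srp.c>
patch_blk_eh() {
  src="$1"
  # -i.orig edits in place and keeps the untouched file as <src>.orig,
  # so the change can be reviewed with diff or reverted.
  sed -i.orig 's/BLK_EH_NOT_HANDLED/BLK_EH_DONE/g' "$src"
}

# Example (uncomment and adjust the path to your unpacked driver source):
# patch_blk_eh drivers/scsi/scsi_transport_srp.c
# diff -u drivers/scsi/scsi_transport_srp.c.orig drivers/scsi/scsi_transport_srp.c
```

After patching, rerun the driver installation so that the modified source is rebuilt.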

Troubleshooting (using kernel 4.14 (nova))

Prob) modprobe svcrdma fails with an unknown symbol error.

Reference: https://community.mellanox.com/message/9049

Sol)


Linux Inbox Driver

Refer to the Mellanox user manual for Linux Inbox Driver. (http://www.mellanox.com/page/inbox_drivers)

  1. Uninstall MLNX_OFED driver first.
  2. Install the following packages.

yum install libibverbs librdmacm libibcm libibmad libibumad libmlx4 libmlx5 opensm ibutils infiniband-diags srptools perftest mstflint librdmacm-utils libibverbs-utils -y

     Note that the last two packages have different names from those given in the User Manual.
  3. Reboot the system and check the status of the device.
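
One way to check the status after the reboot is to confirm that the inbox Mellanox kernel modules are loaded. The sketch below is an assumption about how to do that, not a step from the User Manual: loaded_mlx_modules is a hypothetical filter that reads lsmod output on standard input and prints the module names starting with mlx4_ or mlx5_ (for example mlx4_core or mlx5_core).

```shell
# Hypothetical filter: print loaded mlx4_*/mlx5_* modules from lsmod output.
# Usage: lsmod | loaded_mlx_modules
loaded_mlx_modules() {
  # NR > 1 skips the "Module  Size  Used by" header line.
  awk 'NR > 1 && $1 ~ /^mlx[45]_/ { print $1 }'
}

# Example (uncomment to use):
# lsmod | loaded_mlx_modules
```

An empty result suggests that no Mellanox module was loaded. Tools from the infiniband-diags package installed above, such as ibstat, can then be used to inspect the port state.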

Change Link Rate

Reference: https://community.mellanox.com/s/question/0D51T00006RXcmRSAT/how-to-change-speed-from-fdr-to-qdr-using-ibportstate-command-in-centos-62

Change the link rate from 56 Gb/s to 40 Gb/s (example: Base LID=5, Port=2). Use your own LID and port number. (ibportstate <LID> <Port> shows the current state.)

ibportstate 5 2 fdr10 0 espeed 30
ibportstate 5 2 reset

(The reset takes some time to complete.)
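
To avoid retyping the LID and port, the two commands can be generated from parameters. This is a small sketch under assumptions: link_rate_cmds is a hypothetical helper that only prints the same two ibportstate invocations shown above for a given LID and port, and it does not touch the fabric until its output is explicitly passed to a shell.

```shell
# Hypothetical helper: print the commands that switch a port from 56 to 40.
# Usage: link_rate_cmds <LID> <Port>
link_rate_cmds() {
  lid="$1"
  port="$2"
  echo "ibportstate $lid $port fdr10 0 espeed 30"
  echo "ibportstate $lid $port reset"
}

# Example: review the commands first, then run them (uncomment to use).
# link_rate_cmds 5 2
# link_rate_cmds 5 2 | sh
```

Printing before executing makes it easy to double-check the LID and port, since resetting the wrong port interrupts traffic on it.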