Sunday, July 26, 2015

Setting up IPoIB (IP over Infiniband) on Ubuntu 14.04 using Mellanox MT25208


So, my wife recently agreed to the purchase of a new server. It needed a storage focus and a direct connection to the other server. I have been working with Infiniband networks at work and thought it would be fun to set up a small Infiniband network at home.
In my excitement to purchase, I bought two Mellanox MT25208 cards without doing my research. Apparently these cards prefer to be in a RHEL or CentOS system; the bulk of the drivers seem to be written for those systems. Panicked, I began searching for any way to get them to work without having to reimage my servers.

The solution turned out to be fairly simple thanks to this post: 
http://www.servethehome.com/configure-ipoib-mellanox-hcas-ubuntu-12041-lts/
I have listed their steps with any changes I needed, along with my config and testing.

I used these parts:
1. 1x Infiniband 10GBs 4X CX4 to CX4 Cable SAS M/M 0.5M LATCH Type DDR
2. 2x Mellanox MHGA28-XTC InfiniHost III Ex 20Gb Dual-Port InfiniBand Adapter Card
I have also ordered a new cable to try to get 20Gbps:
3. 1x Mellanox MC1104130-002 MicroGiGaCN latch, 30 AWG, 3 meter Copper Cable

  • Install the cards and connect them.
  • Verify that the cards can be seen by both servers:
    • lspci -v | grep Mell
      05:00.0 InfiniBand: Mellanox Technologies MT25208 [InfiniHost III Ex] (rev 20)
      Subsystem: Mellanox Technologies MT25208 [InfiniHost III Ex]
  • Install opensm on both servers: 
    • sudo apt-get install opensm
  • Load the required modules:
    • sudo modprobe ib_mthca    # InfiniHost III driver (ibstat reports 'mthca0'); mlx4_ib is for ConnectX cards
    • sudo modprobe ib_umad
    • sudo modprobe ib_ipoib
  • Verify that the cards can see each other:
    • ibstat
      CA 'mthca0'
      CA type: MT25208
      Number of ports: 2
      Firmware version: 5.3.0
      Hardware version: 20
      Node GUID: 0x0002c9020023828
      System image GUID: 0x0002c902002382b
      Port 1:
      State: Active 
      Physical state: LinkUp 
      Rate: 20 
      Base lid: 2 
      LMC: 0 
      SM lid: 2 
      Capability mask: 0x02510a6a 
      Port GUID: 0x0002c9020023829 
      Link layer: InfiniBand
      Port 2:
      State: Down 
      Physical state: Polling 
      Rate: 10 
      Base lid: 0 
      LMC: 0 
      SM lid: 0 
      Capability mask: 0x02510a68 
      Port GUID: 0x0002c902002382a 
      Link layer: InfiniBand
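
If you would rather script this check than read the output by eye, here is a minimal sketch. It assumes the ibstat output format shown above, and ib_active_ports is a name I made up for illustration, not a tool that ships with the InfiniBand packages.

```shell
#!/bin/sh
# ib_active_ports (hypothetical helper): read `ibstat` output on stdin and
# print the number of every port whose State is Active.
# Exit status is 0 if at least one port is active, 1 otherwise.
ib_active_ports() {
    awk '
        # "Port 1:" starts a port section; "Port GUID:" does not match.
        /^[[:space:]]*Port [0-9]+:/ { port = $2; sub(":", "", port) }
        # "State:" is the logical state; "Physical state:" does not match.
        /^[[:space:]]*State:/ { if ($2 == "Active") { print port; found = 1 } }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a host with the card installed:
#   ibstat | ib_active_ports
```

With the output above this prints 1, since only port 1 is cabled. A port only reaches Active once a subnet manager (opensm here) has configured it, so an empty result usually means opensm is not running or the cable is not seated.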

  • Verify that the system can see the cards:
    • ifconfig -a
      ib0       Link encap:UNSPEC  HWaddr 80-00-04-00-00-00-00-00-00-00-00-00-00  
                BROADCAST MULTICAST  MTU:2044  Metric:1
                RX packets:0 errors:0 dropped:0 overruns:0 frame:0
                TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:256 
                RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

      ib1       Link encap:UNSPEC  HWaddr 80-00-04-00-00-00-00-00-00-00-00-00-00  
                BROADCAST MULTICAST  MTU:2044  Metric:1
                RX packets:0 errors:0 dropped:0 overruns:0 frame:0
                TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:256 
                RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
  • If that all looks correct, we can make the setup persistent:
    • sudo vim /etc/modules
      # /etc/modules: kernel modules to load at boot time.
      #
      # This file contains the names of kernel modules that should be loaded
      # at boot time, one per line. Lines beginning with "#" are ignored.
      # Parameters can be specified after the module name.

      ib_mthca
      ib_umad
      ib_ipoib
  • Then we will set up the network:
    • sudo vim /etc/network/interfaces
      Add ib0 to the auto line (mine became: auto ib0 lo br0).

      Then add a section for the IB setup:
      # IB network setup
      iface ib0 inet static
              address 10.5.5.2
              netmask 255.255.255.0
              post-up echo connected > /sys/class/net/ib0/mode
              post-up /sbin/ifconfig $IFACE mtu 65520

      Set the address to something outside of your regular network. I used the 10.5.5.0/24 network so there won't be any confusion.
  • Set up the other server the same way with a different address.
  • Reboot both servers.
  • Verify that the connection is active:
    • ifconfig -a
      ib0       Link encap:UNSPEC  HWaddr 80-00-04-00-00-00-00-00-00-00-00-00-00  
                inet addr:10.5.5.2  Bcast:10.5.5.255  Mask:255.255.255.0
                inet6 addr: fe80::202:c902:28:39a5/64 Scope:Link
                UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
                RX packets:49 errors:0 dropped:0 overruns:0 frame:0
                TX packets:56 errors:0 dropped:9 overruns:0 carrier:0
                collisions:0 txqueuelen:256 
                RX bytes:5850 (5.8 KB)  TX bytes:8387 (8.3 KB)
  • Now for some tests. With iperf -s running on the server at 10.5.5.2, run the client from the other server:
    • iperf -c 10.5.5.2 -i 1 -l 65520
      ------------------------------------------------------------
      Client connecting to 10.5.5.2, TCP port 5001
      TCP window size: 2.50 MByte (default)
      ------------------------------------------------------------
      [  3] local 10.5.5.1 port 38358 connected with 10.5.5.2 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  3]  0.0- 1.0 sec  1.03 GBytes  8.84 Gbits/sec
      [  3]  1.0- 2.0 sec  1.12 GBytes  9.63 Gbits/sec
      [  3]  2.0- 3.0 sec  1.12 GBytes  9.60 Gbits/sec
      [  3]  3.0- 4.0 sec  1.13 GBytes  9.69 Gbits/sec
      [  3]  4.0- 5.0 sec  1.10 GBytes  9.45 Gbits/sec
      [  3]  5.0- 6.0 sec  1.09 GBytes  9.40 Gbits/sec
      [  3]  6.0- 7.0 sec  1.09 GBytes  9.35 Gbits/sec
      [  3]  7.0- 8.0 sec  1.12 GBytes  9.65 Gbits/sec
      [  3]  8.0- 9.0 sec  1.11 GBytes  9.57 Gbits/sec
      [  3]  9.0-10.0 sec  1.12 GBytes  9.60 Gbits/sec
      [  3]  0.0-10.0 sec  11.0 GBytes  9.48 Gbits/sec
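
The two post-up lines in the interfaces stanza (connected mode and the 65520 MTU) can also be checked straight from sysfs after the reboot, without reading through ifconfig. This is a rough sketch: check_ipoib is a hypothetical helper name, and its optional second argument is my own addition so the function can be tried against a scratch directory on a machine without IB hardware.

```shell
#!/bin/sh
# check_ipoib IFACE [SYSFS_ROOT]   (hypothetical helper)
# Print the IPoIB mode and MTU of IFACE as read from sysfs.
# Returns 0 only if the mode is "connected" and the MTU is 65520.
# SYSFS_ROOT defaults to /sys/class/net; the override is an assumption
# added purely to make the function testable without hardware.
check_ipoib() {
    iface=$1
    root=${2:-/sys/class/net}
    if [ ! -r "$root/$iface/mode" ]; then
        echo "$iface: no IPoIB mode file under $root" >&2
        return 1
    fi
    mode=$(cat "$root/$iface/mode")
    mtu=$(cat "$root/$iface/mtu")
    echo "$iface mode=$mode mtu=$mtu"
    [ "$mode" = "connected" ] && [ "$mtu" -eq 65520 ]
}

# Usage on either server:
#   check_ipoib ib0 && echo "ib0 is configured as intended"
```

If the mode still reads datagram, the post-up echo did not run, and the interface will stay at the 2044 MTU seen before the network was configured.
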
So far so good; I'm happy with those results. I will run some additional testing after the new cable arrives. It should be closer to 20Gbps at that point.
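
One thing to keep in mind when reading these numbers: the Rate that ibstat reports is the signaling rate, and SDR and DDR InfiniBand links use 8b/10b encoding, so only 80% of it is available for data. A quick sketch of the arithmetic (ib_data_rate is just an illustrative name):

```shell
#!/bin/sh
# ib_data_rate SIGNAL_GBPS   (hypothetical helper)
# Print the usable data rate in Gbit/s for an InfiniBand link,
# assuming 8b/10b encoding (8 data bits per 10 bits on the wire).
ib_data_rate() {
    echo $(( $1 * 8 / 10 ))
}

ib_data_rate 10   # 4X SDR link: prints 8
ib_data_rate 20   # 4X DDR link: prints 16
```

So a 20Gbps DDR link tops out at 16 Gbit/s of data before IPoIB and TCP overhead, which is the ceiling to compare the next iperf run against.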
