Advanced LACP Configuration Using esxcli

Introduction

Link Aggregation Control Protocol (LACP) allows a network device to negotiate automatic bundling of links by sending LACP packets to its peer. End devices with LACP enabled (in this case, ESXi and the physical switch) exchange frames called LACPDUs. The interval between LACPDUs depends on which LACP timer is in use: 1 second for fast and 30 seconds for slow. Why do we need LACP? The ultimate goal of LACP is to detect LAG misconfiguration and wiring errors. For more information about LAG and LACP, refer here.

LAGs have been used on ESXi servers for a long time, but without LACP, as it wasn’t supported. LACP was finally introduced in vSphere 5.1, but due to the SSO complexity and bugs, I excluded vSphere 5.1 from my list. With the introduction of the stable vSphere 5.5, as well as its enhanced LACP features, the decision was made to upgrade and start configuring LACP. For more information about the enhanced LACP features, refer to this link.

Enhanced LACP is only supported on dvSwitch 5.5, which means that earlier versions of the dvSwitch need to be upgraded. This requires vCenter Server and ESXi to be upgraded as well. For more information about upgrading and configuring LACP, there is a step-by-step guide written by Chris Wahl, which can be found here; I strongly recommend it.

In this blog post, I will go through advanced LACP configuration using the esxcli command. Most of it is done in an SSH shell, so be prepared!

Advanced Configuration

Assuming LACP is configured, let’s confirm it is operating. First of all, log in to the ESXi server where LACP is configured using root credentials and run esxcli network vswitch dvs vmware lacp:

~ # esxcli network vswitch dvs vmware lacp
Usage: esxcli network vswitch dvs vmware lacp {cmd} [cmd options]

Available Namespaces:
 config Command to get LACP configuration
 stats Command to get LACP protocol statistics
 status Command to get LACP port status
 timeout Command to set LACP LAG timeout

There are four available namespaces, and I will go through each of them:
  • config get
  • stats get
  • status get
  • timeout set

esxcli network vswitch dvs vmware lacp config get

The config get command shows the current overall LACP configuration:

  • Name of the dvSwitch
  • Name of the LAG
  • ID of the LAG
  • NICs which are in LAG
  • Status, i.e. enabled or disabled
  • Mode, i.e. active or passive
  • Load balancing algorithm (there are 20 algorithms available)

An example output is shown below.

~ # esxcli network vswitch dvs vmware lacp config get

DVS Name          LAG Name               LAG ID  NICs           Enabled  Mode    Load balance
----------------  ---------------------  ------  -------------  -------  ------  ------------
dvSwitch_TEST     default_uplink_pg_lag       0  vmnic0,vmnic1     true  Active  ---
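
If you would rather check this from a script than by eye, the table can be split into fields. The following is a minimal Python sketch, not something esxcli provides: it assumes the output has already been captured as text, that columns are separated by at least two spaces, and that no cell is empty. The names split_cells and parse_config_get are made up for illustration.

```python
import re

# Sample copied from the config get output above.
SAMPLE = """\
DVS Name          LAG Name               LAG ID  NICs           Enabled  Mode    Load balance
----------------  ---------------------  ------  -------------  -------  ------  ------------
dvSwitch_TEST     default_uplink_pg_lag       0  vmnic0,vmnic1     true  Active  ---
"""


def split_cells(line):
    # Assumption: columns are separated by two or more spaces, while a
    # single space stays inside a cell (e.g. "LAG ID").
    return re.split(r"\s{2,}", line.strip())


def parse_config_get(text):
    """Return one dict per LAG, keyed by the column headers."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = split_cells(lines[0])
    lags = []
    for line in lines[2:]:  # lines[1] is the ruler of dashes
        row = dict(zip(header, split_cells(line)))
        row["NICs"] = row["NICs"].split(",")
        row["Enabled"] = row["Enabled"] == "true"
        lags.append(row)
    return lags


lags = parse_config_get(SAMPLE)
for lag in lags:
    print(lag["DVS Name"], lag["LAG Name"], lag["NICs"], lag["Mode"])
```

Splitting on runs of spaces keeps the sketch short; if a cell could ever be empty, slicing by the column positions of the dashed ruler would be the safer choice.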

esxcli network vswitch dvs vmware lacp stats get

The stats get command shows real-time counts of LACPDUs sent and received. It can be used to confirm that LACPDUs are being exchanged with the physical switch. An example is shown below.

~ # esxcli network vswitch dvs vmware lacp stats get
DVSwitch          LAGID  NIC     Rx Errors  Rx LACPDUs  Tx Errors  Tx LACPDUs
----------------  -----  ------  ---------  ----------  ---------  ----------
dvSwitch_TEST         0  vmnic0          0          10          0         116
dvSwitch_TEST         0  vmnic1          0          10          0         116

These counters increase at a rate set by the LACP timer, i.e. slow or fast. This will be discussed later in this post.
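
To flag errors automatically, the counters can be read into a dictionary. This is a rough Python sketch that assumes the output looks like the example above; FIELDS, parse_stats_get and nics_with_errors are illustrative names, not part of any VMware tooling.

```python
# Sample copied from the stats get output above.
SAMPLE = """\
DVSwitch          LAGID  NIC     Rx Errors  Rx LACPDUs  Tx Errors  Tx LACPDUs
----------------  -----  ------  ---------  ----------  ---------  ----------
dvSwitch_TEST         0  vmnic0          0          10          0         116
dvSwitch_TEST         0  vmnic1          0          10          0         116
"""

# Column names chosen for this sketch, in the order the columns appear.
FIELDS = ("dvswitch", "lag_id", "nic",
          "rx_errors", "rx_lacpdus", "tx_errors", "tx_lacpdus")


def parse_stats_get(text):
    """Return {nic: row} with the counters converted to integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly seven cells and a numeric LAG ID, which
        # skips both the header row and the ruler of dashes.
        if len(parts) != len(FIELDS) or not parts[1].isdigit():
            continue
        row = dict(zip(FIELDS, parts))
        for key in ("lag_id",) + FIELDS[3:]:
            row[key] = int(row[key])
        stats[row["nic"]] = row
    return stats


def nics_with_errors(stats):
    """List the NICs whose Rx or Tx error counter is non-zero."""
    return sorted(nic for nic, row in stats.items()
                  if row["rx_errors"] or row["tx_errors"])


stats = parse_stats_get(SAMPLE)
print("NICs with LACP errors:", nics_with_errors(stats))
```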

esxcli network vswitch dvs vmware lacp status get

The status get command shows the detailed LACP configuration. The most important information is:

  • LACP timer, fast or slow
  • LACP mode, active or passive

An example output is below; take a close look at the Flags lines.

/var/log # esxcli network vswitch dvs vmware lacp status get

dvSwitch_TEST
   DVSwitch: dvSwitch_TEST
   Flags: S - Device is sending Slow LACPDUs, F - Device is sending fast LACPDUs, A - Device is in active mode, P - Device is in passive mode
   LAGID: 0
   Mode: Active
   Nic List:
         Local Information:
         Admin Key: 11
         Flags: SA
         Oper Key: 11
         Port Number: 32769
         Port Priority: 255
         Port State: ACT,AGG,SYN,COL,DIST,
         Nic: vmnic1
         Partner Information:
         Age: 00:00:05
         Device ID: aa:bb:cc:dd:ee:ff
         Flags: SA
         Oper Key: 6
         Port Number: 11
         Port Priority: 127
         Port State: ACT,AGG,SYN,COL,DIST,
         State: Bundled

         Local Information:
         Admin Key: 11
         Flags: SA
         Oper Key: 11
         Port Number: 32768
         Port Priority: 255
         Port State: ACT,AGG,SYN,COL,DIST,
         Nic: vmnic0
         Partner Information:
         Age: 00:00:05
         Device ID: aa:bb:cc:dd:ee:ff
         Flags: SA
         Oper Key: 6
         Port Number: 20
         Port Priority: 127
         Port State: ACT,AGG,SYN,COL,DIST,
         State: Bundled

In the example above, the flags are set to SA, which means Slow and Active. The active/passive mode can be changed via the GUI, but the LACP timer cannot. It can be changed via esxcli, as explained shortly.
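
The flags can also be pulled out per NIC with a short script. The sketch below is only an illustration in Python: it assumes the layout shown above (a Local Information block, then the Nic line, then a Partner Information block), uses an abbreviated copy of that output as its sample, and the names parse_status_flags and describe_flags are hypothetical.

```python
# Abbreviated from the status get output above; only the lines the
# parser looks at, plus a few neighbours, are kept.
SAMPLE = """\
dvSwitch_TEST
   DVSwitch: dvSwitch_TEST
   Flags: S - Device is sending Slow LACPDUs, F - Device is sending fast LACPDUs, A - Device is in active mode, P - Device is in passive mode
   LAGID: 0
   Mode: Active
   Nic List:
         Local Information:
         Admin Key: 11
         Flags: SA
         Oper Key: 11
         Nic: vmnic1
         Partner Information:
         Device ID: aa:bb:cc:dd:ee:ff
         Flags: SA
         State: Bundled
"""

# Meaning of each flag letter, taken from the legend in the output.
FLAG_NAMES = {"S": "slow", "F": "fast", "A": "active", "P": "passive"}


def describe_flags(flags):
    """Turn a flag string such as 'SA' into 'slow, active'."""
    return ", ".join(FLAG_NAMES[ch] for ch in flags)


def parse_status_flags(text):
    """Return a list of {'nic', 'local', 'partner'} dicts, one per NIC."""
    entries = []
    current = None   # entry being filled in
    section = None   # "local" or "partner"
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Local Information:":
            current = {"nic": None, "local": None, "partner": None}
            entries.append(current)
            section = "local"
        elif line == "Partner Information:":
            section = "partner"
        elif current is not None and line.startswith("Nic:"):
            current["nic"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Flags:"):
            # The legend line near the top is skipped because no
            # Local Information block has started yet.
            current[section] = line.split(":", 1)[1].strip()
    return entries


entries = parse_status_flags(SAMPLE)
for entry in entries:
    print(entry["nic"], "local:", describe_flags(entry["local"]),
          "| partner:", describe_flags(entry["partner"]))
```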

esxcli network vswitch dvs vmware lacp timeout set

The timeout set command lets you change the LACP timer to either slow or fast. This is a very important element to consider, as it has to match on both sides, i.e. ESXi and the physical switch.

Before going through the timeout set command, let’s take a look at the physical switch configuration. The example below is from Juniper QFabric:

show configuration interfaces ABCD
description ABCD;
mtu 9216;
aggregated-ether-options {
    minimum-links 1;
    link-speed 10g;
    lacp {
        active;
        periodic fast;
    }
}
unit 0 {
    family ethernet-switching {
        port-mode trunk;
        vlan {
            members [ 10 11 12 13 14 15 ];
        }
    }
}

In the example above, LACP is set to fast, which means LACP on the dvSwitch needs to be configured as fast as well. Running esxcli network vswitch dvs vmware lacp status get shows the LACP timer, as described above. By default it is set to slow, so it needs to be changed to fast.
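
The comparison between the two sides can be expressed as a small check. This is a hedged Python sketch rather than a tested tool: it assumes the switch configuration is available as text in the format above, that a missing periodic statement means fast (as on the QFabric in this post), and that the ESXi timer is read from the local Flags value. All function names are made up for illustration.

```python
import re

# The LACP part of the switch configuration shown above.
JUNOS_SNIPPET = """\
aggregated-ether-options {
    minimum-links 1;
    link-speed 10g;
    lacp {
        active;
        periodic fast;
    }
}
"""


def junos_lacp_timer(config_text, default="fast"):
    """Return 'fast' or 'slow' from a 'periodic ...;' statement.

    Assumption: when the statement is absent, the switch uses `default`.
    """
    match = re.search(r"^\s*periodic\s+(fast|slow);", config_text,
                      re.MULTILINE)
    return match.group(1) if match else default


def esxi_lacp_timer(flags):
    """Map the ESXi local Flags value (e.g. 'SA' or 'FA') to a timer."""
    if "F" in flags:
        return "fast"
    if "S" in flags:
        return "slow"
    raise ValueError("no timer flag in %r" % flags)


def timers_match(config_text, esxi_flags):
    return junos_lacp_timer(config_text) == esxi_lacp_timer(esxi_flags)


print("SA matches switch:", timers_match(JUNOS_SNIPPET, "SA"))
print("FA matches switch:", timers_match(JUNOS_SNIPPET, "FA"))
```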

~ # esxcli network vswitch dvs vmware lacp timeout set
Error: Missing required parameter -l|--lag-id
 Missing required parameter -s|--vds
 Missing required parameter -t|--timeout
Usage: esxcli network vswitch dvs vmware lacp timeout set [cmd options]
Description: 
 set Set long/short timeout for vmnics in one LACP LAG
Cmd options:
 -l|--lag-id=<long> The ID of LAG to be configured. (required)
 -n|--nic-name=<str> The nic name. If it is set, then only this vmnic in the lag will be configured.
 -t|--timeout Set long or short timeout: 1 for short timeout and 0 for long timeout. (required)
 -s|--vds=<str> The name of VDS. (required)

The mandatory parameters can be obtained from the status get output. In this case:

  • --lag-id=0
  • --vds=dvSwitch_TEST
  • --timeout=1
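
When the same change has to be made on several hosts, it can help to build the command line from these values instead of typing it. A minimal Python sketch follows; it only assembles the string from the options listed in the usage text above and does not run anything. The function name build_timeout_set is an assumption for illustration.

```python
import shlex


def build_timeout_set(vds, lag_id, fast, nic=None):
    """Build the argument list for 'lacp timeout set'.

    fast=True maps to --timeout=1 (short timeout), fast=False to
    --timeout=0 (long timeout), as described in the usage text.
    """
    argv = [
        "esxcli", "network", "vswitch", "dvs", "vmware", "lacp",
        "timeout", "set",
        "--lag-id=%d" % lag_id,
        "--vds=%s" % vds,
        "--timeout=%d" % (1 if fast else 0),
    ]
    if nic:
        # Optional: restrict the change to a single vmnic in the LAG.
        argv.append("--nic-name=%s" % nic)
    return argv


argv = build_timeout_set("dvSwitch_TEST", 0, fast=True)
command = " ".join(shlex.quote(arg) for arg in argv)
print(command)
```

Returning a list rather than a string keeps the arguments separate, so the same value could be handed to an SSH wrapper without extra quoting work.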

Executing esxcli network vswitch dvs vmware lacp timeout set --lag-id=0 --vds=dvSwitch_TEST --timeout=1 produces no output if it succeeds. Checking the status again, you will see the timer is set to fast:

/var/log # esxcli network vswitch dvs vmware lacp status get

dvSwitch_TEST
   DVSwitch: dvSwitch_TEST
   Flags: S - Device is sending Slow LACPDUs, F - Device is sending fast LACPDUs, A - Device is in active mode, P - Device is in passive mode
   LAGID: 0
   Mode: Active
   Nic List:
         Local Information:
         Admin Key: 11
         Flags: FA
         Oper Key: 11
         Port Number: 32769
         Port Priority: 255
         Port State: ACT,AGG,SYN,COL,DIST,
         Nic: vmnic1
         Partner Information:
         Age: 00:00:05
         Device ID: aa:bb:cc:dd:ee:ff
         Flags: FA
         Oper Key: 6
         Port Number: 11
         Port Priority: 127
         Port State: ACT,AGG,SYN,COL,DIST,
         State: Bundled

         Local Information:
         Admin Key: 11
         Flags: FA
         Oper Key: 11
         Port Number: 32768
         Port Priority: 255
         Port State: ACT,AGG,SYN,COL,DIST,
         Nic: vmnic0
         Partner Information:
         Age: 00:00:05
         Device ID: aa:bb:cc:dd:ee:ff
         Flags: FA
         Oper Key: 6
         Port Number: 20
         Port Priority: 127
         Port State: ACT,AGG,SYN,COL,DIST,
         State: Bundled

Now let’s check that LACPDUs are being sent and received. The example below runs stats get twice, about one second apart; every counter increases by one, so LACPDUs are being received and transmitted once per second.

~ # esxcli network vswitch dvs vmware lacp stats get
DVSwitch      LAGID NIC    Rx Errors Rx LACPDUs Tx Errors Tx LACPDUs
dvSwitch_TEST 0     vmnic1 0         72509      0            72355
dvSwitch_TEST 0     vmnic0 0         120912     0            369634
~ # esxcli network vswitch dvs vmware lacp stats get
DVSwitch      LAGID NIC    Rx Errors Rx LACPDUs Tx Errors Tx LACPDUs
dvSwitch_TEST 0     vmnic1 0         72510      0            72356
dvSwitch_TEST 0     vmnic0 0         120913     0            369635
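
The same comparison can be done in a few lines of Python. This sketch simply subtracts two counter samples; the values are copied from the two runs above, and lacpdu_deltas is an illustrative name.

```python
# (Rx LACPDUs, Tx LACPDUs) per NIC, copied from the two runs above.
first = {"vmnic1": (72509, 72355), "vmnic0": (120912, 369634)}
second = {"vmnic1": (72510, 72356), "vmnic0": (120913, 369635)}


def lacpdu_deltas(before, after):
    """Return {nic: (rx_delta, tx_delta)} between two counter samples."""
    return {
        nic: (after[nic][0] - before[nic][0],
              after[nic][1] - before[nic][1])
        for nic in before
        if nic in after
    }


deltas = lacpdu_deltas(first, second)
for nic, (rx, tx) in sorted(deltas.items()):
    # With the fast timer, roughly one LACPDU per second is expected
    # in each direction.
    print("%s: rx +%d, tx +%d" % (nic, rx, tx))
```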

Wrap-up

In this post, I went through how to use the esxcli command to:

  • Check the status of LACP
  • Check the detailed information
  • Check the stats
  • Configure the LACP timer

I mainly focused on matching the LACP timer because I had an issue with it. By default, VMware uses slow and Juniper QFabric uses fast, which caused network flapping.

Hope this helps 😀

 
