On Wednesday, I wrote about a VMware vSphere 5 networking bug that caused issues with iSCSI networking. That bug, described in VMware KB 2008144, caused vmkernel traffic to be sent over the unused vmnic uplink in a team with an explicit failover policy. See the diagram below to better understand what was going on there….
The second vSphere 5 networking bug I experienced was similar to the first: traffic was sent out of an unexpected interface after upgrading to ESXi 5. This particular bug surfaced while I was troubleshooting my iSCSI bug (because why not have two unrelated bugs at the same time). Many of the troubleshooting steps I used on the first networking bug were employed on this one, so I won’t bore you again with the details. I will, however, give you a quick overview of the network setup in which this issue appeared.
Here’s layer 1 connectivity for ESXi host vmnics to the switching stack.
Here’s the ESXi network config:
The specific portion of the configuration impacted by this bug was vSwitch0, which contained vmnic0 and vmnic1, my Management Network vmknic, and my vMotion vmk port group. The Management and vMotion port groups had a manually set failover order, as pictured below:
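For reference, a failover order like this can be inspected and set from the ESXi shell with esxcli. This is a sketch only; the port group and vmnic names below come from my setup, so adjust for yours:

```shell
# Show the current teaming/failover policy for vSwitch0 (ESXi 5.x esxcli namespace)
esxcli network vswitch standard policy failover get -v vSwitch0

# Pin Management to vmnic0 (active) with vmnic1 as standby,
# and vMotion to vmnic1 (active) with vmnic0 as standby
esxcli network vswitch standard portgroup policy failover set \
    -p "Management Network" -a vmnic0 -s vmnic1
esxcli network vswitch standard portgroup policy failover set \
    -p "vMotion" -a vmnic1 -s vmnic0
```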
This is all pretty standard network configuration for a VMware ESXi host with 6 physical network adapters, and follows best practice for management network redundancy for VMware HA (I highly recommend reading more on HA best practices in Duncan Epping and Frank Denneman’s VMware vSphere 5 Clustering Technical Deepdive book).
The bug manifested as the following problems:
- ESXi hosts would intermittently fall out of manageability by vCenter, the vSphere Client, and SSH (which was enabled from the console of the hosts). Management connectivity could be restored (most times) by restarting the ESXi Management Network from the console. I could usually ping the management network IP address even though the host was not manageable.
- ESXi syslogs stopped being sent to the vCenter Syslog collector.
- vMotion between hosts in the cluster intermittently worked. vMotion success was not always in sync with management network connectivity. vMotion capabilities could be restored by restarting the ESXi Management Network from the console.
- As an added bonus, VMware High Availability (HA) would sometimes detect host failures and restart VMs on the surviving HA nodes.
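When a host fell out of manageability, restarting the Management Network from the console (DCUI) was the recovery step. If you still have shell access, the rough CLI equivalent is to bounce the management vmknic. A sketch, assuming vmk0 is your management interface:

```shell
# List vmkernel interfaces to confirm which vmk carries management traffic
esxcli network ip interface list

# Disable and re-enable the management vmknic (triggers the same kind of
# link up/down event as "Restart Management Network" in the DCUI)
esxcli network ip interface set -e false -i vmk0
esxcli network ip interface set -e true -i vmk0
```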
Notice my use of ‘intermittently’, ‘usually’, and ‘sometimes’ – this made for tough troubleshooting. If you’re gonna fail, fail big. None of this wimpy on-again, off-again nonsense.
Luckily, I had VMware support on the phone as this problem appeared. The support tech seemed to know just what the problem was:
A known issue on ESXi 5 occurs when two or more vmkernel NICs (vmknics) are on the same standard vSwitch. Under this configuration, traffic may be sent out the incorrect vmknic.
As far as I am aware, there is no VMware Knowledgebase article for this issue yet (comment if you know of one), so details are based on my own conversation with the support engineers working the case. From what I was able to infer, this bug appears:
- More often when ESXi hosts are under stress (my iSCSI networking bug really stressed out the hosts – and me – when all paths were down)
- Seems to happen more on Broadcom NICs than Intel
- Triggered and/or fixed by a network up/down event (such as restarting the management network on the host).
- Does NOT happen with a Distributed Virtual Switch.
- Is scheduled to be patched with or after vSphere 5 Update 1.
Whereas my iSCSI bug involved a vmnic team with an unused uplink and traffic being sent out the wrong vmnic, this second bug occurred with two vmnics (one active, one standby) and two vmk ports on a standard switch, with traffic being sent out the wrong vmk port – which happened to have a different active vmnic than the correct vmk port. Here’s a diagram of the traffic flow gone wrong:
I still find it a bit odd that ICMP traffic continued to flow to the interface, but the Management traffic took an alternate route out and landed on my non-routed vMotion VLAN (and a different subnet).
The workaround for this bug is simple – remove the second NIC and the second VMkernel port (vMotion, in my case) from the vSwitch and restart the ESXi Management Network. Once this was done, management traffic flowed normally.
I then created a new vSwitch, attached the second vmnic to it, and then re-created the VMkernel port for vMotion.
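If you prefer the command line, the same workaround steps look roughly like this. A sketch only – the vSwitch name, vmk numbers, and the vMotion IP below are assumptions from my environment:

```shell
# 1. Pull the vMotion vmknic and the second uplink off vSwitch0
esxcli network ip interface remove -i vmk1
esxcli network vswitch standard uplink remove -v vSwitch0 -u vmnic1

# 2. Bounce the management vmknic (or restart the Management Network via DCUI)
esxcli network ip interface set -e false -i vmk0
esxcli network ip interface set -e true -i vmk0

# 3. Recreate vMotion on its own vSwitch
esxcli network vswitch standard add -v vSwitch3
esxcli network vswitch standard uplink add -v vSwitch3 -u vmnic1
esxcli network vswitch standard portgroup add -v vSwitch3 -p vMotion
esxcli network ip interface add -i vmk1 -p vMotion
esxcli network ip interface ipv4 set -i vmk1 -t static -I 10.0.1.11 -N 255.255.255.0

# 4. Re-enable vMotion on the new vmknic
vim-cmd hostsvc/vmotion/vnic_set vmk1
```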
While the workaround was great for getting my hosts back into manageability, it was not so great for the redundant architecture I had originally implemented. After splitting the VMkernel ports onto two different vSwitches, I received warnings in vCenter that “Host currently has no management network redundancy.” KB 1004700 addresses this message if you are looking for more info on it. I could disable the warning, but that would be like slapping a fresh coat of paint on a jalopy.
The workaround for this bug kills redundancy. Redundancy could be restored by adding another two physical NICs and, in my case, binding one to the Management vSwitch and one to the vMotion vSwitch. This change would require host downtime to install new hardware if your host only has 6 NICs, as this environment did.
Alternatively, you could migrate your Management network and vMotion networks to a virtual Distributed Switch (vDS) as this bug does not appear to impact vDS – only standard vSwitches. Side note: Check Duncan Epping’s post on using a virtual vCenter server connected to a vDS if that’s holding you back from going to a vDS. Also read the new vDS Best Practices whitepaper from VMware.
This bug could impact more configurations than the one I highlighted. For example, I could see it causing issues with Multiple-NIC vMotion in vSphere 5.
Drop a comment if you have experienced this bug, know of a KB article, or can think of any other ways it might be manifested.
Update (Nov 8 2012): I received an email from Nick Eggleston about this issue – Nick experienced this problem at a customer site and heard from VMware support that “There is a vswitch failback fix which may be relevant and has not yet shipped in ESXi 5.0 (it postdates 5.0 Update 1 and is currently targeted for fixing in Update 2). However this fix *has* already shipped in the ESXi 5.1 GA release.” Thanks, Nick!
Avram Woroch says
Thank god, this means I’m *not* crazy (well, not for this reason anyway). I’m having similar issues in my home lab, with vSwitch0 carrying Management and vMotion, and vSwitch1 carrying iSCSI. Having all sorts of issues, regardless of what kind of software SAN I’m trying.
After reading this I could only think, what a horrible example to publish; the architecture in place is incorrect to begin with, so it doesn’t really properly demonstrate the bug. Your illustration shows vmnic0 and vmnic1 wired to disparate switches, an iSCSI storage switch stack and a prod switch??? which is very strange, but let’s not drill into this piece too much, although it should have made everyone start scratching their heads.
Then – you have bonded two vmnics into a team where one can fail over to the other, yet have configured the upstream ports to point to different VLANs… “Wrong VLAN Wrong switch WRONG WRONG WRONG” you state; of course, you configured it wrong. Did you notice the observable VLAN ranges are different for the vmnics that you have bonded?
Your illustration should show that when traffic fails over to a different vmnic it can still reach the desired subnets, at least in a valid configuration. You can’t put vmnics in a team and expect the upstream switch ports they wire to to be configured differently with different VLANs. You should be able to fail over to any vmnic in the team and have traffic flow upstream to trunked ports that have the same VLANs configured.
Josh Townsend says
Thanks for the input, hater. Hopefully your comments help readers better design their own environments so they don’t end up the way I found this environment when the customer called for help.
Alastair Cooke says
Are the three physical switches interconnected?
If not then your problem is that your standard vSwitch uplinks don’t have Layer 2 adjacency. Layer 2 adjacency on all uplinks is a requirement for standard vSwitch behavior.
I would have your three vSwitches match up to your three physical switches; then you get proper failover between NICs to provide redundancy.
I think this affects any two-NIC / two-vmkernel-port setup. I’ve been setting up for a new test run, and with the 10Gb links (actually UCS vNICs) I have, separating NFS vmkernel ports from management vmkernel ports wasn’t a priority. I’ve been seeing random host disconnects as well as NFS datastore disconnects. I moved the vmkernel ports to separate vSwitches and it appears the problem has been solved.
I have another 50 hosts to set up for my testing, so I should know tomorrow if everything is actually fixed. If it is, I owe you a few rounds should we ever meet up.
Given that I have wasted about 3 days on this your blog post is welcome.
Joshua Townsend says
Jason – glad it helped. If you can recreate the problem, call VMware. Maybe we can get a KB written on this before it impacts more people.
Wow. I’m having the exact same problem. Thanks for the update.
I’m opening this up to our TAM to see if we can get more pressure on them to get a fix (or at least a KB).
Jim O'Boyle says
Hi, Was just checking back to see if there was any more information about this and in doing further research, tripped across KB 2008144 which covers both the iSCSI and management network problem. That was updated on 3/17 to state vSphere 5.0 Update 1 fixes this and is now available, fyi…
I ran into the same failure that you described here (tx!) during network configuration.
After a few tests, the problem does not occur anymore if I leave the gateway for the newly configured vmkernel port (vMotion) blank (so it uses the GW from the management network). This is like the configuration in ESX(i) 4.
Brandon Neill says
There is only one gateway configured for the vmkernel. The gateway field that you see each time you configure a vmkernel interface is always the same gateway; if you change it in one screen, it changes in all other screens. The gateway should always be configured as an IP on your management network, not on the iSCSI, NFS, FT, or vMotion interfaces.
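Brandon’s point about the single vmkernel gateway can be confirmed from the ESXi 5.0 shell with esxcfg-route. A sketch, with the gateway address below being an example value:

```shell
# Show the vmkernel routing table and default gateway (one per vmkernel stack)
esxcfg-route -l

# Set the default gateway – this applies to all vmk interfaces,
# so it should be an address on the management network
esxcfg-route 192.168.1.1
```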
Same issue here using 2x vMotion.
It seems to be fixed by removing the two vMotion ports, restarting management, and recreating vMotion.
I couldn’t find any KB around this at vmware.com.
It is not clear from the Update 1 release notes whether it fixed the second problem you found, or ONLY the “unused adapter” problem.
Can anybody clarify this, as we have some strange problems with Active/Standby on the vmotion/ management network?
Dustin Lema says
The problem still exists in U1. I’ve just duplicated this behavior.
If anyone has a fix (as opposed to the dVS workaround) I think we’d all appreciate the answer.
I have seen similar behavior on v126.96.36.199581 also using iSCSI with port bindings.
I think right now the option is to change to vDS.
Can anyone confirm that this behavior affects also Active/Standby configurations?
We are experiencing this same problem and we are on Update 1. I’m also having problems with iSCSI losing access to its datastores; however, I’m not using vmnics – I have QLogic HBA cards, and they are not configured under the networking section, nor are they set up in an active/standby config. We are using EMC PowerPath/VE version 5.7 to connect to our EMC SAN, and we have intermittent drops of datastores, dead paths, etc.

We have exhausted all options: we engaged Cisco for the switch side, EMC for the SAN side, and Dell for the server/VMware OS support, and because EMC and Dell see drops to the network they are blaming Cisco. However, Cisco sees no problem on our 3750X switches, which we use for our isolated iSCSI connectivity.

We are experiencing the host disconnects as a result of using VMware’s best practices for management and vMotion networks in an active/standby config using 2 NICs. I’m going to remove them and set them up on their own vSwitches with dedicated vmnics for each, but then I lose redundancy. Again, we have Update 1 on 90% of our hosts, and an even higher level than that on our latest servers: ESXi 5.0, build 702118. Does anybody have any KBs or any updates from VMware regarding this? Every vendor is pointing the finger at the other vendor, and I’m not getting any headway on my iSCSI problem. My QLogic cards are QLE4062C, running the latest VMware/EMC certified drivers and the latest QLogic firmware. PLEASE HELP!
Joshua Townsend says
Austin – I haven’t heard an update from VMware on this. I’ll reach out and see if I can get some updated information. Stay tuned!
Did you find anything about the datastore disconnections? We have a similar bug, but using Native Multipathing with an EMC CLARiiON. Everyone blames the others, and my configuration seems to be good.
Joshua Townsend says
Datastore disconnects could be due to a number of other factors – for example, older firmware on QLogic and Emulex HBAs can cause storage drops. iSCSI switches with buffer overruns can cause similar situations. I recommend pulling up the vmkernel log file and analyzing it for clues.
Kyle Wallace says
If you get a KB from VMware on this, please let me know. Would really like to know what update it will be fixed in.
Joshua Townsend says
I asked @VMwareCares on Twitter:
Josh Townsend @joshuatownsend 2 Aug
@VMwareCares Any ideas on this problem I blogged about – I’m still hearing from people that it’s an issue: https://bit.ly/NOaZZV . Thanks!
VMware Cares @VMwareCares 3 Aug
@joshuatownsend We’re still looking into this. We’ll post a KB article / patch when we can.
We’re still in a holding pattern…… I’ll post more when there are updates. –Josh
Joshua. Have you configured a trunk carrying the VLANs for management and vMotion over both vmnic0 and vmnic1? What happens when vmnic0 (or the physical switch connected to vmnic0) dies? Is the management network available on vmnic1? I don’t know for sure, but those look like just access ports with a single VLAN configured on each uplink.
In any case, try trunking those uplinks at the physical switch(es) and pass 802.1q-tagged VLANs over vmnic0/vmnic1. I’ve always done this and have the same basic setup as you, with Management and vMotion on one vSwitch in an active/standby config.
I believe I am seeing this with v5.0 update 1 as well but with NFS data stores.
I have a Management Network vmkernel and a NFS access vmkernel setup on two active/active physical nics. These are the only nics in the host.
Cloning/storage vMotion of a 4GB machine takes up to an hour with both vmkernels set up. If I remove the Management Network vmkernel and have vMotion, management, and NFS traffic go through just the one vmkernel, the clone/vMotion takes like 2-4 minutes.
Janåke Rönnblom says
Any news about this?
Joshua Townsend says
No news yet – this post still gets a ton of hits and my colleagues and I at Clearpath are still running into the issue. I’m hoping to get some lab time on vSphere 5.1 to see if I can reproduce the issue there….
Michal Rasinski says
I have some strange problems with Active/Standby on the vMotion/management network, too. One host works OK, and the other does not. I’ve tried different things, and when I changed the number of ports on the vSwitch to 56, the connection to management was OK. I will test it and give feedback.
Michal Rasinski says
I’ve checked, and it helps only for a moment. Still, when I add a physical adapter to the vMotion/management vSwitch, a connection to management can’t be established.
Nick Eggleston says
This issue is supposed to be resolved in 5.0 update 2 (forthcoming) and 5.1 (released). Can anyone test and post results?
Rene Rodriguez says
I have found that moving the standby adapter to unused resolves the issue with multi-NIC vMotion. Every time we tried active/standby on one vSwitch with two vmkernels for vMotion and two physical NICs in active/standby, only one NIC would do the vMotion work. After I moved the standby adapter to unused on each vmk, vMotion began using both links.
I’m on ESXi 5.0 Update 1.
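[Josh’s note: if you want to script the change Rene describes, listing only an active uplink in a port group’s failover policy should leave the unlisted uplink as unused rather than standby. A sketch, with my assumed multi-NIC vMotion port group names vMotion-1/vMotion-2:]

```shell
# Uplinks not listed as active (-a) or standby (-s) become "unused" for that port group
esxcli network vswitch standard portgroup policy failover set -p vMotion-1 -a vmnic0
esxcli network vswitch standard portgroup policy failover set -p vMotion-2 -a vmnic1
```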
Nick Eggleston says
Can you try the same on ESX 5.1? The underlying bug is supposed to be fixed in that release.
Anders Kongsted says
As far as I can see, the problem is there for ESXi.
Can anyone confirm or reject that?
Joshua Townsend says
Correct – this is an ESXi issue.
Kiriki Delany says
Has anyone seen whether this is possibly related to duplicate MAC addresses in a cluster?
We are seeing issues that could cause this behaviour due to duplicate MACs.
Yes we have a similar problem in ESX 5.1.
We have vSwitch0 with vmnic0 and vmnic1 on the same network, with a vmkernel management port and a vmkernel vMotion port.
When we migrated a vm using vmotion it would flood the other ports and cause problems within the production environment.
The workaround was to split vSwitch0 and create a separate vSwitch just for vMotion, and this was fine. The only downside was no redundancy for either management or vMotion.
We are thinking of creating a trunk as forbsy says.
The symptom I have, which may or may not be related, is that two VMs on the same VLAN had the SAME IP address and did not complain – no IP conflict warnings. My application just updated two different databases: the new one and the old one.
Fred C says
I have had a similar issue with a Flexible NIC on a Windows 2003 server that lost its connectivity during a Storage vMotion. I found that the vmxnet3 NIC did not suffer from the same problems. VMware will not acknowledge nor support the Flexible NIC, unfortunately, since it is deprecated in vSphere 5.1 U1. So stay away from Flexible, since its implementation is broken.
Josh Townsend says
Thanks Fred! Great info and recommendation.
Josh, keep up the good contribution to the web community. Hater, I think your comment is disgraceful! Why did you even comment?
Josh Townsend says
Thanks JD. No worries – I’ve got thick skin. Ken (hater) has some good points – could just work on his delivery.
Like the kids say, haters gonna be haters.