Archive for February, 2015

There is an issue I have noticed with VMware hosts connected to a Nexus vPC: traffic only makes it out of the vPC when you disable half the vPC or get rid of the vPC completely. Initially you’re thinking this is a Cisco issue, and I am here to tell you that you’re wrong.

In the virtual switch port-groups and the VMNIC teaming, there is a load balancing algorithm you can choose. I have seen issues where the VMNIC teaming is set to route based on IP hash but the port-group is set to something else, like route based on originating port ID.

If the vPC-enabled switches have an SVI and a ping to the machine only gets a response from ONE of them, and a ping from a northbound machine outside the vPC (probably your desk) only gets responses when HALF the vPC is down, you need to immediately check the hashing for the vmnics and the port-group.
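To see why the two policies disagree, here is a rough Python sketch of how each one might pick an uplink. This is a simplified model and not the actual ESXi implementation: the function names, the XOR-and-modulo hash, and the port ID value are assumptions made for illustration.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Simplified IP-hash model: XOR the two addresses, then modulo the uplink count."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


def port_id_uplink(virtual_port_id: int, uplink_count: int) -> int:
    """Simplified port-ID model: the VM's virtual port is pinned to one uplink."""
    return virtual_port_id % uplink_count


if __name__ == "__main__":
    vm_ip = "10.0.0.10"   # hypothetical VM address
    vm_port_id = 7        # hypothetical virtual port ID
    for dst in ("10.0.0.1", "10.0.0.2"):
        print(dst,
              "ip-hash uplink:", ip_hash_uplink(vm_ip, dst, 2),
              "port-id uplink:", port_id_uplink(vm_port_id, 2))
```

In this model the IP hash policy spreads the same VM across both uplinks depending on who it is talking to, while the port ID policy keeps it on one uplink no matter what. A port channel on the switch side expects the first behaviour, which is why the two ends have to agree.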

Use the esxtop command (press n for the network view) to review which virtual machines are using which vSwitch and vmnic port to further aid in your troubleshooting.

I would highly suggest you keep it the same at both levels. There may be odd circumstances where mixing these is helpful, but you’re likely trading predictability for perceived performance you’re probably not getting.
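If you have a lot of port-groups to audit, the check itself is trivial once you have the settings exported. The sketch below assumes you have already collected the policies into a plain dictionary. The policy labels, port-group names and the function name are hypothetical, and nothing here talks to a real vSphere API.

```python
from typing import Dict, List


def find_teaming_mismatches(vswitch_policy: str,
                            portgroup_policies: Dict[str, str]) -> List[str]:
    """Return the port-groups whose teaming policy differs from the vSwitch policy."""
    return sorted(
        name
        for name, policy in portgroup_policies.items()
        if policy != vswitch_policy
    )


if __name__ == "__main__":
    # Hypothetical inventory, typed in by hand for illustration.
    portgroups = {
        "VM Network": "iphash",
        "DMZ": "portid",
        "Management": "iphash",
    }
    for name in find_teaming_mismatches("iphash", portgroups):
        print("Port-group overrides the vSwitch teaming policy:", name)
```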

I was in a training class recently and they were speaking about ECMP and how it “converges” if a link goes down. Let me just say this: that is absolutely incorrect, and it is just as bad as saying “I have two class C’s”. It really doesn’t sit well with most people.

With ECMP you’re actually installing multiple routes of the same cost into the routing table, and you’re going to load balance on either a per-packet or a per-flow basis, with per-flow being preferred because per-packet balancing can reorder the packets of a TCP session. Which link a packet or flow lands on is determined by the algorithm used: per-packet is typically round-robin, while per-flow typically hashes header fields such as the source and destination addresses.
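Here is a minimal Python sketch of the two approaches, just to make the difference concrete. The hash function, the 5-tuple layout and the next-hop names are assumptions for illustration. Real routers do this in hardware with their own hash algorithms.

```python
import hashlib
from itertools import cycle
from typing import List, Sequence, Tuple

# Hypothetical flow key: (src_ip, dst_ip, protocol, src_port, dst_port)
Flow = Tuple[str, str, str, int, int]


def per_flow_next_hop(flow: Flow, next_hops: Sequence[str]) -> str:
    """Hash the flow key so every packet of one flow uses the same next hop."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


def per_packet_next_hops(packet_count: int, next_hops: Sequence[str]) -> List[str]:
    """Round-robin: consecutive packets rotate through the next hops."""
    rotation = cycle(next_hops)
    return [next(rotation) for _ in range(packet_count)]


if __name__ == "__main__":
    hops = ["link-A", "link-B"]
    flow = ("10.0.0.10", "192.0.2.20", "tcp", 49152, 443)
    print([per_flow_next_hop(flow, hops) for _ in range(4)])  # same link every time
    print(per_packet_next_hops(4, hops))                      # alternates links
```

Per-flow keeps a TCP session on one path, so its segments arrive in order. Per-packet spreads load more evenly but can deliver segments out of order when the paths differ in latency.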

Please understand, ECMP doesn’t mean the links are of EQUAL bandwidth and latency; they’re only “equal” from a metric cost perspective. When a link goes down there is absolutely no convergence taking place. The packets or flows just get routed out of one of the other available equal-cost links. Please stop saying they’re “converging”, because that makes most people think there is either a dynamic computation taking place with a dynamic routing protocol or the router itself is having to install a new route into the FIB from the RIB.
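A toy model of that behaviour, as a sketch only: the class name, the CRC32 hash and the interface names are made up for illustration. The point is that losing a link just shrinks the set of next hops for a route that is already installed, with no path computation anywhere.

```python
import zlib
from typing import List, Sequence, Tuple

Flow = Tuple[str, str, int, int]  # hypothetical key: src_ip, dst_ip, src_port, dst_port


class EcmpRoute:
    """One prefix with several equal-cost next hops (toy model)."""

    def __init__(self, prefix: str, next_hops: Sequence[str]) -> None:
        self.prefix = prefix
        self.next_hops: List[str] = list(next_hops)

    def link_down(self, next_hop: str) -> None:
        # No route recalculation: the failed next hop is simply dropped from the set.
        self.next_hops.remove(next_hop)

    def forward(self, flow: Flow) -> str:
        if not self.next_hops:
            raise LookupError("no next hop left for " + self.prefix)
        index = zlib.crc32(repr(flow).encode()) % len(self.next_hops)
        return self.next_hops[index]


if __name__ == "__main__":
    route = EcmpRoute("10.1.0.0/16", ["eth1", "eth2", "eth3"])
    flow = ("10.0.0.10", "10.1.5.5", 49152, 443)
    before = route.forward(flow)
    route.link_down(before)          # the link this flow was using fails
    after = route.forward(flow)      # same route, one of the remaining links
    print(before, "->", after)
```

One caveat with a plain modulo like this: flows that were on the surviving links can also move when the set shrinks. Some platforms offer resilient hashing to limit that, but either way nothing is being recomputed by a routing protocol.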