vSphere – vDS port disconnects & confusion


This week I got a call from a colleague on the Microsoft team, who asked me to check a virtual machine that seemed to have some network problems.

The first things I checked were the basics like CPU, RAM, SAN & network utilization. None of those resources was heavily used, so I moved on to the Events tab of the affected VM, which showed the following:

Remove snapshot | VMNAME | Completed | USERNAME | VCENTER.local | DATE 02:15:28 | DATE 02:15:28 | DATE 02:19:12

Actually, the time I was interested in was that of the snapshot removal, not the snapshot creation, but somehow I didn’t pay much attention to it. I also checked the vDS “Events” tab, and some entries made me curious:

The dvPort 10810 link was up in the vSphere Distributed Switch  in DATACENTER Info DATE 02:19:10

The dvPort 10810 was unblocked in the vSphere Distributed Switch  in DATACENTER. Info DATE 02:19:10

The dvPort 10810 was not in passthrough mode in the vSphere Distributed Switch  in DATACENTER. Info DATE 02:19:10

The dvPort 10810 was not in passthrough mode in the vSphere Distributed Switch  in DATACENTER. Info  DATE 02:18:45

The dvPort 10810 link was down in the vSphere Distributed Switch  in DATACENTER Info DATE 02:18:45
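The gap between the "link was down" and "link was up" events can be worked out from the two timestamps. Here is a minimal Python sketch, assuming both events fall on the same day; the function name `gap_seconds` is made up for this example and is not part of any VMware tooling:

```python
from datetime import datetime

def gap_seconds(down, up, fmt="%H:%M:%S"):
    """Seconds between two same-day clock times (assumption: no midnight rollover)."""
    delta = datetime.strptime(up, fmt) - datetime.strptime(down, fmt)
    return delta.total_seconds()

# Times taken from the dvPort 10810 events above
print(gap_seconds("02:18:45", "02:19:10"))  # 25.0
```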

Also the vmkernel.log showed corresponding messages:

*Note: the log timestamps are in UTC (trailing Z), so add one hour to compare them with the event times above.

DATET01:19:10.798Z cpu50:11074)NetPort: 2599: resuming traffic on DV port 10810

DATET01:19:10.798Z cpu50:11074)NetPort: 1237: enabled port 0x200000e with mac 00:50:56:XX:XX:XX
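To line up the UTC log timestamps with the local event times, the one-hour offset can be applied in code. This is only a sketch: the date `2000-01-01` is a placeholder (the real date is redacted in this post), and the fixed UTC+1 offset is an assumption that matches the one-hour difference seen here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is UTC+1, as the one-hour difference in this post suggests
LOCAL_TZ = timezone(timedelta(hours=1))

def to_local(stamp):
    """Convert a UTC timestamp like 2000-01-01T01:19:10.798Z to local time."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(LOCAL_TZ)

# Placeholder date, real time of day from the vmkernel.log line
print(to_local("2000-01-01T01:19:10.798Z").strftime("%H:%M:%S"))  # 02:19:10
```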

My first thought: “25 seconds of network downtime, there must be something wrong!?” But it didn’t take long to find an answer:

According to KB 2011040 and VMware support, all those messages can be safely ignored.

OT: Why do I post log messages which can be ignored?

So what was causing the network issues? In the end the mystery was solved by the virtual machine log:

DATET01:19:10.755Z| vcpu-0| Checkpoint_Unstun: vm stopped for 24733733 us
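The stun duration is logged in microseconds, which is easy to misread at a glance. Here is a small Python sketch that pulls the value out of such a line and converts it to seconds. The regular expression is based only on the line shown above, so other ESXi builds may format the message differently, and the function name `stun_seconds` is made up for this example.

```python
import re

# Pattern derived from the single log line in this post (assumption: format is stable)
UNSTUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")

def stun_seconds(line):
    """Return the stun duration in seconds, or None if the line does not match."""
    match = UNSTUN_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)) / 1_000_000  # microseconds -> seconds

sample = "DATET01:19:10.755Z| vcpu-0| Checkpoint_Unstun: vm stopped for 24733733 us"
print(stun_seconds(sample))  # 24.733733
```

Running the same function over every line of the virtual machine log is a quick way to spot long stuns.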

Sometimes one cannot see the wood for the trees… So the problem was really “just” caused by the snapshot removal and the corresponding virtual machine stun: the VM was stopped for about 24.7 seconds, which matches the roughly 25 seconds between the dvPort link-down and link-up events. The process is basically the same as when creating a virtual machine snapshot: the VM gets stunned for a certain time to freeze I/O and to switch to the delta.vmdk. Unfortunately I missed that the VM already had two snapshots, which unnecessarily increased the snapshot removal duration.

I hope this helps when you see those vDS event messages and start to wonder whether they are the cause of your problem…
