For the past two days at work I have been attempting to set up more advanced networking for my test (almost production) VMware ESX server. The server hardware I am using (an IBM 3550) has two onboard gigabit NICs. With a current Internet connection of 3 Mbps and an upgrade to 10 Mbps coming in the spring, I doubt I will ever reach the 1 Gbps limit of even one of those NICs, but I am planning on running most if not all of my infrastructure in a virtual environment, so I need fail-over and redundancy. That meant I needed two things: NIC Teaming and Port Trunking.
NIC Teaming
For those of you who don’t know what I mean when I talk about NIC teaming, it’s taking two Network Interface Cards (NICs) and joining or bonding them together so traffic can come and go on either card/port and still reach its destination. On the ESX side this was very easy: I simply added the second NIC to the virtual switch (switch0) where my first NIC already was.
The Cisco side was a bit more tricky, so I googled for help. Scott Lowe turned out to have a wealth of knowledge on the subject, and I followed his prescription:
s3#conf t
s3(config)#int port-channel1
s3(config-if)#description NIC team for ESX server
s3(config-if)#int gi0/23
s3(config-if)#channel-group 1 mode on
s3(config-if)#int gi0/24
s3(config-if)#channel-group 1 mode on
s3(config-if)#exit
s3(config)#port-channel load-balance src-dst-ip
And just like that my NICs were teamed. I tested this with a continual ping (ping -t in Windows, plain ping in Linux; see why I like Linux so much better?) while unplugging one network cable, plugging it back in, and then unplugging the other. The team gives me up to 2 Gbps of aggregate bandwidth to and from my ESX server (with src-dst-ip load balancing, any single source/destination pair still rides on one link), and it also lets me replace network cables or stay up through a NIC failure.
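If you would rather confirm the team from the switch side than by pulling cables, something like the following should do it. This is a sketch using standard IOS show commands, not part of Scott's write-up, and the exact output varies with switch model and IOS version.

```
! list every port-channel and its member ports
s3#show etherchannel summary
! show the state and counters of the logical interface itself
s3#show interfaces port-channel 1
```

In the summary, both gi0/23 and gi0/24 should be listed as members of Po1.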
Port Trunking
This is the more difficult one, and I knew it going in, but I needed it, badly. Why? Well, I have to explain VLANs to you first. Before VLANs (virtual LANs), every computer connected to a switch was on the same network and could see the broadcast traffic of everything else on that switch. This meant you needed to buy a switch for each network segment you had. With VLANs you are able to segment ports/servers/machines into separate networks and prevent their traffic from being seen by other networks on the same physical switch. I have one for my storage network, one for my internal network, one for my DMZ network, and I’ll have a couple more by the time this deployment is done.
So that’s a VLAN, and I had those three set up already, but what does this have to do with trunking or the ESX server? Well, I want to have a firewall as a virtual machine on this ESX server, so I need access to all VLANs from that machine. In the physical world this would be done simply by adding NICs and connecting them to the different VLANs. While I can add as many NICs to the virtual machine as I want, I cannot plug them into a VLAN that the ESX server doesn’t have access to. This is where trunking comes into play.
Trunking allows all VLAN traffic (or just the VLANs you specify) to go down a given port, or in my case a port-channel, which is a teamed pair of ports. I followed Scott’s method, but this is where I ran into trouble and why I’m posting this entry. I could not find the encapsulation command: not in config mode, not in config-if mode, nowhere. I later learned that this was because my switch was already configured for dot1q encapsulation. So I had the trunk established, but I still could not get traffic to pass over it. I dug and dug, and then I found it.
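For reference, a trunk on the port-channel generally looks something like the sketch below. This is not a copy of Scott's commands. The allowed VLAN list is an assumption based on the VLANs mentioned in this post, and the encapsulation line only exists on switches that support more than one trunk encapsulation, which would explain why I could not find it.

```
s3#conf t
s3(config)#int port-channel1
! only accepted on switches that support both ISL and 802.1Q
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport mode trunk
! optional: restrict the trunk to specific VLANs (assumed list, adjust to taste)
s3(config-if)#switchport trunk allowed vlan 2,3
s3(config-if)#end
! verify: lists trunk ports, their native VLAN and the VLANs allowed on each
s3#show interfaces trunk
```

The native VLAN column in that last command is worth a look, since the native VLAN is the one whose traffic crosses the trunk untagged.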
The bigger problem was VLAN1. I had inherited the network structure, and it did not follow security standards. My now friend who set this network up did not know at the time that putting all traffic on VLAN1 was a mistake. Blame aside, I knew we were operating on VLAN1 and I knew we shouldn’t be, but I never changed it myself. Until yesterday. I determined that VLAN1, the default native VLAN on a trunk, does not get tagged as being part of a VLAN, and I found no way to tell my switch to tag VLAN1 traffic with a VLAN1 tag. So when I had the trunk set up, traffic was arriving at the ESX server marked as VLAN-less, VLAN2 and VLAN3, and all of the VLAN1 traffic (read by ESX as VLAN-less) was dropped. This is by design, and it’s a good thing. But how do you switch from VLAN1 to VLAN4 and not lose connectivity?
I have three switches, and I was personally connected to switch 2. That meant that once I ran the command to move all ports on switch 2 to VLAN4, I could no longer communicate with VLAN1, which is where all three of my switches have their management IP addresses. So I ran the following sequence of commands on switch 3, then switch 1, then switch 2. When complete, all traffic that had been on VLAN1 would be on VLAN4. The only thing missing would be management of the switches (more on this in another post when I figure this one out!).
s3#conf t
s3(config)#int range gi0/1 - 10 , gi0/16 , gi0/20 , po1
s3(config-if-range)#switchport mode access
s3(config-if-range)#switchport access vlan 4
s3(config-if-range)#exit
s3(config)#exit
s3#sh run int gi0/1 switchport
I forgot to mention the range command, which allows you to issue commands to multiple ports at once. If you want to do all FastEthernet ports on a 24 port switch: int range fa0/1 - 24. Or, as in the case above, for a range plus a couple of specific ports, put a comma in between. Also note that I included po1, which is the port-channel created for the NIC teaming. In the end I removed that setting from po1 so that all VLAN traffic (except VLAN1) was being sent down that port.
Wow, this turned out to be a longer post than I thought it would be. When I finished the above I still struggled with management of the switches, and I’m still working on that. So far I’ve given an IP address to VLAN4 on each switch. Only one is consistently responding on that IP address; the others seem to fail the majority of the time. But hey, that’s why we have console cables, right?
When I figure that out I’ll post it here. If you have suggestions, post them in the comments.
Oh, and yes, now my virtual server on the ESX server has access to VLAN2, VLAN3, and VLAN4. What a great end to the year. Success!
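For anyone following along, giving a switch a management address on VLAN4 looks roughly like the sketch below. The addresses are placeholders from the documentation range, not my real ones, and this only shows the general shape of the config; it is not a fix for the flaky behaviour described above.

```
s3#conf t
! create the VLAN interface that will carry management traffic
s3(config)#int vlan 4
! placeholder address and mask (assumption)
s3(config-if)#ip address 192.0.2.13 255.255.255.0
s3(config-if)#no shutdown
s3(config-if)#exit
! placeholder gateway so the switch can answer hosts on other subnets
s3(config)#ip default-gateway 192.0.2.1
s3(config)#end
! verify: the Vlan4 line should show the address with status and protocol up
s3#show ip interface brief
```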