Cisco Switched Internetworks

Chris Lewis


Chapter 2

Switched VLANs

Objectives

In this chapter, we'll look at VLANs in more depth. We'll look at the differences between switched VLANs in ethernet and token ring environments, and at trunking VLANs between switches using FDDI and ISL (trunking over ATM is covered later). We'll cover VMPS, VTP, and customizing Spanning Tree, examine how full duplex can be incorporated into a network, and look at the role that CGMP plays in optimizing multicast capabilities. This chapter forms the theoretical background for the router and switch configuration used in later chapters.

VLAN Environments

Several concepts and technologies have come together to produce the switched VLAN environments we see today. In this section we'll look at multi-VLAN ethernet switches, full duplex ethernet, and token ring in a switched environment.

Multi-VLAN Ethernet Switches

In Chapter 1 we saw that each interface on a switch was a collision domain, and each VLAN defined within the switch is a broadcast domain. The question arises of how a switch identifies a packet as belonging to one VLAN or another. The answer is VLAN tagging. Internal to a switch, all packets coming in on an interface are tagged with the VLAN ID of that interface. This VLAN ID is then used within the switch to associate this packet with the VLAN in question. This is clearly of most benefit to incoming broadcast packets that will be confined to the switch interfaces associated with that VLAN. The VLAN ID added to the packet is removed prior to the packet being sent out of the switch to an end station. The only time a packet leaves a switch with the VLAN ID still appended is when the packet is traversing an inter-switch trunk, as shown in figure 2-1.
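To make the tagging behavior concrete, here is a minimal Python sketch of the ingress-tag, egress-strip logic just described: broadcasts are flooded only to interfaces in the same VLAN, the tag is removed before delivery to end stations, and it is kept when the frame crosses an inter-switch trunk. The port numbers and VLAN IDs are hypothetical.

    port_vlan = {1: 10, 2: 10, 3: 20}    # access interfaces and their VLAN IDs
    trunk_ports = {4}                    # inter-switch trunk: the tag travels onward

    def transmit(port, data):
        print(f"port {port}: {data!r}")

    def forward_broadcast(in_port: int, frame: bytes):
        vlan = port_vlan[in_port]                  # tag with the ingress interface's VLAN ID
        for port, v in port_vlan.items():
            if port != in_port and v == vlan:
                transmit(port, frame)              # tag stripped before reaching end stations
        for trunk in trunk_ports:
            transmit(trunk, (vlan, frame))         # tag kept across the trunk (figure 2-1)

    forward_broadcast(1, b"ARP request")           # floods to port 2 and the trunk only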

The ability of VLANs to segment traffic, localize collisions and control broadcasts has been discussed. VLANs can also improve security, in that packets sent on the network are not visible to everyone, as is the case in a shared network. It is simple to create a VLAN for users with high security needs; the packets they send on their VLAN will then not appear on any other segments, making it impossible for other users to capture and decipher their data. With switched ethernet leading to instances where a switch interface has only one end station connected to it, changing the half duplex nature of ethernet becomes possible. Of course, implementing full duplex only works for interfaces that are dedicated to one end station, but if that end station is a server of some kind, significant benefits can arise.

Full Duplex Ethernet

Full duplex ethernet connections use point to point cabling, such as you might use between switches or between a switch interface and a single server; no hubs are involved. Full duplex by definition means that both ends of the point to point link can transmit at the same time, so collisions are completely eliminated on a full duplex connection. Full duplex connections can use 10BaseT, 100BaseTX, 100BaseFX and ATM as the point to point link, as illustrated in figure 2-2.

Within a standard half duplex ethernet NIC (Network Interface Card), only the transmit or receive circuitry will be active at any one time. For example, when a station is not actively transmitting data on to the network, the receive circuitry is active, performing the carrier sense function of ethernet. Effectively, the transmit and receive circuitry share the same logical network cable and therefore cannot operate independently of each other.

In a full duplex ethernet NIC, there is no collision detect circuitry, and the cabling provides a direct path from the transmit circuitry of the NIC at one end of the point to point connection to the receive circuitry of the NIC at the other end.

With this arrangement, as shown in figure 2-3, there is no multiple access possible on the same media as in half duplex ethernet. Half duplex normally starts to reach its limit at around 50% utilization. Full duplex can happily operate at 100% utilization in both directions without significant performance implications.

The theory of switching ethernet packets, using full duplex ethernet and incorporating VLANs into ethernet networks should be clearly understood by now. Later chapters will cover how to configure these features on Cisco hardware. Switched token ring and VLANs for token ring are a whole other story and are covered next.

Switched Token Ring

Token ring refuses to die. Despite all the advances in ethernet technology and ethernet's leadership in market share, there are still a significant number of token ring installations. Indeed, with new advances like switched token ring and token ring at 100 Mbit/sec, along with some of the inherent benefits of token ring technology, like its ability to set different priorities, it looks like token ring is here to stay. At least for the immediate future. Let's just review some of the basics of traditional token ring operation before we look at the newer features.

Traditional Token Ring Operation

Token ring is based on the IEEE 802.5 specification. Although the physical appearance of token ring wiring is that of a star, the signal path is a ring, as illustrated in figure 2-4.

Priority values 0-3 are set by the user, 4 is for bridge transmissions, 5 is for non-real time multimedia, 6 is for real time multimedia, and 7 is for critical MAC frames. Switches typically use two priority queues, one high and one low. Commonly, priorities 0-4 are assigned to the low queue and 5-6 to the high queue. Priority 7 frames use the system queue, which has the highest priority.

On each ring, a continual process of LAN Monitor election is in effect. Essentially, every few seconds, all stations on a ring will try to determine who should be the LAN Monitor. The LAN Monitor makes sure that the token is intact, that there are no duplicate tokens, and performs general housekeeping tasks. The token itself is a packet containing a special bit pattern that each workstation needs to receive before it can transmit on to the network. By this method of passing the token around the ring, token ring networks eliminate collisions. As the token traverses the ring, each station repeats the data, performs error checking and, if the station is the destination, passes the data to the higher layer protocols within the workstation. When the token gets back to the originating station, the data is removed from the ring and the token is passed on to the next station in line.

As the token is necessary to send data on the network, it is possible to alter an individual station's priority so that it gets the token more often than others.

Token ring networks were extended by the use of source route bridges in much the same way that ethernet networks were extended by the use of transparent bridges. Source route bridges, however, appear to other end stations as a standard end station on the ring (as opposed to transparent bridges, which are invisible to end stations). Source Route Bridges (SRBs) forward packets destined for other rings and copy packets from other rings on to the local ring. SRBs end up having limitations similar to those of transparent bridges in the ethernet world, and experience difficulty when scaling to larger environments and dealing with multimedia applications. We'll now take a look at how token ring handles source routing, and particularly the use of explorer packets to determine routes. This is necessary to understand the benefits of the switched token ring environment and will be useful in later chapters when we discuss ATM's routing, which is based on a form of source routing.

Source Routing

The goal of source routing is to generate a packet header containing a route that is inserted by the end station. That route consists of a sequential list of bridges and LAN segments that form the path from source to destination. The traditional mechanism by which the source learns the route to the destination is the all routes explorer packet. Assume a source end station wishes to send a packet to a remote destination MAC address (i.e., one located on another segment) of unknown location. First a local test frame is sent on the local segment, which will be returned indicating that the destination address is unrecognized. Then the source end station will send out a single all routes explorer packet that will be replicated at each potential route choice, on every possible path. These explorer packets keep a diary of their travels and, when they reach the destination station, they are returned to the source. The first explorer packet to return to the source end station has its route information selected, and a cache of discovered routes is kept in the end station for future reference.

To insert this information in a packet header obviously requires additional fields that are not used when a packet is sent to an end station on the local segment. The indicator that marks whether this additional information is present is the Routing Information Identifier (RII). The RII is the multicast bit in the source address, which is not used in normal circumstances, as nobody ever sends from a multicast address. If the RII is set, a RIF (Routing Information Field) follows (as shown in figure 2-5), containing the following fields:

Type: 3 bits are used to identify whether the packet already contains routing information, is an all paths explorer, or is a spanning tree explorer (sometimes referred to as a single route explorer).

Length: 5 bits are used to identify the number of bytes in the RIF.

Direction: 1 bit specifies whether the route should be read from left to right, or right to left.

Largest Frame: A 3 bit value represents the maximum packet size.

Route: A sequence of 16 bit route designators, each split between 12 bits for the LAN number and 4 bits for the bridge number.
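As a worked illustration of this layout, here is a minimal Python sketch that unpacks a RIF according to the field sizes just listed. It assumes the RIF is supplied as raw bytes with the fields packed in the order described; a production bridge or NIC driver would handle many more edge cases.

    def parse_rif(rif: bytes):
        rtype = rif[0] >> 5                    # 3 bits: routed, all paths, or spanning tree explorer
        length = rif[0] & 0x1F                 # 5 bits: number of bytes in the RIF
        direction = rif[1] >> 7                # 1 bit: read route left-to-right or right-to-left
        largest = (rif[1] >> 4) & 0b111        # 3 bits: largest frame code
        route = []
        for i in range(2, length, 2):          # 16 bit route designators follow the control bytes
            rd = int.from_bytes(rif[i:i + 2], "big")
            route.append((rd >> 4, rd & 0xF))  # 12 bit LAN number, 4 bit bridge number
        return rtype, direction, largest, route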

In addition to this all routes broadcast, there is a single route broadcast, also known as the Spanning Tree Explorer. This mechanism relies on Spanning Tree to create a loop free path to all LANs for explorer packets, so that the replication seen with an all routes broadcast explorer does not happen.

Token Ring Bridge Types

Cisco token ring switches support source route bridging (SRB), transparent bridging (TB), and Source Route Transparent (SRT) bridging. Source route bridges operate as described in the previous section. Token ring transparent bridges operate much the same as ethernet transparent bridges: forwarding decisions are based purely on MAC address (that is, the bridge refers to a station cache to determine which interface to forward the packet to). The Source Route Transparent bridge can operate as either a source route bridge when a RIF is present, or as a transparent bridge when no RIF is present.

One problem with the normally desirable SRT bridging is that it does not allow the same MAC address to appear on different rings. At first this may seem a reasonable restriction; however, in an SNA configuration utilizing the load balancing and fault tolerant features of 3745 controllers, the lack of duplicate MAC address support is troublesome. The 3745 controller allows all of its interfaces to utilize the same MAC address, which enables multiple source route paths to be defined for the same MAC address, thus providing load balancing and automatic recovery in the event of an interface failure. For this reason it is preferable to maintain a pure source route environment when IBM controllers are present. However, the issue arises of how to support ethernet LANs in that environment, as ethernet frames do not support a RIF. Note that the following discussion is only relevant if you need to bridge ethernet and token ring together. Routing between ethernet and token ring does not present these problems; typically the only time ethernet and token ring need to be bridged is if non-routable protocols like NetBIOS need to be supported across both media.

Source route token ring networks are bridged to ethernet LANs using Source-Route Translational Bridging (SRTLB). We will discuss the operation of this type of bridging with reference to figure 2-6.

In practice, an ethernet segment is assigned a virtual ring number by which the source route LANs can identify it. This LAN number only exists within the switch or router being used to connect the two segments (in this case the ethernet segment is assigned a token ring LAN number of 100). If a source route packet comes into the switch or router connecting the token ring and ethernet, the RIF is terminated in that device and the packet is transparently bridged on to the ethernet segment.

Besides the addition or deletion of a RIF as packets traverse the ethernet to token ring boundary, the two other prime reasons for translational bridging are bitswapping and the handling of NetBIOS name queries.

Bitswapping arises because ethernet networks transmit bytes on the wire starting with the least significant bit first (canonical), whereas token ring networks start with the most significant bit (non-canonical). This means the device connecting token ring and ethernet must reverse the bit order of each byte in sequence. As an example with reference to figure 2-6, suppose PC1 wants to send a packet to PC2. If PC2 has a token ring MAC address of 00-00-c0-00-00-02, the router/switch device in this figure must actually tell PC1 that PC2 has a MAC address of 00-00-03-00-00-40. This is achieved by converting the address to binary and reversing the order of bits in each byte.
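This bit reversal is easy to express in a few lines of Python; the sketch below reproduces the PC2 example above.

    def bitswap(mac: str) -> str:
        swapped = []
        for octet in mac.split("-"):
            b = int(octet, 16)
            swapped.append(f"{int(f'{b:08b}'[::-1], 2):02x}")   # reverse the 8 bits of this byte
        return "-".join(swapped)

    print(bitswap("00-00-c0-00-00-02"))   # -> 00-00-03-00-00-40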

Handling NetBIOS name queries is also difficult. Suppose PC1 wants to determine the NetBIOS name of PC2. Within the NetBIOS protocol, this is handled via broadcasts, which in the ethernet environment are sent to MAC address FF-FF-FF-FF-FF-FF. However, in the token ring environment, PC2 listens on MAC address C0-00-00-00-00-80 for name queries. To accommodate this, the router/switch must recognize that PC1 is sending a name query, generate a RIF (if there is additional bridging in the path to PC2) and change the destination address. Fortunately most of this is hidden from the configuration details. All that really needs to be done is to create the pseudo ring that is identified with the ethernet LAN, which is done by the source-bridge transparent command.

Source Route Switching

Source Route Switching (SRS) can be used to segment an existing token ring in a source route bridge network. SRS allows several switch interfaces to be configured as part of the same ring number, as illustrated in figure 2-7.

In this figure, the interfaces that are part of the distributed ring (interface 1 and interface 2) use SRS to carry packets between these interfaces, and either SRT or SRB to send packets to other interfaces that are configured for different ring numbers. SRS operates by switching packets based on MAC address, in much the same way as an ethernet transparent bridge does with a station cache. Given that this is the way it operates, maybe a better name would be plain token ring switching, as no source routing is going on. However, we have to live with what we are given.

Token Ring VLANs

In concept, token ring VLANs are the same as ethernet VLANs: they are a layer 2 broadcast domain. Whereas routers were used to internetwork between different bridged networks (in a bridged network, all stations are on the same IP subnet), switches are used to internetwork between different VLANs. In transparent bridging (as used on ethernet), there is only one type of broadcast, and therefore only one type of broadcast domain. In token ring there are broadcast types that are confined to a single ring, and broadcast types that traverse the entire bridged network. In traditional token ring LANs, individual bridge interfaces identify separate rings. Typically end stations were connected to Multi Station Access Units that were chained together to form a ring. With this topology, it was simple to see that ring specific broadcasts did not travel through bridge interfaces. In the switched token ring VLAN environment, it is a different story. We have seen that with SRS, a single ring can span multiple interfaces on the switch. To accommodate the two broadcast domains present in a token ring VLAN, we meet the first terms that differentiate a token ring VLAN from an ethernet VLAN: the CRF and the BRF.

CRF is an IEEE term that stands for Concentrator Relay Function. The IEEE consider the familiar Multi-Station Access Unit (previously referred to as a MAU, now an MSAU) a concentrator. The CRF is simply a way of identifying interfaces on a switch that should be considered part of the same ring. With reference to figure 2-7, interfaces 1 and 2 would be associated with the same CRF, as they are configured for the same ring, number 100. It should be noted, however, that a CRF cannot have physical token ring switch ports directly assigned to it. When configuring the switch, VLAN numbers are assigned to CRF names with the "set vlan" command, and VLANs are associated with physical interfaces via a separate "set vlan" command. The association between a CRF and a physical interface is therefore a two step process.

The second token ring specific VLAN term is a BRF, the Bridge Relay Function. The BRF is used to transport broadcast packets between different rings. In figure 2-7 the bridge relay function facilitates broadcast communications between interfaces 3 and 4.

Both the CRF and BRF are internal functions of the switch. The best way to think of it is in the order of the configuration tasks that we will explore in chapter 5. A BRF is given a VLAN ID number that identifies the bridge (SRB or SRT) that connects all logical rings (CRFs) together. Multiple CRFs end up belonging to a single parent BRF. This may become clearer with reference to figure 2-8. In this figure, CRF 1 groups together interfaces 1, 2 and 3, and the BRF is used for communication between these interfaces and interface 4, which is part of CRF 2.

In summary, a CRF identifies an interface, or group of interfaces, that belong to one ring. Within a CRF, SRS is used to switch packets between interfaces. Multiple CRFs are connected by one BRF that uses either SRB or SRT to forward packets between CRFs.
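A toy Python model of this relationship, matching figure 2-8, may help fix the terms; the CRF names, ring numbers and port numbers here are purely illustrative.

    crfs = {
        "crf1": {"ring": 100, "ports": {1, 2, 3}},
        "crf2": {"ring": 200, "ports": {4}},
    }
    # one parent BRF (the SRB/SRT function) connects the logical rings together
    brf = {"vlan_id": 500, "children": ["crf1", "crf2"]}

    def path(src_port: int, dst_port: int) -> str:
        for crf in crfs.values():
            if {src_port, dst_port} <= crf["ports"]:
                return "SRS within one CRF (same ring)"
        return "SRB/SRT across the parent BRF"

    print(path(1, 3))   # same ring: source route switching
    print(path(1, 4))   # different rings: forwarded by the BRF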

Dedicated Token Ring

Just as shared hubs in the ethernet world are half duplex devices, so token ring adapters that connect to MSAUs are half duplex. Similarly, just as switched ethernet can support full duplex communication for end stations connected directly to switch ports, token ring can also support full duplex in the same configuration.

The traditional operation of token ring adapters is a passing of the token from NIC to NIC around the ring, sometimes referred to as TKP (Token Passing). Dedicated Token Ring (DTR) is specified in a new IEEE standard, 802.5r. This standard defines how a switch interface can look like a concentrator (MSAU) port to an end station NIC. The DTR specification also defines a new method of communication that bypasses the need for a token to be passed between end stations, known as Transmit Immediate (TXI). With TXI in effect, the switch to end station communication is full duplex, enabling the switch interface and end station NIC to transmit simultaneously.

DTR with TXI is really only appropriate for server connections in a token ring environment, much the same as full duplex ethernet.

An interesting question, though: is a point to point connection with TXI really token ring, when there is no token and no ring? I don't really know; maybe, if that link can be part of a source route scheme. The real point is that the traditional distinctions between all classes of network technology are blurring and becoming less meaningful.

VLAN Trunking

VLAN trunking describes how switch to switch traffic can be encapsulated to support multiple VLANs being transported over one switch interface. This is useful when you have more than one Catalyst in a switched network and want interfaces on different Catalysts to be part of the same VLAN. The most important encapsulation that supports multiple VLANs over one segment, at least from the Cisco point of view, is ISL, the Inter Switch Link protocol. Both 802.10 over FDDI and LANE across ATM can be used to transport VLANs through a backbone, as well as ISL on Fast Ethernet.

ISL

ISL is currently private to Cisco, but in many ways is similar to the 802.1Q specification. Deploying ISL on a server requires an ISL aware NIC. This is a good idea to implement, as it enables multiple VLANs to communicate with a server directly via a switch interface, without recourse to a router interface, as shown in figure 2-9. In this figure, assuming the server is equipped with an ISL capable NIC, packets from multiple VLANs will be sent directly to the server by Catalyst3.

The key enabler for this technology is the VLAN ID in the ISL header. A different encapsulation is required for this facility, as adding a VLAN ID to an already maximally sized ethernet packet would cause devices to recognize the packet as a giant (i.e. an oversized packet) and discard it. For this reason, the VLAN IDs inserted by switches are removed prior to transmission on to non-ISL links. The ISL frame format is shown in figures 2-10 and 2-11.

Although the ISL specification allows an ISL frame payload (identified as the encapsulated frame in figure 2-10) to be up to 24.5 Kbytes, the maximum packet size is currently never reached. ISL currently encapsulates ethernet, FDDI or token ring frames, and the size of the payload is limited by the maximum packet size of each of these technologies. The ISL frame encapsulation is 30 bytes, split between the 26 byte header and the 4 byte CRC. Of the technologies encapsulated within ISL, FDDI has the minimum packet size, set at 17 bytes; therefore, the minimum ISL encapsulated packet is 47 bytes. Token ring now has a theoretical maximum packet size of 18,000 bytes, yielding a maximum ISL packet size of 18,030 bytes. Multiple frames are not encapsulated within one ISL packet at the moment. If at some point in the future multiple frames could be encapsulated together, they would have to be destined for the same VLAN, as there is only one VLAN identifier in the ISL header.
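The size arithmetic is worth checking; in Python:

    header, crc = 26, 4                   # ISL encapsulation overhead: 30 bytes in total
    fddi_min, token_ring_max = 17, 18000
    print(fddi_min + header + crc)        # 47 bytes: minimum ISL encapsulated packet
    print(token_ring_max + header + crc)  # 18,030 bytes: maximum with a token ring payload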

The following describes the individual fields within the ISL header.

DA - Destination Address: The DA field of the ISL packet is always set to the same 40 bit address. This address is a multicast address and is currently set to be 01-00-0C-00-00. This indicates to the receiver that the packet is in ISL format.

TYPE - Frame Type: The TYPE field value indicates the type of frame that is encapsulated. The values currently defined are 0000 for ethernet, 0001 for token ring, 0010 for FDDI, and 0011 for ATM.

USER - User defined bits: The USER bits are an extension of the TYPE field and usually have the default value of 0000. For ethernet frames, two USER field values have been defined, 0 and 1, which indicate the priority of the packet as it passes through the switch.

SA - Source address: The SA field is the source address field of the ISL packet and contains a 48 bit MAC address of the switch interface transmitting the frame. As ISL is a point to point link, the receiving device may ignore the SA field of the frame.

LEN - Length: The LEN field represents the packet size in bytes excluding the DA, TYPE, USER, SA, LEN, and CRC fields; it is stored as a 16-bit value.

AAAA03: The AAAA03 field is a 24-bit constant value of AAAA03 (the SNAP LLC header).

HSA - High bits of source address: The HSA field is the upper 3 bytes, the manufacturer's ID portion, of the MAC source address. It must contain the Cisco prefix 00-00-0C.

VLAN - Virtual LAN ID: The VLAN field is the virtual LAN ID of the packet, which identifies the VLAN membership of the encapsulated packet.

BPDU: The BPDU referred to here is the BPDU used by spanning tree, the bit is set if the encapsulated packet is of this type.

INDX: The INDX field is used for diagnostic purposes only and may be set to any value by other devices. It is a 16-bit value and is ignored in received packets.

RES: The RES field is used to accommodate specific fields in token ring or FDDI packets that are not present in ethernet (in which case the field is all zeroes). When token ring packets are encapsulated within an ISL packet, the AC and FC values appear here. For FDDI packets, this field contains the content of the FC field.

Payload: The payload is the encapsulated frame, including its own CRC value, completely unmodified. The internal frame must have a CRC value that is valid once the ISL encapsulation fields are removed. The payload can, in theory, vary from 1 byte to 24.5 Kbytes. Once a switch receives an ISL frame, the ISL header is stripped off and the payload is associated with the VLAN in question on the receiving switch. It is possible that the payload may be re-encapsulated in ISL for transport on to another switch, if the VLAN exists there.

CRC: The CRC is a standard 32-bit CRC value calculated on the entire encapsulated frame including the payload. The receiving switch will check this CRC and can discard packets that do not have a valid CRC on them. Note that this CRC is in addition to the one at the end of the payload data.
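Pulling the field descriptions together, here is a Python sketch of unpacking a received ISL frame. The byte offsets follow the sizes given above (a 26 byte header and 4 byte trailing CRC); the packing of the VLAN ID and BPDU bit into one 16-bit word, with a 15-bit VLAN ID, is an assumption consistent with the 26 byte header total rather than something stated in the text.

    import struct

    def parse_isl(frame: bytes):
        da = frame[0:5]                              # 40-bit multicast DA, 01-00-0C-00-00
        ftype, user = frame[5] >> 4, frame[5] & 0xF  # frame type and USER bits
        sa = frame[6:12]                             # MAC of the transmitting switch interface
        length = struct.unpack(">H", frame[12:14])[0]
        snap = frame[14:17]                          # constant AA-AA-03
        hsa = frame[17:20]                           # upper 3 bytes of source address (00-00-0C)
        vlan_word = struct.unpack(">H", frame[20:22])[0]
        vlan, bpdu = vlan_word >> 1, vlan_word & 1   # assumed 15-bit VLAN ID plus BPDU bit
        indx = struct.unpack(">H", frame[22:24])[0]  # diagnostic only: ignored on receipt
        res = frame[24:26]                           # AC/FC (token ring), FC (FDDI), else zero
        payload, crc = frame[26:-4], frame[-4:]      # inner frame, then the outer 32-bit CRC
        return ftype, vlan, bpdu, payload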

VLAN Trunk Protocol

The VLAN Trunk Protocol (VTP) is used to distribute VLAN information to switches across connecting trunks. In concept, it is like a routing protocol that advertises VLANs and provides reachability information across the inter switch connections. VTP enabled switches send summary advertisements every 300 seconds to VLAN 1 (the default management domain), only on trunk interfaces. The advertisement contains a configuration revision number (starting at 0 and incrementing by one with each configuration change), a list of the VLANs this switch knows about, and some configuration information for each VLAN. The advertisements are sent to a multicast address so that all neighboring switches receive them, but they are not forwarded by switches. Routers, unless they are configured for bridging, will ignore these advertisements. The prime benefit of this protocol is that all switches within the same management domain learn about new VLANs created on the switch sending the VTP advertisements. Additionally, as VLAN configuration only has to be entered once, you avoid retyping errors. Once VTP is operational within a switched network, new switches can be brought on line with a minimal VTP configuration and learn the VLAN configuration of the network via VTP, thus reducing the configuration work necessary.
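The revision-number logic is the heart of VTP; here is a minimal Python sketch of how a switch might treat an incoming summary advertisement. The variable names are illustrative, and this ignores the real VTP packet format.

    my_domain = "campus"                  # management domain this switch belongs to
    my_revision = 7
    my_vlans = {1: "default", 10: "red"}

    def on_summary_advert(domain: str, revision: int, vlans: dict):
        global my_revision, my_vlans
        if domain != my_domain:
            return                        # different management domain: ignored
        if revision > my_revision:
            my_revision = revision        # newer configuration revision wins
            my_vlans = dict(vlans)        # adopt the advertised VLAN list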

When we show how to configure VTP in chapter 5, we'll configure all Catalysts in the switched network as VTP servers. It is possible to configure one Catalyst as the server and others as clients; however, Catalysts configured as VTP clients can lose their VTP configuration when power cycled. This is because only VTP servers keep their configuration in non-volatile memory, or can access it across the network via TFTP when required.

Even with VTP configured for each Catalyst, it is still necessary to configure the VLAN membership for individual interfaces on each Catalyst. Through VTP, each Catalyst will know about the VLANs that exist, and the Catalysts they are on, but not the individual interface membership within a Catalyst.

We have already mentioned a management domain and need to define what this means. The management domain is set by using the set vtp command, which establishes (among other things) the management domain name and VTP operation as client or server. A management domain is a collection of VLAN numbers that are all configured for the same domain name. Several VLAN domains can exist within a network and can be thought of as equivalent to separate routing domains in a routed network; however, a switch can only be in one domain at a time.

VTP Pruning

It is possible to optimize VTP by enabling pruning on certain interfaces within the domain. Cisco switches have single colored interfaces, meaning that each interface belongs to only one VLAN. This makes the management of broadcasts within VLANs on a single switch effective, as broadcasts for one VLAN are not forwarded out of an interface belonging to another VLAN. But what about the multi-colored ports used as VLAN trunks between switches? They carry traffic for multiple VLANs; consider figure 2-12.

Without VTP pruning, broadcast information will be sent for both the red and green VLANs to switch 2, which will forward it on to switch 3 via the trunk link. As switch 3 only has red VLAN interfaces defined, it will drop the broadcasts belonging to the green VLAN. This, however, wastes bandwidth on the switch 2 to switch 3 trunk. It is possible to stop this waste of bandwidth by pruning the green VLAN traffic at the switch 2 trunk connection that leads to switch 3.

To enable this type of facility, each trunk interface keeps a state variable on a per VLAN basis. This variable can either be in the joined state, indicating the interface will send broadcast and other flooded frames for the VLAN, or in the pruned state, in which case the interface will not send broadcasts (other than STP, CDP and VTP packets) originating from that VLAN.
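In sketch form, the per-trunk, per-VLAN state and the forwarding decision it drives look like this (the trunk and VLAN names are illustrative):

    trunk_state = {"to_switch3": {"red": "joined", "green": "pruned"}}

    def flood_on_trunk(trunk: str, vlan: str, frame: bytes):
        if trunk_state[trunk].get(vlan) == "joined":
            print(f"{trunk}: flooding {vlan} broadcast")   # joined: forward flooded frames
        # pruned: suppress the broadcast (STP, CDP and VTP traffic still passes)

    flood_on_trunk("to_switch3", "green", b"...")          # suppressed, saving trunk bandwidth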

VLAN Membership Policy Server

VLAN Membership Policy Server (VMPS) is a mechanism that allows a new end station to be introduced to the network and automatically assigned to a specific VLAN. This is done by reference to a VMPS database that maps MAC addresses to VLAN membership. This list of associations is really a text file that resides on the VMPS. The process is initiated when an interface on a switch is configured as dynamic. The only real restrictions in the network design when using VMPS are that VMPS servers and clients must be configured for the same management VLAN, and that security features are limited on interfaces that want dynamic VLAN assignment.
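Since the database is just a text file of MAC-to-VLAN associations, the lookup a VMPS performs can be sketched in a few lines of Python; the file format shown here is illustrative, not the actual VMPS database syntax.

    def load_vmps_db(path: str) -> dict:
        table = {}
        with open(path) as f:
            for line in f:
                mac, vlan = line.split()      # one "mac vlan" association per line
                table[mac.lower()] = vlan
        return table

    def vlan_for(table: dict, mac: str):
        return table.get(mac.lower())         # None: no entry, the dynamic port stays unassigned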

The configuration commands to set up VMPS vary from switch to switch, but generically they must include the following steps.

· Define an IP address for a TFTP server and a filename for the VMPS database that resides on that TFTP server.

· Enable VMPS for the switch being configured.

· Define the interfaces that will be obtaining their VLAN membership dynamically.

VMPS has some attractions for a centrally managed network, although it does require some work to set up initially. The trade-off is that with VMPS, all you need do to bring a new device on-line is add a MAC to VLAN membership entry to a central text file database. Without VMPS, each interface on the network needs to be assigned to its VLAN by logging in to that switch and configuring the interface manually. Of course, typing MAC addresses by hand is error prone, and the database needs alteration if a device gets a new NIC. We'll look at this more in chapter 5.

FDDI VLANs

FDDI (Fiber Distributed Data Interface) has been around for many years and has become a popular LAN backbone technology, being able to supply 100 Mbit/sec bandwidth before 100 Mbit/sec ethernet was available. FDDI is defined by ANSI X3T9.5 and is implemented as a token passing network. Specifically, FDDI has two counter rotating rings, each operating at 100 Mbit/sec, to provide redundancy.

FDDI implemented on multi-mode fiber optic cable can span a distance of 2 kilometers between end stations, with single mode capable of supporting links up to 32 kilometers between end stations.

CDDI (Copper Distributed Data Interface) uses the same protocols as FDDI, but is implemented on copper cables and has become popular as a way of providing 100 Mbit/sec throughput with STP or UTP cables, for distances of up to 100 meters.

At the time that FDDI was put together, it was very expensive to support circuitry that would generate a 200 MHz clock signal to support 100 Mbit/sec throughput (with the type of Manchester encoding used by ethernet and token ring, there are two clock transitions for every bit of data). To avoid the cost of 200 MHz circuitry, the FDDI designers went with 5 bit encoding of four bit data. This all comes about through having to maintain synchronization between end stations when transmitting a constant stream of 0s or 1s.

With Manchester encoding, voltage transitions, rather than voltage levels, represent data. This assures that even if a constant stream of 0s or 1s is transmitted, there are always transitions occurring in the data stream to maintain synchronization. With 5 bit encoding of four bit data, it is possible to assign 5 bits for each four bits of data and maintain transitions within the data stream even when continuously sending 0s. By this mechanism, it is possible to have circuitry running at 125 MHz that supports throughput of 100 Mbit/sec.

Interestingly, ATM shows its roots in being designed for optical networks when it comes to delivering 25 Mbit/sec ATM to the desktop over token ring cabling. Token ring cables support 16 Mbit/sec throughput, which with Manchester encoding requires a clock speed of 32 MHz. Therefore, if we use 5 bit encoding of four bit data on that 32 MHz clock, we get an effective data rate of four fifths of 32 Mbit/sec, which yields 25.6 Mbit/sec, the speed at which ATM to the desktop operates.
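The clock rate arithmetic from the last few paragraphs, checked in Python:

    print(100 * 2)      # Manchester at 100 Mbit/sec needs a 200 MHz clock
    print(100 * 5 / 4)  # 5-of-4 bit encoding: a 125 MHz clock carries 100 Mbit/sec
    print(16 * 2)       # token ring cabling is rated for a 32 MHz Manchester clock
    print(32 * 4 / 5)   # reusing that 32 MHz clock with 4-of-5 data: 25.6 Mbit/sec ATM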

Encapsulating or Translational Bridging

Getting packets across a FDDI backbone to ethernet networks is usually implemented via routing between the two topologies. However, when we come to look at mapping ethernet VLANs across a FDDI backbone, we have to look at some form of bridging, to allow the two ethernet segments connected via FDDI to belong to the same subnet. This is necessary because the same subnet cannot appear at two physical locations within a routed network without confusing the routing tables.

Ethernet and FDDI frame types are different, so there are two ways of bridging ethernet to FDDI: either by translating the packet information, or by encapsulating ethernet frames within FDDI frames. Translational operation (in accordance with IEEE 802.1H) is the default and does introduce more latency because of the translation process, which can become a concern, particularly for connection oriented protocols. Currently, however, a Cisco switch using the encapsulation method is only compatible with another Cisco switch, which limits the interoperability of devices on your network. With FDDI interfaces, there are multiple encapsulation types available, just as there are for ethernet; examples include encapsulation sap for 802.2 mode and encapsulation snap for sub network access protocol operation, which happens to be the default. The interface command fddi encapsulate changes the mode of bridging from the default translational to encapsulation.

Apart and fddicheck

An important part of translational bridging is Apart, Automated Packet Recognition and Translation. Apart uses a lookup table (referred to as a CAM, Content Addressable Memory) that maps a specific layer 2 frame type, such as ethernet II or SNAP, to a MAC source address. A FDDI module can be configured to use this feature by the set bridge command.

Another feature of the FDDI module on Cisco switches is fddicheck. This feature is available to counter the operation of some older FDDI devices that do not conform to the most recent specifications. What should happen is that a FDDI interface should wait until it receives a token before transmitting on to the fiber cable. The specifications also state that two frames marked as void frames should be sent immediately after any data frame is sent. As the FDDI interface maintains possession of the token until it receives one of the void frames it sent following the data frame, it does not expect to see any other end station use the FDDI ring. If any other packets are seen on the ring while the interface holds the token, they are stripped off the ring. However, problems occur if another station erroneously sends a void frame on to the ring without possession of the token. Consider figure 2-13.

Let's say station 1 on the ethernet needs to send a packet via the FDDI ring, which is delivering backbone service. The switch receives the packet and, when int 1 on the switch receives the FDDI token, sends the packet received from station 1, then the two void frames as described above. In normal operation, the sent packet will travel the FDDI ring and return to int 1, which will then remove the packet from the ring. The void frames are then received by int 1, which takes these frames as the signal to stop removing frames from the ring.

Problems occur if the old FDDI device sends void frames without having possession of the token. This is particularly problematic if these erroneous void frames are sent out before the switch int 1 interface has removed its packet from the ring. With reference to figure 2-13, the scenario is that station 1 sends a packet, destined for a device on the other side of the FDDI ring, to the switch. The switch will insert an entry in its CAM listing the source address and the frame type (in this case the MAC address of station 1 and a frame type of 802.3). Assuming the switch is performing translational bridging, it forwards the frame on to the FDDI ring with the ethernet address of station 1 as the source. If the old FDDI device now sends a void frame before this packet returns to int 1 on the switch, the switch will take this as a signal to stop removing packets from the ring. Therefore, when the packet int 1 did send on the ring returns, it will be forwarded on again instead of being removed. Without fddicheck, the situation gets even worse: the CAM on the switch is now updated, as the packet appears to have originated on the FDDI ring with the source address of station 1. The result is loss of connectivity and loss of bandwidth due to continually circulating packets on the FDDI ring.

What fddicheck does to prevent this happening is to check the source address of all incoming packets against the CAM before the CAM is updated. If a packet coming into the FDDI side of the switch has a source MAC address that has already been associated with the ethernet side of the switch, the CAM will not be updated and connectivity is maintained.

As fddicheck uses the CAM, Apart must be enabled before fddicheck can be enabled.
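The guard that fddicheck applies can be sketched as follows; the CAM entry shown for station 1 is illustrative.

    cam = {"00-00-c0-00-00-01": ("ethernet", "802.3")}   # learned from station 1's first packet

    def fddi_side_learn(src_mac: str):
        entry = cam.get(src_mac)
        if entry and entry[0] == "ethernet":
            # fddicheck: the address is already known on the ethernet side, so
            # do not let the erroneously circulating copy poison the CAM
            return
        cam[src_mac] = ("fddi", "fddi")                  # otherwise learn the address normally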

802.10 VLAN Tagging

In the section on VLAN trunking, we said we could use ISL, FDDI or ATM links as trunks. To enable FDDI to be used as a VLAN trunk, we must implement 802.10 frame tagging.

802.10 is now implemented as an open standard that enables LAN traffic to carry a VLAN identifier. 802.10 was originally conceived to improve security on shared LANs, providing encryption and authentication features. As a layer two protocol, 802.10 is suitable for use in switched LANs and enables fast switching of frames by layer 2 addressing. This standard is implemented by the 802.10 header appearing after the MAC header, but before the frame’s data, as illustrated in figure 2-14.

The frame format shown in figure 2-14 is referred to as a Secure Data Exchange (SDE) unit and is the encapsulation type that needs to be configured on a FDDI interface to enable 802.10 frame tagging. The 802.10 header is split into clear and protected sections. The clear header comprises the 802.10 LSAP, the SAID (Secure Association Identifier), and the MDF (Management Defined Field), which is optional. The protected header is a copy of the source address, for validation purposes, to ensure that the frame is from the advertised source address. The ICV (Integrity Check Value) is a secure form of FCS; it enables a receiving station to check that the data contained in the packet has not been modified in transit.

When using 802.10 to facilitate VLAN frame tagging, switches must minimally support the clear header portion. The clear header comprises the LSAP, which identifies the frame as an 802.10 VLAN frame; the SAID, which provides the VLAN ID; and the optional MDF field.

In an environment where a VLAN is distributed between two switches that are interconnected via a FDDI ring, as in figure 2-15, the switches maintain a VLAN to SAID association. In this setup, when an ethernet end station on VLAN 1 connected to switch 1 wants to send to an end station on VLAN 1 attached to switch 2, it sends a normal 802.3 frame on to the ethernet. When the packet reaches switch 1, an 802.10 header is appended for transporting the frame over the FDDI ring. Once switch 2 gets this packet, with the appended header, it examines the SAID to see if the packet is destined for one of its VLANs. If it is, the 802.10 header is removed and the frame is forwarded to the appropriate VLAN interfaces; otherwise the frame is dropped.
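A sketch of the VLAN to SAID association and the ingress check just described (the SAID values here are illustrative):

    vlan_to_said = {1: 100001, 2: 100002}
    said_to_vlan = {said: vlan for vlan, said in vlan_to_said.items()}

    def to_fddi_trunk(vlan: int, frame: bytes):
        return (vlan_to_said[vlan], frame)    # 802.10 clear header carries the SAID

    def from_fddi_trunk(said: int, frame: bytes):
        vlan = said_to_vlan.get(said)
        if vlan is None:
            return None                       # SAID maps to none of our VLANs: drop
        return (vlan, frame)                  # strip the header, deliver to the VLAN's interfaces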

The association between a VLAN and a SAID number is defined using the set vlan command, which will be examined further in chapter 5; note that VTP does advertise these VLAN mappings.

It should be noted that there is no direct relationship between a VLAN number and a SAID number; they can be selected independently of each other, and any VLAN value (up to 1000) can map to any SAID number (up to 4.29 billion). One implementation issue that should be considered here is that when multiple VLANs are mapped across a backbone FDDI, it is advantageous to run a separate instance of the Spanning Tree Protocol for each VLAN. This makes better use of the available links, as the path that each Spanning Tree takes through the network can be customized so that different VLANs use different trunk connections, thus maximizing the available bandwidth.

Spanning Tree Optimization

Spanning Tree Protocol (STP), as incorporated into the 802.1D standard, is a necessary part of providing loop free paths in a bridged network. The benefit of STP is that it allows bridged networks to continue functioning with multiple physical paths between LANs. The downside is that STP does not load balance between equal paths, and in a large environment it is slow to converge to a new path after a link failure. It is possible to optimize STP operation, but before we discuss the options, it is useful to recap some of the terms that will be used, with reference to figure 2-16.

The root bridge is elected according to the bridge ID value. On the root bridge, all interfaces are placed in the forwarding state. For each segment that has more than one bridge connected to it, a designated bridge is selected that will be the one to forward frames toward the root. Each bridge selects a root port that will be used to forward frames toward the root bridge. Ultimately STP selects all the designated bridges and root ports necessary, identifies a loop free path between the root bridge and all LANs, and places the selected bridge interfaces into a forwarding state and all others into a blocked state. The spanning tree is maintained by the root bridge transmitting BPDUs, every 2 seconds by default. Upon receipt of a BPDU from the root bridge, the other bridges transmit their own BPDUs (BPDUs were covered in chapter 1). Within large networks this operation is sub-optimal, and large bridged networks were subsequently replaced by routed networks. However, with VLANs, the opportunity to run several instances of STP on the network is now available, and with it the opportunity to optimize STP operation to provide better utilization of network resources.

Optimizing Timers

The first place to look for optimizing STP in a switched network is the timers used to send BPDUs and those that determine when a missing BPDU indicates a link failure. The original designers of STP set conservative levels for timers, and rightfully so, as they were unsure of the topologies that network administrators would build with STP bridges. However, in a well designed switched network, there is plenty of scope to speed up the convergence time of STP by trimming timers. The key timer values are set at the root bridge (we'll cover how to get the switch you want selected as the root bridge next) and are the hello time, max age and forward delay.

The max age timer has a default value of 20 seconds and is tunable down to 4 seconds on the 2900XL range of switches from Cisco. Using the default hello time of 2 seconds with a 4 second max age timer is probably too tight. This means that only two BPDU packets can be missed before a switch will be forced to re-compute its spanning tree. In distance vector routing protocols, we normally allow the timers to be set for three missed packets before any recalculations occur. This would imply a minimum max age of 7 seconds to be on the safe side. This is still much quicker than the 20 second default. Of course, if the hello timer (which is settable between 1 and 10 seconds) is reduced, the max age timer can be reduced in step; however, by doing this you pay the penalty of more bandwidth being consumed by BPDU traffic.

The Forward Delay timer sets the amount of time a switch interface will be kept in each of the listening and learning states when making the transition from blocked to forwarding as a result of a Spanning Tree recalculation. This value has a default of 15 seconds and a range of 4 to 30 seconds. If the minimum 4 seconds is selected, a total of 8 seconds will pass before an interface transitions from a blocked to a forwarding state if the spanning tree has selected it as part of the tree. The potential problem is that the switch has only 8 seconds to identify all the possible BPDUs that would prevent it from placing the interface into forwarding mode. If an essential BPDU is not received within this time, the interface may be erroneously put in a forwarding state, which could cause a loop. Chapter 1 discussed the horrendous possibilities of a loop existing in a transparently bridged network, clearly something to be avoided.
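The timer budget implied by the discussion above, expressed as a quick Python calculation:

    hello = 2                          # default seconds between BPDUs from the root
    max_age = 3 * hello + 1            # allow three missed BPDUs: 7 seconds, versus 20 default
    forward_delay = 4                  # minimum settable value; the default is 15
    transition = 2 * forward_delay     # listening + learning: 8 seconds blocked to forwarding
    print(max_age, transition)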

Root Bridge and Interface Priorities

The root bridge should be as near to the center of your network as possible, to approximately synchronize the delivery of management BPDUs to bridges on the edge of the network. STP selects a root bridge among the available switches in the network based on the bridge ID: the bridge with the lowest bridge ID will be selected as the root. In the event of a tie, the switch with the lowest value MAC address is selected. The default priority value for all switches is 32,768, and a range of 0-65,535 is allowed. For Catalysts, it is a simple matter of using the set spantree priority command to set the switch bridge ID priority (remember 0 will be higher priority than 65,535). The bridge priority needs to be set on a per VLAN basis, as each VLAN will be running its own STP and select its own root bridge.
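The election rule reduces to comparing (priority, MAC address) pairs and taking the lowest; a Python illustration with hypothetical addresses:

    candidates = [
        (32768, "00-10-0d-00-00-0a"),
        (8192,  "00-10-0d-00-00-0b"),   # lowest priority: becomes the root
        (8192,  "00-10-0d-00-00-0c"),   # same priority, but the higher MAC loses the tie
    ]
    root = min(candidates)              # tuple comparison: priority first, then MAC
    print(root)                         # -> (8192, '00-10-0d-00-00-0b')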

Once the switch that will serve as the root bridge has been selected, it is beneficial to look at the individual interface priorities on a per VLAN basis, to make sure you are making the most of your available network bandwidth. This is illustrated with reference to figure 2-17.

The command to define the interface priority per VLAN is the set spant portvlanpri command. The goal here is to set the link between the two 1/1 interfaces as the preferred link for VLAN 1 and the link connecting the 1/2 interfaces as the preferred link for VLAN 2. By setting the priority appropriately we can do this and still have the other link available to the VLAN as a backup, should its preferred link fail.

So, if we use the command set spant portvlanpri 1/1 31 1 on switch 1 in figure 2-17, we are setting the priority for VLAN 1 on interface 1/1 (the first interface in the first module) to 31. The priority can be set to a value between 0 and 63, with 0 indicating highest priority. Assuming that interface 1/2 has a priority of 32, this will make 1/1 the preferred interface for VLAN 1. One point of interest is that if we have set switch 1 as the root bridge, we will not see interface 1/2 blocked for VLAN 1; to see this link blocked, we have to log in to switch 2, which will show interface 1/2 as blocked for VLAN 1. Performing a similar configuration for VLAN 2 to use the link connecting interfaces 1/2 will separate the traffic from each VLAN and send each over a different link. The setting of port priority on a per VLAN basis is available on the Catalyst 5000 range, but not on the 2900XL range, which currently only allows the setting of an interface priority, irrespective of its VLAN membership. By making these adjustments to priority, each of the links in this figure becomes the preferred route for a different VLAN.

The final priority setting feature to look at is the interface cost per VLAN (referred to in the Cisco documentation as the port cost). This is set by default according to the type of media in use on the link. For example, a 10 Mbit/sec interface will have a default cost of 100, whereas a 100 Mbit/sec interface will have a default cost of 10. As with most interface specific commands, there are options to set this for all VLANs present on the interface, or on a per VLAN basis. The cost of an interface is used in the STP calculation that selects which interface will become the root port. The root port is the one that provides the lowest cost path back to the root bridge.

The two commands discussed here are set spantree portcost, for setting the cost on the interface for all VLANs, and set spantree portvlancost, which defines the cost per VLAN on each interface selected in the command.

STP Uplink Fast Groups

The Uplink Fast feature is made available on the Catalyst 5000 range by software releases starting at 3.1.1. It is there to speed up the convergence time of STP in switched networks with redundant links. The method used is to group interfaces together and identify them as specific uplink groups. This is most easily explained with reference to figure 2-18, which shows part of a switched network that employs uplink fast groups. The key feature is that both the switch configured as the root bridge and the switch configured as the backup bridge have physical links to each switch that is used to connect users to the network. In this instance the root bridge and its backup only connect to other switches, not to LAN segments that have end stations on them.

With this configuration, the blocked links to switches 3 and 4 are shown as dotted lines. For uplink fast groups to work, uplink fast must be enabled on each non root bridge switch, and each switch must have at least two connections (one to the root bridge, one to its backup), with one of these two links in the blocked state. Uplink fast also only provides backup for the links from the distribution switches (switches 3 and 4 in figure 2-18) back to the root bridge and its backup. Uplink fast is not an appropriate technology to provide backup for user VLAN links (such as those to VLANs 1, 2 or 3 in figure 2-18). It is also not appropriate to enable uplink fast on the root or backup root bridge, as fast transitions of the root bridge location will cause convergence problems for the rest of the spanning tree.

As an example of how this works in practice, consider the link from switch 1 to switch 4 failing. As soon as switch 4 detects the link failure, it will unblock the connection to switch 2. As the switch 1 to switch 4 link is no longer present, no loops are generated by this unblocking, and connectivity to a root bridge (even though it is the backup root bridge) is restored without the need to go through the listening and learning phases. Convergence is therefore completed within seconds rather than minutes, as it is not necessary for the max age timer to expire.

One last point on the tasks for configuring this setup: there is a simpler way to set the root bridge and its backup for a VLAN. Instead of manipulating bridge IDs, you can use the set spantree root command. This command lets you designate the bridge as the root or secondary root, specify the VLANs this is to be effective for, the acceptable diameter of the network (in terms of bridge hops), and the hello timer. An example is set spantree root 1-3 dia 10, which will set the switch as the root bridge for VLANs 1, 2 and 3, accepting a maximum bridge hop count of 10 between end stations. The benefit of using this command is that the switch will start off with a priority of 8192 (as reflected in its bridge ID) and test to see if that is low enough to make it the root. If not, the switch will lower its priority on each VLAN until it is selected as the root.

Fast EtherChannel

Fast EtherChannel is the next step up in bandwidth provision from full duplex ethernet. With full duplex ethernet implemented on a fast ethernet link (all I mean by fast ethernet is a 100 Mbit/sec ethernet link), we can get 200 Mbit/sec data rates if we assume that both directions on the point to point link are transmitting at the maximum 100 Mbit/sec. Fast EtherChannel is a technology that groups these full duplex fast ethernet links together to provide increments of 200 Mbit/sec bandwidth, as depicted in figure 2-19.

Fast EtherChannel is of course a point to point link technology, as it relies upon circuitry similar to that used for full duplex ethernet. I realize all the marketing literature talks of full duplex fast ethernet providing 200 Mbit/sec throughput, but having worked extensively with digital WAN links during my career, I find this claim a bit misleading. When you buy a 64 Kbit/sec leased digital circuit from a telco, you get 64 Kbit/sec in both directions on the point to point link. The telco does not advertise it as a 128 Kbit/sec link because you can transmit 64 Kbit/sec in both directions simultaneously. I concede that the marketers had to differentiate full duplex ethernet within the marketplace, and what better way than to say it is twice the throughput? In reality there are very few occasions when both directions on a full duplex point to point link are simultaneously saturated. Having said that, full duplex ethernet does provide some benefits, particularly for server connections, and it is a good thing.

Fast EtherChannel Concepts

Being a multi-link technology, Fast EtherChannel provides the expected benefits of load balancing across multiple links and resiliency in the event of link failure. An added benefit is that all of this is transparent to applications; no network APIs have to be re-written (as would be necessary if we were to move from IP to native ATM at the desktop). Being implemented on Cisco technology, Fast EtherChannel interoperates with protocols like ISL and provides an upgrade path to Gigabit ethernet.

The technology used in Fast EtherChannel trunks was first developed by Kalpana, before that company was acquired by Cisco. The basis of this technology is that between two and four separate trunks can be grouped together to appear as one link with the combined bandwidth of the component links. When implemented on router links (such as the Fast Ethernet Interface Processor or Versatile Interface Processor), the router uses layer 3 principles to load share. This means the router looks at the source and destination IP addresses and load balances across the EtherChannel links based on that information.

Fast EtherChannel does not require the use of 802.1D STP to converge, as it has its own peer to peer protocol that only considers the device on each end of the point to point link.

Fast EtherChannel Network Design

The most typical applications of Fast EtherChannel are shown in figure 2-20.

In this figure, three links are aggregated to produce a throughput of 600 Mbit/sec between the central switch and the server, and there are two 400 Mbit/sec links connecting distribution switches to the central switch. If support for transporting multiple VLAN traffic across Fast EtherChannel links is required, all that need be done is to use ISL encapsulation on each link within the EtherChannel. An alternate configuration is to use Spanning Tree Protocol to provide resilience within the network connections, as shown in figure 2-21, although this requires Catalyst software 3.1 or later, as early versions of Fast EtherChannel did not support STP. Of course this has the disadvantage of wasted bandwidth because of STP blocked ports.

In this figure, the central switch has three 400 Mbit/sec links configured, two of which connect to the same switch. Spanning Tree Protocol is enabled on these links and will select one of the two links to switch 1 to be placed in the blocking state, ready to take over should a failure occur in the unblocked link. Currently these Fast EtherChannel links can be implemented with 100BaseTX and 100BaseFX on the Catalyst 5000 series, and soon on the 2900XL series. It is also expected that when the Gigabit Ethernet standard (802.3z) gains acceptance in the marketplace, Gigabit EtherChannel will be available to provide aggregate throughputs in the region of 8 Gbit/sec.

The normal operation of a Catalyst switch when forwarding packets is that, upon receipt, the packet is sent to all interfaces in readiness to be sent out. A separate process then decides whether the packet will be dropped at each interface (if it does not need to be forwarded) or forwarded out on to the network wire. This mechanism provides the opportunity to implement an exclusive OR operation for the interfaces belonging to the Fast EtherChannel link. What this means is that for those interfaces configured to be part of the Fast EtherChannel, the packet will be forwarded out of only one of them. In this mode, the switch is allowed to forward a packet out of an interface to MAC addresses learned from a different interface. Of course, this is in contrast to normal switch operation, whereby the switch learns to forward packets to MAC addresses whose existence it has learned from source MAC addresses in incoming packets.

The exclusive OR operation just mentioned is based on a comparison of the source and destination MAC pair. If there are four links in the Fast EtherChannel bundle, the exclusive OR operation is performed on the last 2 bits of the source and destination MAC addresses within the packet waiting to be forwarded. With two bits being used, there are four possible answers to the exclusive OR operation (00, 01, 10, 11), each of which is associated with one of the links. This method of distributing load between links does not directly measure the comparative load on each of the links within the bundle, but it has proven to deliver acceptable load splitting in the Fast EtherChannel environment and is much faster than real time load analysis. Similar operations occur for bundles of two or three links.
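
A short sketch makes the four link case concrete (the function name and MAC addresses below are illustrative only):

def channel_link(src_mac, dst_mac):
    """Return a link index (0-3) from the low order 2 bits of each MAC."""
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src & 0b11) ^ (dst & 0b11)   # XOR of the last 2 bits

# The same source/destination pair always selects the same link;
# a two-link bundle would use only the last bit of each address.
print(channel_link("00:10:7b:3a:00:01", "00:60:2f:44:00:0e"))   # prints 3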

This analysis of source and destination MAC addresses works well in an ethernet switch environment, as the switch transparently bridges packets without altering the MAC addresses of the packet as it is forwarded out an interface. The only time this logic is not appropriate is when it is used to connect two routers together, in which case the source and destination MAC addresses are the same for every packet, so all traffic hashes to a single link. If it is required to link two routers together with Fast EtherChannel technology, it is better to use a layer 3 routing protocol, like IGRP or OSPF, to load balance across equal cost links.

The latest release of Fast EtherChannel utilizes the Port Aggregation Protocol (PAgP) and is sometimes referred to as Fast EtherChannel II; it supports uplink fast (as discussed in the previous section) and STP. PAgP eases the creation of Fast EtherChannel bundles by automatically identifying multiple direct links between switches, exchanging parameter information and grouping the links into a channel. If STP is running, a bundle is then identified as a single bridge interface for the purpose of STP blocking or forwarding. One implementation issue to consider is that PAgP does not allow any of the links in the bundle to use dynamic VLANs. Either all the links must belong to the same VLAN, or they must be configured as a trunk to carry multi-VLAN traffic.

These options are set via the set port command. Typically a Fast EtherChannel port is set to the "desirable" option for automatic configuration of bundles; however, the keywords "on" and "off" explicitly set or clear the assignment of Fast EtherChannel bundles respectively. When using the "on" option, the links that will form the bundle need to be explicitly identified. We will look at specific configuration examples in Chapter 5.

Gigabit Ethernet

Proponents of ATM networks always point to the scalability, Quality of Service aspects and suitability to multiple traffic types (voice, data and multimedia) that ATM offers, and say that ultimately ATM will rule everywhere. At the desktop, the reason that ethernet and other data oriented technologies have had trouble with audio and video traffic is how the network behaves during times of congestion. When an ethernet LAN is busy, there are variable delays which can destroy a video stream. There have been schemes such as RTP (Real-time Transport Protocol) and RSVP (Resource reSerVation Protocol) that enable multimedia over ethernet to some extent. However, there is no substitute for having massive amounts of available bandwidth to make multimedia applications work well in an ethernet environment. With gigabit ethernet, we certainly are getting into the realms of massive available bandwidth for real time applications to the desktop.

It is most probable that early implementations of gigabit ethernet will be within backbones and server connections. However, as the price of this technology falls, there is no reason to assume that it will not work its way to the desktop, with server connections making use of gigabit links in Fast EtherChannel bundles.

Gigabit Ethernet Basics

The gigabit ethernet standard is based upon the merging of two technologies: the frame formats of 802.3 and the physical interface of FiberChannel. This was done to maintain backwards compatibility with legacy ethernet technologies and to make use of the gigabit speeds available in the FiberChannel physical interface. Figure 2-22 shows the pertinent layers borrowed from these two standards to form the basis of gigabit ethernet.

There are two key issues to familiarize yourself with regarding gigabit ethernet operation. The first is that the initial standards identify fiber optic cable as the primary cabling system, with a new specialized balanced and shielded copper cable as a short haul alternative. UTP is not an option at present, although there are investigations into gigabit ethernet over UTP (it is conceivable this may be available by the time of printing). The second is that the encoding of data bits on to the cabling system uses 10 bit encoding of 8 bit data, which is similar in concept to the 5 bit encoding of 4 bit data used in FDDI networks. The process of presenting the 8 bit data encoded into 10 bits to the upper layers is handled by the serializer/deserializer illustrated in figure 2-22. The FiberChannel specification sets out a signaling rate of 1.062 Gbit/sec, which has been upped by gigabit ethernet to 1.25 Gbit/sec; because 8 bits of data are encoded into 10 bits for transmission on to the cable, this yields 1 Gbit/sec of usable throughput. The encoding of 8 bits of data into 10 bits not only allows synchronization to be maintained, but also eliminates the potential for a DC bias, which is present in the FDDI encoding mechanism.
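
The arithmetic is worth making explicit; the following check is simply the 8B/10B encoding overhead calculation:

# 8B/10B overhead check: 10 code bits on the cable carry 8 data bits.
signaling_rate = 1.25e9                 # bits per second on the cable
throughput = signaling_rate * 8 / 10    # usable data rate
print(throughput)                       # 1e9, i.e. 1 Gbit/sec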

Gigabit Ethernet Standards

At the physical layer, the connector for FiberChannel is the SC optical connector, which is used for both single mode and multimode fiber. This is also the fiber connector used for gigabit ethernet. There are two fiber specifications for gigabit ethernet: 1000BaseLX for long wave lasers over single mode and multimode cable, and 1000BaseSX for short wave lasers over multimode fiber. 1000BaseCX is the standard for gigabit ethernet over the new specialized copper cable.

When looking at whether to deploy short wave or long wave lasers, the choice is fairly simple, depending on the physical constraints of the network you have to implement. Short wave lasers are used in CD players and are therefore relatively cheap; however, they will only cover distances up to 250 meters on 62.5 micron multimode fiber, whereas long wave lasers will cover up to 550 meters on the same fiber type. Long wave lasers will span a distance of 3 km using single mode fiber.

By contrast, the 1000BaseCX standard will only support distances of up to 25 meters on the two pair shielded twisted pair cable.

To handle all these options (multimode fiber also comes in 62.5 micron and 50 micron diameters), the IEEE 802.3z committee has provided a gigabit ethernet interface carrier layer that allows network managers to configure individual interfaces on a gigabit switch for different media types.

At the MAC layer, half duplex gigabit ethernet implements the usual CSMA/CD of 802.3, and full duplex gigabit ethernet is the same as that implemented for both 10 Mbit/sec and 100 Mbit/sec ethernet. In order for gigabit ethernet to be compatible with the collision detect mechanisms of standard 802.3, a facility called carrier extension has been added to the specification. Carrier extension increases the size of small packets by adding bits to the frame. This is necessary because as the speed of putting a frame on to the network cable increases, the time available to handle collisions shrinks, particularly for small packets. The second change to standard 802.3 that has been implemented for gigabit ethernet is frame bursting, whereby an end station can send several frames on to the network cable without relinquishing control of the network. This is achieved by adding extension bits between frames, so that to listening stations, the cable will always appear busy for the duration of the frame burst.
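
As a minimal sketch of carrier extension, assuming the 512 byte (4096 bit time) slot time defined for gigabit ethernet, the fragment below pads short frames; note that real extension symbols are special code groups on the wire rather than data bytes:

SLOT_BYTES = 512   # gigabit slot time; the 10/100 slot time is 64 bytes

def carrier_extend(frame):
    """Pad short frames with extension symbols (modeled here as zero
    bytes) so the sender holds the medium for a full slot time."""
    if len(frame) < SLOT_BYTES:
        return frame + bytes(SLOT_BYTES - len(frame))
    return frame

print(len(carrier_extend(bytes(64))))     # 512: a minimum size frame is extended
print(len(carrier_extend(bytes(1000))))   # 1000: larger frames are untouched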

The frames that are sent on the network cable are fully compatible with the standard 802.3 format, and therefore no frame translation is required when going from standard 802.3 to gigabit ethernet (unlike the 802.3 to FDDI translation necessary when connecting those two topologies together). This has the added benefit of reducing latency when implementing gigabit ethernet as a backbone technology, as no translation is required.

Gigabit Ethernet Deployment

The initial deployment of gigabit ethernet is expected to relieve packet congestion on backbones, in much the same way that 100 Mbit/sec ethernet did. In time, gigabit switches will be available to replace the 10/100 switches currently in use. It is unlikely that non-blocking gigabit switches will be available immediately after the first gigabit devices come to market. The issue is that a switch with multiple gigabit ethernet connections could be faced with demands in the region of 15 to 25 Gbit/sec, which is a throughput level that no vendor in the industry offers at the moment. Non-blocking means that a packet never has to wait in a queue to be forwarded out of an interface, and allows a device to forward packets at wire speed on all interfaces simultaneously.

As such, gigabit ethernet can only be viewed as a point to point technology, rather than a networking technology, for some time, as it is unlikely that we will have the gigabit networking products to fully support shared half duplex gigabit ethernet on a large scale in the near future.

For now, gigabit ethernet has value in short haul connections of up to a few kilometers that require high bandwidth. The technology is familiar and the complexities of ATM implementation are avoided.

CGMP and Multicasts

IP multicasting is becoming more popular as a means of efficiently delivering multimedia traffic over LANs and WANs. The goal is for a host to send a stream of traffic out once, and for it to be received by multiple clients. This is achievable by broadcasting, but multicasting adds a measure of intelligence by dynamically routing the data stream to subnets where it is required and stopping it from reaching subnets where it is not needed.

In a classic routed network, the existing specifications for IP multicast (RFC 1112 for IGMP, RFC 1584 for MOSPF and RFC 1075 for DVMRP) work well. However, when multicast is combined with the increasingly popular layer 2 switches that are being deployed across corporate networks, its benefits are easily negated. In this section we'll review multicasting, look at why switches can negate the benefits of multicasting and see how Cisco's Group Multicast Protocol (CGMP) can restore those benefits in a switched environment.

Multicast Basics

Multicasts use class D IP addresses (those in the range 224.0.0.0 through 239.255.255.255) to address a subset of hosts within the network. Typical well-known multicast addresses are 224.0.0.2 for all routers on a subnet, or 224.0.0.5 for all OSPF routers. The Internet Group Management Protocol (IGMP) administers the dynamic joining and leaving of hosts to a multicast group, and multicast routing protocols like MOSPF (Multicast Open Shortest Path First) direct multicast traffic to the hosts belonging to a particular multicast group.

Classic multicast in a routed environment is illustrated in figure 2-23, which shows host 1 multicasting to hosts 4, 7 and 8 on remote networks (remote meaning on the other side of one or more routers). To achieve this, hosts 4, 7 and 8 must have used IGMP to register membership in the multicast group with their local routers, and routers 1, 2 and 3 must have used a multicast routing protocol to route the multicast packets across the network to the appropriate hosts. This can be quite efficient for the distribution of video or audio traffic across a network and requires minimal administration after the initial setup of IGMP and, say, MOSPF.

We have stated what multicasts use at the IP level, but what is used at layer 2? For example, what MAC address would router 3 use to forward the multicast packet on to hosts 7 and 8? Sending two packets point to point using the MAC address of each host would somewhat defeat the object. Sending a MAC layer broadcast to all ones likewise is not the aim. In fact, special multicast MAC addresses are used. The way it works is that the low order 23 bits of the 32 bit destination multicast IP address are mapped to the low order 23 bits of the MAC address. Hosts that are members of the multicast group in question will recognize the destination MAC address as the MAC address of the multicast group and pass the packet up through the layers for processing. The ethernet addresses used in multicast packets range from 01:00:5e:00:00:00 to 01:00:5e:7f:ff:ff.
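
The mapping is mechanical enough to express in a few lines; this sketch simply copies the low order 23 bits of the group address into the 01:00:5e:00:00:00 base address:

import ipaddress

def multicast_mac(group):
    """Map a class D IP address to its multicast MAC address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF   # low order 23 bits
    mac = 0x01005E000000 | low23                           # OR into the 01:00:5e base
    return ":".join(f"{(mac >> s) & 0xFF:02x}" for s in range(40, -8, -8))

print(multicast_mac("224.0.0.5"))         # 01:00:5e:00:00:05
print(multicast_mac("239.255.255.255"))   # 01:00:5e:7f:ff:ff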

We have already mentioned DVMRP and MOSPF, but with Cisco routers there is a more flexible solution for routing within a multicast environment, and that is Protocol Independent Multicasting (PIM). PIM offers two modes of operation, dense mode and sparse mode, which are appropriate in different circumstances.

Dense mode PIM is best for applications like distributing real time video over a LAN. The characteristics of this traffic are that there are a few senders but many receivers, the multicast traffic is fairly constant and the senders and receivers are in close proximity. Dense mode PIM uses a mechanism to distribute the multicast traffic that is very similar to the Reverse Path Forwarding technique used within DVMRP (the distance vector multicast routing protocol specified in RFC 1075). The basis of Reverse Path Forwarding (RPF) is that the router starts off by flooding the multicast packet out of every interface except the one on which the packet was received. This process ensures that the datastream will reach all LANs within a network. If one of the routers in the network has a LAN attached with no clients that wish to receive the multicast stream, it will send a prune message upstream to stop further multicast packets from being sent to it.

When the first host on a LAN wants to receive the multicast, it has to wait until RPF performs one of its periodic floods of all networks to receive the first packet of the stream. The router that had been sending prune messages back upstream stops doing so for the LAN that now has a host wishing to receive the multicast stream, and the host is then online. There is obviously a trade-off here between the frequency of RPF floods and the amount of time it takes a new host to come online to a multicast stream. One would not want to perform RPF floods too frequently, as the bandwidth penalty can become quite high.
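
The flood and prune cycle can be sketched as a toy model of a single router (the interface names and the simplified single router scope are assumptions for illustration):

class DenseModeRouter:
    """Toy model of dense mode flood and prune state on one router."""

    def __init__(self, interfaces):
        self.interfaces = set(interfaces)
        self.pruned = set()          # interfaces that sent a prune

    def forward(self, in_iface):
        # Flood out of everything except the arrival interface and
        # any pruned interfaces.
        return self.interfaces - {in_iface} - self.pruned

    def periodic_flood(self):
        # The periodic RPF flood clears prune state so that new
        # receivers can pick up the stream.
        self.pruned.clear()

r = DenseModeRouter(["e0", "e1", "e2"])
r.pruned.add("e2")        # no receivers behind e2, prune received
print(r.forward("e0"))    # {'e1'}: the stream is confined to e1
r.periodic_flood()
print(r.forward("e0"))    # {'e1', 'e2'}: flooded everywhere again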

The difference between DVMRP and dense mode PIM is that dense mode PIM can use any unicast routing protocol for the unicast route update messages (like IGRP or OSPF) rather than the DVMRP specific unicast routing protocol (which is a lot like RIP version 1).

Dense mode PIM is illustrated in figure 2-24. The LAN TV host sends one multicast packet to router 1, which floods the packet to R2, R3 and R4. R2 and R4 have hosts that wish to receive this multicast stream and do not send a prune message back. R3 has no hosts wishing to receive the multicast stream and sends a prune message back to router 1. Subsequent packets in that multicast stream from the LAN TV host are only sent to R2 and R4 by router 1. This situation persists until R3 has a host wanting to receive the LAN TV multicast and a flood occurs, at which time R3 will not return a prune message.

Sparse mode PIM is more suited to data applications that traverse WAN links. An example might be the regional offices of a stockbroking company that has many news services distributed over a WAN. Not all offices want to receive all news services. Each news service could be defined as a discrete multicast group, and only the offices that have hosts (broker workstations) that register for a specific news service actually get the data sent to them. A possible setup for this scenario is given in figure 2-25.

This setup fits the design goals of sparse mode PIM, as the traffic is intermittent (headlines for news stories that pop up on broker workstations are only sent out as generated by the news service), and bandwidth utilization is more of a concern because senders and receivers have to communicate via WAN links. Sparse mode PIM is also beneficial when there are many streams, each taken by a small number of receivers. The application depicted in figure 2-25 would work very poorly with the RPF mechanism of dense mode PIM: the WAN links would suffer from the regular flooding, which would be mainly unnecessary, as the membership of groups at the remote branches should not change very frequently.

By contrast, sparse mode PIM forces receivers to register their interest in a particular multicast group with a rendezvous point (not just the next hop router as in dense mode PIM). Senders also register the groups they are willing to service with the rendezvous point. Once a sender/receiver pair has been identified, the routers along the path between the sender and receiver optimize the path for subsequent packets in the multicast group in question. The basis of sparse mode PIM is that a host has to make a request before it receives any data from the multicast sender.

Multicasting clearly has good solutions for a routed environment. Things are not so rosy when it comes time for multicast traffic to traverse a switched network. The reason for this is that switches learn where workstations are in the network by examining source MAC addresses. For example, in figure 2-26, the switch will generate the listed table and every time a packet is to be sent to one of the known MAC addresses (MAC a through f) it will refer to this table to decide which interface to send the packet out of.

Now, when the time comes to send a packet destined for a multicast MAC address, the switch will refer to its MAC address to interface lookup table and see that the destination address is not listed. This is clearly the case, as no station will have used the multicast address as its own source address at any time. By default, a switch will assume that the destination multicast address belongs to a workstation that it does not know about and flood the packet out of every interface. Because of this logic, the switch treats multicasts in the same way as broadcasts, and multicast benefits are therefore negated. At this stage every workstation on the switch will receive the multicast packets, when what we really want is for the switch to send the multicast stream out only the interfaces that have workstations registered for the multicast traffic. If a multicast stream is a real time video application requiring between 1 and 2 Mbit/sec of bandwidth, this is clearly a problem in terms of wasted bandwidth on the segments attached to the switch interfaces that do not need the traffic.
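
A toy model of the switch's learning logic shows why the multicast destination always falls through to a flood (port numbers and MAC labels below are illustrative):

class LearningSwitch:
    """Toy model of a transparent switch's MAC learning logic."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}                     # MAC address -> port

    def receive(self, src_mac, dst_mac, in_port):
        self.mac_table[src_mac] = in_port       # learn from the source
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}    # known destination
        return self.ports - {in_port}           # unknown: flood

sw = LearningSwitch([1, 2, 3, 4])
sw.receive("mac-a", "mac-b", 1)                 # learns mac-a on port 1
# A multicast MAC never appears as a source, so it is never learned
# and the frame is flooded out of every other port:
print(sw.receive("mac-b", "01:00:5e:00:01:01", 2))   # {1, 3, 4}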

CGMP to the rescue

To deal with this problem of a layer 2 switch treating a multicast as a broadcast, Cisco has developed the Cisco Group Multicast Protocol (CGMP). Prior to CGMP, one could program specific destination multicast addresses to be sent out only specific switch interfaces. This, however, becomes an administrative inconvenience, particularly with a large and continually changing network. Also, such a manual process negates the IGMP join process that is meant to automate the delivery of multicast traffic.

An alternate solution that was employed to manage the level of broadcasts and multicasts was setting a maximum percentage of broadcast traffic on a segment; all broadcast/multicast traffic above that percentage was dropped. This brute force method does allow you to ensure that a link does not become over-burdened with broadcast traffic, but one assumes the broadcasts are there for a reason. If a multicast application is used to send stock price data, you may not want the switch to drop multicast packets during times of heavy market trading. CGMP resolves many of these problems while keeping legitimate broadcast/multicast traffic flowing on the network.

The goal of CGMP is to enable a switch to make use of a layer 3 router's intelligence and thus have the switch treat multicasts in a more efficient way. What CGMP does is make use of the fact that to receive a multicast stream, an end station must issue an IGMP join message (this is true for either dense mode or sparse mode PIM). The segment that has the workstation that issued the join message is then added to the multicast distribution tree. When an end station makes a request to join a multicast group, CGMP notes its source MAC address. A CGMP join message is then sent from the router to the switch (as illustrated in figure 2-27), which adds an entry to the switch's switching table identifying the end station requiring the multicast stream. With this entry in place, any incoming traffic destined for the multicast address in question is sent to the segment where the join message originated. By this mechanism, packets belonging to the multicast stream are only sent to the switch interfaces that require them.
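
Continuing the toy switch model, the effect of a CGMP join is simply an extra entry in the table. A real switch can associate several interfaces with one multicast entry; a single port keeps this sketch minimal:

ports = {1, 2, 3, 4}
mac_table = {"mac-a": 1, "mac-b": 2}      # learned from source addresses

def forward_ports(dst_mac, in_port):
    if dst_mac in mac_table:
        return {mac_table[dst_mac]}       # entry found: no flooding
    return ports - {in_port}              # unknown destination: flood

group_mac = "01:00:5e:00:01:01"
print(forward_ports(group_mac, 2))        # {1, 3, 4}: flooded as before

# The CGMP join installs the entry for the requesting end station's segment:
mac_table[group_mac] = 3                  # host behind port 3 sent the IGMP join
print(forward_ports(group_mac, 2))        # {3}: the stream is now confined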

As CGMP works its magic by making entries in the switch's switching table, no additional processing overhead is experienced by the switch, and it can continue to operate without any performance implications. Put simply, once the extra entry is in the switching table, all multicast packets are switched in the same way as other packets. This is in stark contrast to some competitive protocols that require the switch to examine the layer 3 data of each incoming packet to identify the packet as multicast and determine which interface it should be forwarded out of.

A final word about CGMP: as it facilitates communication between a router and a switch, it can enable RSVP priorities to be communicated between these devices. For example, assuming a router is feeding a multimedia stream to the switch, the router can use CGMP to tell the switch which interface the stream is headed for. Upon receipt of this information, the switch can choose to increase the interface priority to ensure that the multimedia stream is not disturbed by other activity within the switch.

Summary

This chapter introduced the main concepts relating to switched VLANs that are necessary to design, build and troubleshoot these types of networks. We reviewed VLAN tags within a switch, noting that these tags that associate a packet with a particular VLAN are never sent out on segments where end stations are located. They only exist within switch backplanes or connections used as VLAN trunks. We discussed full duplex ethernet as a point to point technology useful for server connections. We covered the basics of token ring switching, which allows us to access the inherent prioritization mechanisms of token ring (something missing in ethernet) that make token ring attractive in a switched environment.

We also covered the basics of source route bridging, source route transparent bridging and source route switching, along with the concentrator relay function and bridge relay function inherent in switched token ring networks. We then looked at the new dedicated token ring, a point to point token ring technology that requires neither a token nor a ring.

We then covered trunking, the transport of VLAN data across multiple switches with ISL encapsulation on fast ethernet links. The VLAN Trunk Protocol was discussed as a means of advertising VLAN information throughout a switched network to reduce manual configuration, along with the operation of the VLAN Membership Policy Server, which enables a device to be connected to a network and assume membership of the correct VLAN.

We discussed the operation of both encapsulating and translational bridging across FDDI networks, noting that the simplest solution was to route across these backbones, but conceding that if VLANs were to traverse the FDDI backbone, bridging was necessary. We covered how fddicheck can deal with void frames that are generated in error, and how that relies on a facility called APaRT that sets up a memory resident table of destinations and associated frame types.

802.10 VLAN tagging was discussed as an alternative to ISL that can be used on FDDI trunks. Optimization of the Spanning Tree Protocol was discussed, including how to set the root bridge for a VLAN and how to assign trunk priorities on a per VLAN basis to provide load balancing. The mechanics of uplink fast were discussed in an environment that has redundant links between the root bridge and distribution bridges.

Fast EtherChannel was discussed as a means of bundling multiple fast ethernet links together to provide increased throughput for point to point links and as a path to gigabit ethernet. Gigabit ethernet was described as a marrying of the physical layer standards of FiberChannel to the datalink layer standards of 802.3. This allows the speed of FiberChannel to be used without the need for translation within an 802.3 environment.

Finally we looked at multicasts, PIM and CGMP (along with CGMP’s interaction with IGMP) as a way of efficiently implementing multicast applications within a switched environment.
