CEF and load sharing
Load sharing is one of those awkward areas that is full of confusing details. In this post we will cover its ABCs, and later on we will cover more of it in detail. We named the post "CEF and load sharing" because of the central role CEF plays in load sharing.
In the IP routing context, the forwarding/switching mechanism the router uses is what actually controls load sharing (a data/forwarding plane operation). Having multiple routes in the routing table says nothing about how load sharing will actually be done: you may be left with poor load sharing, or none at all, even though the routing table holds multiple routes to a given destination.
The routing protocols are responsible for placing multiple paths in the routing table in the first place (a control plane operation). By default all the IGPs can install 4 equal-cost paths, while BGP defaults to only 1 (BGP behaves completely differently from the IGPs; we will cover load sharing with BGP in detail in a later post). To control the maximum number of paths allowed per routing protocol we can use the maximum-paths command. The maximum was 4 in IOS releases earlier than 11.0, 8 with IOS Release 12.0S based software, 16 with IOS Release 12.3T based software, and 32 with IOS Release 12.2S based software.
NOTE This post is not meant to explain CEF operation; we'll focus only on CEF load sharing. We might write a dedicated "CEF inside out" post later.
The most popular forwarding/switching mechanisms on Cisco routers are process switching (per-packet load sharing), fast switching (per-destination load sharing) and CEF. CEF can do both per-packet and per-destination load sharing (the latter is completely different from fast switching's per-destination load sharing), plus a newer flavor, per-port load sharing.
NOTE According to Cisco, IPv4 fast switching is removed with the implementation of the Cisco Express Forwarding infrastructure enhancements for Cisco IOS 12.2(25)S-based releases and Cisco IOS Release 12.4(20)T. For these and later Cisco IOS releases, switching paths are either Cisco Express Forwarding switched or process switched. This makes the switching decision easier for future development of software features. Starting with the implementation of the Cisco Express Forwarding enhancements and the removal of IPv4 fast switching, components that do not support Cisco Express Forwarding will work only in process switched mode.
Load-sharing with CEF
For each destination with multiple equal-cost paths (or unequal-cost paths, in the case of EIGRP using variance, BGP using the BGP Link Bandwidth feature, or MPLS-TE), the router creates 16 hash buckets, each pointing to one of the available paths.
Load sharing is controlled by the ratio of the number of buckets pointing to each path (outgoing interface). With equal-cost paths the buckets are distributed evenly: two equal-cost paths get 8 buckets each, three get 5 each (yes, one bucket is left out), four get 4 each, and so on. With unequal-cost paths, each path is associated with a different number of buckets, according to the load sharing ratio.
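The bucket arithmetic above can be sketched in a few lines of Python. This is only a simplified model: the function names are hypothetical, the rounding rule for unequal weights is an assumption (the post only gives the equal-cost numbers), and the order in which buckets are laid out is made up for illustration.

```python
NUM_BUCKETS = 16  # hash buckets per destination, as described above


def bucket_counts(weights):
    """Return how many of the 16 buckets each path receives.

    weights: one relative weight per path (all equal for equal-cost paths).
    Shares are rounded down, so leftover buckets stay unused. The rounding
    rule for unequal weights is an assumption made for this sketch.
    """
    total = sum(weights)
    return [NUM_BUCKETS * w // total for w in weights]


def build_buckets(weights):
    """Return a list mapping each used bucket to a path index.

    The contiguous layout (all of path 0 first) is hypothetical; only the
    per-path counts matter for the load sharing ratio.
    """
    table = []
    for path, count in enumerate(bucket_counts(weights)):
        table.extend([path] * count)
    return table


if __name__ == "__main__":
    for weights in ([1, 1], [1, 1, 1], [1, 1, 1, 1], [3, 1]):
        print(weights, bucket_counts(weights), len(build_buckets(weights)))
```

With three equal weights the sketch hands out 5 buckets per path and uses only 15 of the 16, which matches the omitted bucket mentioned above.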
CEF has three load-sharing options:
- per-destination (per-session)
This is the default CEF load sharing option. I prefer to call it per-session, as the show ip cef x.x.x.x internal command output does, since it is actually based on both the source and the destination IP addresses in the IP packet rather than on the destination alone. Both addresses are hashed into a 4-bit value that is used to select the outgoing interface.
Per-destination load sharing therefore distributes traffic statistically, and it becomes more effective as the number of source/destination pairs increases. With few pairs, one link can end up overloaded while the others are underutilized, for example when a relatively heavy session between one source/destination pair is hashed onto that link.
The hash calculation depends on the algorithm used. The original algorithm uses only the source and destination IP addresses to compute a 4-bit hash value, giving 16 possible values, and so selects one of the 16 buckets, each of which points to one of the outgoing paths. Because every router in the network runs the same algorithm on the same inputs, every router reaches the same result, which introduced a load sharing problem called CEF Load-Sharing Polarization (you can see a good example of this in the Cisco Press book "Cisco Express Forwarding").
To get around this behavior, the universal algorithm (the default in current IOS versions) adds a 32-bit router-specific value to the hash function. This value is called the Fixed ID; it can be set manually, and otherwise a router uses its highest loopback IP address as this value when booting. Seeding the hash function with a unique ID on each router ensures that the same source/destination pair hashes into a different 4-bit value on different routers along the path, which gives better network-wide load sharing and avoids the polarization issue.
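To make the polarization effect concrete, here is a small simulation sketch. It rests on several assumptions: SHA-256 stands in for the real CEF hash (which is not described in this post), the addresses and router IDs are made up, and the function names are hypothetical. Two routers in a row each have two equal-cost paths, and we look at what the second router does with the flows that the first one sent down path 0.

```python
import hashlib
import ipaddress


def bucket(src, dst, router_id=0):
    """Hash a source/destination pair into a 4-bit bucket number (0..15).

    SHA-256 is a stand-in, not the hash CEF uses. router_id models the
    32-bit per-router value; 0 models the original, unseeded algorithm.
    """
    data = (ipaddress.ip_address(src).packed
            + ipaddress.ip_address(dst).packed
            + router_id.to_bytes(4, "big"))
    return hashlib.sha256(data).digest()[0] & 0xF


def pick_path(src, dst, router_id=0, num_paths=2):
    """Choose a path; with two paths each one owns 8 of the 16 buckets."""
    return bucket(src, dst, router_id) % num_paths


def paths_used_downstream(flows, id_a, id_b):
    """Send flows through router A, then pass the ones A put on path 0
    to router B. Return the set of paths B ends up using for them."""
    via_path0 = [f for f in flows if pick_path(*f, router_id=id_a) == 0]
    return {pick_path(*f, router_id=id_b) for f in via_path0}


# 200 hypothetical sources talking to one destination.
FLOWS = [(f"10.0.0.{i}", "192.0.2.1") for i in range(1, 201)]

if __name__ == "__main__":
    # Same hash on both routers: B repeats A's decision (polarization).
    print("same hash:", paths_used_downstream(FLOWS, 0, 0))
    # Distinct per-router IDs: B spreads the flows again.
    print("seeded   :", paths_used_downstream(FLOWS, 0x0A000001, 0x0A000002))
```

In the unseeded case router B can only ever pick path 0 for these flows, because it computes exactly the value router A already computed. A cryptographic hash is used here only because it mixes the router ID thoroughly; a real forwarding plane would use something far cheaper.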
NOTE There is a third algorithm called the tunnel algorithm. I couldn't find a description of its internals, but Cisco states that it is meant to improve load sharing when tunneling techniques such as MPLS, GRE and L2TP are in use. With tunneling, the traffic pattern collapses into a small number of sessions (between the tunnel head and tail ends), which introduces another form of traffic polarization. This algorithm also uses a unique per-router ID to work around the issue. If I find more details about it I'll let you know.
- per-packet
Packets are handled in a round-robin fashion, which ensures that traffic is balanced over the multiple links. However, per-packet load sharing is generally not recommended, because it commonly results in out-of-order packets. This hurts TCP throughput (since TCP has to put the packets back in order) and can show up as data loss for UDP applications (since UDP does nothing to restore the order). To make things scarier, firewalls might interpret out-of-order packets as an attack.
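The reordering problem is easy to reproduce on paper. The sketch below is a toy model, not a description of how a router schedules packets: the function name, the one-way delays and the 1 ms packet spacing are all assumptions chosen for illustration. Packets are assigned to paths in round-robin order and arrive after the delay of the path they took.

```python
from itertools import cycle


def arrival_order(num_packets, path_delays_ms, gap_ms=1):
    """Return packet sequence numbers in the order the receiver sees them.

    Packet n is sent at n * gap_ms on the next path in round-robin order
    and arrives path_delays_ms[path] milliseconds later.
    """
    paths = cycle(range(len(path_delays_ms)))
    arrivals = []
    for seq in range(num_packets):
        path = next(paths)
        arrivals.append((seq * gap_ms + path_delays_ms[path], seq))
    # Sort by arrival time (ties fall back to the sequence number).
    return [seq for _, seq in sorted(arrivals)]


if __name__ == "__main__":
    print(arrival_order(6, [5, 5]))   # equal delays: order is preserved
    print(arrival_order(6, [10, 4]))  # unequal delays: packets overtake each other
```

With equal delays the packets arrive as 0, 1, 2, 3, 4, 5. With a 10 ms and a 4 ms path they arrive as 1, 3, 5, 0, 2, 4, which is exactly the pattern that forces TCP to reorder.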
The default CEF load sharing mode is per-destination, and we can change this using the ip load-sharing per-packet interface command on the outgoing interfaces involved.
NOTE Load sharing decisions are made on the outbound interfaces, so the choice between per-packet and per-destination load sharing must be configured on the outbound interfaces.
- per-port (per-flow)
This option (introduced with IOS 12.4(11)T) is the best fit for networks with a low number of sources/destinations where most of the traffic flows between hosts that use different port numbers, as is common with Real-time Transport Protocol (RTP) streams. It simply adds the layer 4 source port, destination port, or both to the CEF hashing function. It is enabled with the ip cef load-sharing algorithm include-ports command in global configuration mode.
The most common scenario where this option is the only effective solution is a subnet of hosts NATed to a single IP address, with a router that has multiple paths sitting between them and their traffic destination. If all the hosts are talking to a single destination, the per-destination option is useless, since the router always sees a single source/destination pair. Including the layer 4 ports in the hashing function lets the router tell the sessions apart and spread them over the available paths.
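The NAT scenario can be sketched as follows. As before, this is an illustration under assumptions: SHA-256 stands in for the real CEF hash, the addresses and port numbers are made up, and pick_path and its include_ports flag are hypothetical names that only loosely mirror the include-ports option.

```python
import hashlib
import ipaddress


def pick_path(src, dst, sport=0, dport=0, include_ports=False, num_paths=2):
    """Pick an outgoing path from a 4-bit hash of the packet header fields.

    SHA-256 is a stand-in for the real hash. Ports are mixed in only when
    include_ports is True.
    """
    data = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    if include_ports:
        data += sport.to_bytes(2, "big") + dport.to_bytes(2, "big")
    bucket = hashlib.sha256(data).digest()[0] & 0xF  # 16 buckets
    return bucket % num_paths


# 100 sessions from hosts hidden behind one NAT address to one server.
NAT_IP, SERVER = "203.0.113.10", "198.51.100.20"
SESSIONS = [(NAT_IP, SERVER, 20000 + i, 443) for i in range(100)]


def paths_used(include_ports):
    """Return the set of paths the sessions are spread over."""
    return {pick_path(*s, include_ports=include_ports) for s in SESSIONS}


if __name__ == "__main__":
    print("addresses only :", paths_used(False))
    print("addresses+ports:", paths_used(True))
```

Hashing on addresses alone puts every session on the same path, because the input to the hash never changes. Once the ports are included, each session hashes independently and both paths are used.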
I hope that I've been informative.
BR,
Mohammed Mahmoud.