BGP performance tuning - Convergence, Stability, Scalability and NSF (Part 3)

Let's continue our BGP performance tuning discussion. Sorry for the long delay, but I was tied up with other work.

During the last couple of days I attended Cisco Expo 2009. During the SP - IP Core Technical Breakout, the speaker highlighted Cisco's high availability features, and while focusing on BGP he introduced a feature that was new to me and that I found attractive: BGP PIC (Prefix-Independent Convergence) Core/Edge. It is currently only available with Cisco IOS-XR for the Cisco CRS-1, Cisco XR 12000 and the new Cisco ASR 9000 series routers. I should be covering this feature later.

Anyway, let's continue our former discussion. In our previous post we discussed most of the timers that control BGP operation. In this post we'll discuss one more timer, and then we'll start discussing some of the tools used for BGP performance tuning, as we described in our first post.

ConnectRetry timer

This timer sets how long the BGP process waits before checking whether the passive TCP session with a peer has been established. If it has not, the BGP process starts a new active TCP attempt to connect to the remote BGP speaker. In other words, it is the time to wait before attempting to reconnect to a BGP neighbor after a failed connection attempt. While the ConnectRetry timer is running, the remote BGP peer can still establish a BGP session to the local router.

The default value for Cisco IOS is 120 seconds, and presently the Cisco IOS ConnectRetry timer cannot be changed from its default value. Juniper routers also use a default value of 120 seconds, but there it is tunable in the range of 1-65535.

This timer is irrelevant to BGP performance tuning on Cisco IOS, since we can't change its default value. However, a feature like fast fall-over brings the BGP session up immediately once the route to the neighbor is present in the routing table again, without waiting for the ConnectRetry timer to expire.

As per RFC 4271, the exact value of the ConnectRetryTimer is a local matter, but it SHOULD be sufficiently large to allow TCP initialization. The reason lies in how the timer events are handled. In response to the ConnectRetry timer expired event, the local system drops the TCP connection attempt in progress, restarts the ConnectRetry timer, initiates a new transport connection to its BGP peer, and continues to listen for a connection that may be initiated by the remote BGP peer. On the other hand, if the transport connection succeeds, the local system clears the ConnectRetry timer, completes initialization, sends an OPEN message to its peer and moves forward through session establishment. Accordingly, if the timer is too small it can expire before the TCP handshake completes and interrupt the session initialization.

Fast Fallover

With BGP there are two flavors of event-driven fast fallover. The first applies when eBGP peering is done via the physical interface IP address (directly connected peering): by default, if the interface goes down the peering is reset immediately. This can be disabled with the no bgp fast-external-fallover command, so that only the timers control the BGP peering status and not the interface status. To make the timers effective for eBGP sessions it's better to disable this feature, mainly if the interface is suffering from flaps; otherwise the session will keep resetting regardless of the timer configuration, and you won't want your eBGP peering flapping along with the interface. Remember that when a BGP session is reset like this, all the BGP routes learned from this neighbor and advertised to further neighbors are withdrawn from those peers, which severely affects network stability.

NOTE We can control the BGP fast external fallover feature on a per-interface basis via the ip bgp fast-external-fallover {permit | deny} command, and logically the per-interface setting overrides the global BGP configuration.
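
As a minimal sketch of the two knobs just described, the fragment below disables fast external fallover globally and then re-enables it on a single interface. The AS number 65001 and the interface name GigabitEthernet0/1 are placeholders I picked for illustration; adapt them to your own setup and check the command availability on your IOS release.

```
! Global: do not reset directly connected eBGP sessions on link-down;
! the keepalive/hold timers alone decide when the session is torn down
router bgp 65001
 no bgp fast-external-fallover
!
! Per interface: permit re-enables fast external fallover on this link,
! overriding the global setting (deny would disable it instead)
interface GigabitEthernet0/1
 ip bgp fast-external-fallover permit
```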

iBGP and multihop eBGP sessions, on the other hand, are not affected by this functionality. They still rely on the BGP hold timer to detect neighbor loss, and this is where BGP support for fast peering session deactivation comes in.

We can simply use the neighbor x.x.x.x fall-over [route-map <>] command to enable BGP fast peering session deactivation (introduced with IOS releases 12.0S and 12.3T as BGP Support for Fast Peering Deactivation). BGP fast peering session deactivation improves BGP convergence and response time to adjacency changes with BGP neighbors. This feature is event driven and configured on a per-neighbor basis. When it is enabled, BGP monitors the peering session with the specified neighbor; adjacency changes are detected, and terminated peering sessions are deactivated, in between BGP scanner runs rather than at the next default or configured scanning interval. The idea is quite simple: as soon as the route to the BGP peer's IP address disappears from the IP routing table (peer route lost), the BGP session with the peer is deactivated, resulting in immediate convergence.

Actually, BGP Support for Fast Peering Deactivation does two things: it brings the BGP session down once the route to the neighbor IP address is removed from the routing table, and it also brings the BGP session up immediately once the route is present in the routing table again (it won't depend on the 120-second ConnectRetry timer as usual).
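
A minimal sketch of enabling this for an iBGP neighbor peered over loopbacks could look like the following. The AS number 65001, the neighbor address 192.0.2.2 and the use of Loopback0 as the update source are assumptions made purely for illustration.

```
router bgp 65001
 ! iBGP peering between loopbacks (placeholder addresses)
 neighbor 192.0.2.2 remote-as 65001
 neighbor 192.0.2.2 update-source Loopback0
 ! Track the route to 192.0.2.2 and deactivate the session
 ! as soon as that route leaves the routing table
 neighbor 192.0.2.2 fall-over
```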

NOTE A host route must be available for each peering session that is configured to use BGP fast session deactivation. If a route is aggregated or is an unreachable non-host route (through a loopback interface) but still available to the peer, this feature will not be able to track the route and will be unable to close the session.

We can use a route-map with the fall-over command for selective BGP peer address tracking, but take care: this feature suffered from a couple of problems. First, it was not saved in the startup configuration, and second, it was not inherited from a peer session template. Both issues were fixed in the 12.4(15)T3 code. Ivan has highlighted this issue in his post BGP Peer Selective Address Tracking is broken until 12.4(15)T3.
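
To illustrate selective tracking, the sketch below only accepts a /32 host route as proof that the peer is reachable, so a covering aggregate or a default route would not keep the session up. The names HOST-ROUTES and TRACK-PEER, the AS number 65001 and the neighbor address 192.0.2.2 are hypothetical; treat this as a starting point to verify on your own release rather than a definitive configuration.

```
! Match host routes only (prefix length of exactly 32)
ip prefix-list HOST-ROUTES seq 5 permit 0.0.0.0/0 ge 32
!
route-map TRACK-PEER permit 10
 match ip address prefix-list HOST-ROUTES
!
router bgp 65001
 ! Deactivate the session when no route permitted by TRACK-PEER
 ! covers the neighbor address any more
 neighbor 192.0.2.2 fall-over route-map TRACK-PEER
```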

Bidirectional Forwarding Detection (BFD)

BFD is a low-overhead, fast failure detection mechanism that provides fast, uniform reconvergence times over any media and at any protocol layer. There was a need for such a global polling method (consider it an enhanced form of fast hellos at Layer 2.5) to replace the disparate mechanisms used by the various protocols. These range from the routing protocols (which do no better than about a second using fast hellos) to the interface level, from SONET/SDH with its 50ms margin down to Ethernet, where two routers connected through an intermediate switch cannot detect a failure at the far end.

For further optimization, since BFD is independent of any interface or protocol, Cisco was encouraged to consider implementing BFD in hardware; however, I am not aware whether this has been done yet.

BFD was first described in the IETF internet draft Bidirectional Forwarding Detection (draft-katz-ward-bfd-00) by Dave Katz (Juniper Networks) and Dave Ward (Cisco Systems) in 2003. The BFD Working Group was then chartered to develop the protocol and its extensions, and it is currently described in the IETF internet draft Bidirectional Forwarding Detection (draft-ietf-bfd-base-08).

NOTE We might cover BFD in more detail in a future post. In the meantime you can check the Cisco documentation, and also enjoy Jeff Doyle's post Reducing Link Failure Detection Time with BFD and Ivan Pepelnjak's article Improve the Convergence of Mission-Critical Networks with Bidirectional Forwarding Detection (BFD).

Network administrators can use BFD to detect forwarding path failures at a uniform rate, rather than at the variable rates of the different routing protocol hello mechanisms. Network profiling and planning become easier, and reconvergence time becomes consistent and predictable, while protocols that lack a fast detection mechanism of their own gain one.

BFD support for BGP was introduced in Cisco IOS Releases 12.0(31)S, 12.4(4)T, 12.0(32)S, and 12.2(33)SXH, and later releases.

Cisco initially announced that its BFD implementation would be released in phases, with additional functionality and platform support added in each phase. Initially BFD was only supported on high-speed Ethernet interfaces on Cisco 7600 and Cisco 12000 series routers. In terms of functionality, BFD currently supports BGP, EIGRP, IS-IS, OSPF, HSRP and MPLS TE BFD-triggered Fast Reroute (FRR).

NOTE Care should be taken when using BFD on low-speed interfaces, since a 50ms interval might be unachievable. A simple ping can help estimate an appropriate timer margin.

In order to configure BFD for BGP, we simply enable BFD under the interface(s) using the bfd interval <50-999> min_rx <50-999> multiplier <3-50> command, and then under the BGP configuration mode use the neighbor x.x.x.x fall-over bfd command.
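
Putting the two commands together, a minimal sketch for a directly connected eBGP neighbor might look like this. The interface name, the 198.51.100.0/30 addressing and the AS numbers 65001/65002 are placeholders chosen for illustration. With an interval of 50ms and a multiplier of 3, and assuming the peer negotiates the same interval, a failure would be declared after roughly 3 x 50ms = 150ms without a BFD packet.

```
interface GigabitEthernet0/1
 ip address 198.51.100.1 255.255.255.252
 ! Send and expect BFD control packets every 50ms;
 ! declare the peer down after 3 consecutive misses
 bfd interval 50 min_rx 50 multiplier 3
!
router bgp 65001
 neighbor 198.51.100.2 remote-as 65002
 ! Register BGP as a BFD client for this neighbor
 neighbor 198.51.100.2 fall-over bfd
```

Once both ends are configured, the show bfd neighbors details command should list the session together with BGP as a registered client.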

NOTE BFD for BGP works only for directly connected neighbors. BFD neighbors must be no more than one IP hop away. Multihop configurations are not supported.

A major thing to note is that for a comprehensive design, BFD should always be used in conjunction with Cisco IP Event Dampening, in order to mitigate the effect of heavily flapping interfaces.
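
As a small illustration, IP Event Dampening is enabled per interface with the dampening command. The sketch below relies on the default dampening parameters, and the interface name is a placeholder; whether the defaults suit a given link is something to validate against its actual flap behaviour.

```
interface GigabitEthernet0/1
 ! Enable IP Event Dampening with default parameters so that a
 ! flapping link is suppressed instead of repeatedly triggering
 ! BFD and BGP state changes
 dampening
```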

Finally, care should be taken when using BFD in conjunction with NSF, since the two features work on different principles: NSF tries to keep forwarding and sessions intact across a switchover, while BFD tries to declare a failure as fast as possible. Enabling them together is therefore a sensitive step, since even very low packet loss during a switchover might result in BFD detecting a failure and thus defeating NSF.

Well, I hope this has been informative. I believe this is enough for now, and we shall continue with BGP route dampening, Graceful Restart and Peer Session/Policy Templates in the next post.

BR,
Mohammed Mahmoud.
