The endless story of OSPF vs IS-IS - Part 3 "Packets and Database"
In this post we are going to cover the protocol packets and database structure for both routing protocols.
To start, let's highlight a couple of facts. OSPF runs on top of IP: it exchanges its messages in IP packets (protocol number 89), which exposes it to spoofing and DoS attacks, so the use of authentication is strongly recommended. IS-IS, on the other hand, runs directly over layer 2: it builds its own packet, or more precisely its own PDU (Protocol Data Unit), and encapsulates it directly inside the layer 2 frame. This gives IS-IS a point of strength that we'll cover later, and it makes spoofing and DoS attacks against it more difficult.
Let's first list the packets used by each protocol:
OSPF has five types of packets:
- Hello Packet
- Database Descriptor
- Link-state Request
- Link-state Update
- Link-state Acknowledgment
NOTE Link-State Advertisements (LSAs, 11 types) are carried inside LSU packets; they are not packets by themselves, as we'll cover later.
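As a rough illustration of the five packet types, here is a minimal Python sketch that decodes the fixed 24-byte OSPFv2 common header and reads the type field. The type codes 1 to 5 and the header layout follow RFC 2328; the function name, the returned dictionary, and the sample router ID are assumptions made for this example, and the checksum is not verified.

```python
import struct
from enum import IntEnum


class OspfPacketType(IntEnum):
    """Type codes carried in the OSPFv2 common header."""
    HELLO = 1
    DATABASE_DESCRIPTION = 2
    LINK_STATE_REQUEST = 3
    LINK_STATE_UPDATE = 4
    LINK_STATE_ACK = 5


# version, type, packet length, router ID, area ID, checksum, AuType, authentication
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")


def parse_ospf_header(data: bytes) -> dict:
    """Decode the fixed 24-byte header; the checksum is not verified here."""
    if len(data) < OSPF_HEADER.size:
        raise ValueError("truncated OSPF header")
    version, ptype, length, rid, area, _cksum, autype, _auth = OSPF_HEADER.unpack_from(data)
    return {
        "version": version,
        "type": OspfPacketType(ptype),
        "length": length,
        "router_id": ".".join(str(b) for b in rid),
        "area_id": ".".join(str(b) for b in area),
        "autype": autype,
    }


# Hand-built sample header: hypothetical router 1.1.1.1 in area 0, no authentication.
sample = OSPF_HEADER.pack(2, 4, 24, bytes([1, 1, 1, 1]), bytes(4), 0, 0, bytes(8))
header = parse_ospf_header(sample)
```

The sample header is typed as a Link-state Update, which is the packet that would carry the LSAs mentioned in the note above.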
IS-IS has three major types of PDUs each with its sub-types:
1. Hello PDUs:
- Level 1 LAN Hello
- Level 2 LAN Hello
- Point-to-Point Hello
2. Link-state PDUs (LSP):
- Level 1 Link-state PDU
- Level 2 Link-state PDU
3. Sequence Number PDUs (SNP):
- Level 1 Complete SNP
- Level 2 Complete SNP
- Level 1 Partial SNP
- Level 2 Partial SNP
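The same grouping can be written down as a lookup table. The sketch below maps the PDU type number found in the IS-IS common header to the family and sub-type listed above. The type numbers and the 0x83 protocol discriminator come from ISO 10589; the function name and the sample bytes are assumptions for illustration only.

```python
from typing import Dict, Tuple

# PDU type number -> (family, sub-type)
ISIS_PDU_TYPES: Dict[int, Tuple[str, str]] = {
    15: ("Hello", "Level 1 LAN Hello"),
    16: ("Hello", "Level 2 LAN Hello"),
    17: ("Hello", "Point-to-Point Hello"),
    18: ("LSP", "Level 1 Link-state PDU"),
    20: ("LSP", "Level 2 Link-state PDU"),
    24: ("SNP", "Level 1 Complete SNP"),
    25: ("SNP", "Level 2 Complete SNP"),
    26: ("SNP", "Level 1 Partial SNP"),
    27: ("SNP", "Level 2 Partial SNP"),
}


def classify_isis_pdu(data: bytes) -> Tuple[str, str]:
    """Classify a PDU from its 8-byte common header."""
    if len(data) < 8:
        raise ValueError("truncated IS-IS common header")
    if data[0] != 0x83:
        raise ValueError("not an IS-IS PDU")
    pdu_type = data[4] & 0x1F  # the upper three bits of this byte are reserved
    if pdu_type not in ISIS_PDU_TYPES:
        raise ValueError(f"unknown PDU type {pdu_type}")
    return ISIS_PDU_TYPES[pdu_type]


# Hypothetical common header of a Level 2 LSP.
sample_header = bytes([0x83, 27, 1, 0, 20, 1, 0, 0])
family, subtype = classify_isis_pdu(sample_header)
```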
Next, let's cover how each protocol discovers its neighbors and keeps the neighborship alive:
Both protocols use periodic multicast Hello packets to establish 2-way communication and then maintain the neighborship, with tunable timers that help optimize protocol convergence according to the network characteristics. Both protocols also use a second timer, which OSPF calls the router dead-interval and IS-IS calls the hold-time: the time a router waits to hear a hello from its neighbor before considering it down. OSPF requires the hello and dead-interval timers to match on both sides (both are carried in the Hello packet). In IS-IS, each router simply honors the hold-time that its neighbor advertises in its hello PDUs; the hello interval itself is not advertised, and by default it is the hold-time divided by 3.
NOTE If you manually change the OSPF or the IS-IS timers you are changing the timers of the local router interface.
NOTE OSPF uses different default timers on nonbroadcast multiaccess (NBMA) networks. The dead-interval defaults to four times the hello interval unless explicitly configured: NBMA networks use 30/120 seconds hello/dead-interval, while all other network types use 10/40 seconds. Be aware that Cisco also uses the 30/120 timers for point-to-multipoint.
NOTE Cisco's and Juniper's IS-IS implementations use different default hello/hold-time timers: Cisco uses 10/30 seconds for ordinary routers and 3/10 for the DIS, while Juniper uses 9/27 seconds for ordinary routers and 3/9 for the DIS.
In addition, both protocols use the concept of a Designated Router/IS (DR/DIS) on multiaccess networks to reduce link-state information flooding. Each protocol uses its own logic for this, which we'll cover in detail in a later post.
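The difference in timer handling can be sketched in a few lines of Python. This is only an illustration of the rule described above, not any vendor's implementation: the class and function names are made up for the example, and the timer values are the defaults quoted in the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfTimers:
    hello_interval: int
    dead_interval: int


def ospf_timers_compatible(local: OspfTimers, received: OspfTimers) -> bool:
    """OSPF: a hello is accepted only if both timers match the local values."""
    return local == received


def isis_adjacency_expiry(last_hello_at: float, neighbor_hold_time: float) -> float:
    """IS-IS: the adjacency expires after the hold-time the NEIGHBOR advertised,
    whatever the local hold-time is."""
    return last_hello_at + neighbor_hold_time


# A 10/40 router facing a 30/120 router: the hellos are rejected.
ospf_ok = ospf_timers_compatible(OspfTimers(10, 40), OspfTimers(30, 120))

# A router with a local hold-time of 30 s hears a neighbor advertising 27 s at t=100.
expiry = isis_adjacency_expiry(last_hello_at=100.0, neighbor_hold_time=27.0)
```

In the IS-IS case the local hold-time never enters the calculation, which is why mismatched timers do not prevent an adjacency.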
Now let's discuss how each protocol forms neighborship and conducts database synchronization:
OSPF uses a complex Neighbor State Machine process to form adjacencies and synchronize databases between neighbors:
1. Initially both routers send Hello packets to each other. When each router sees its own RID in the neighbor field of the other's hello packet, bidirectional communication has been established and the peers are now OSPF neighbors (but not fully adjacent until database synchronization is done). The routers enter the 2-Way state.
NOTE When an OSPF router receives a hello packet, it first verifies that the Area ID, Authentication, Network Mask (on broadcast networks), Hello Interval, Router Dead Interval, and Options fields match its own locally configured values. Otherwise the hello packet is discarded without further processing, and the neighbor remains in the Down state because its hello is never accepted.
NOTE On multiaccess networks, routers become fully adjacent only with the DR and BDR and stay in the 2-Way state with all other routers; this will be covered in detail later.
2. Next the routers enter the ExStart state, in which they agree on which of them is in charge of the database synchronization process. This master/slave negotiation uses Database Descriptor packets, and the router with the highest RID always becomes the master. The master then controls the rest of the synchronization process, called database exchange.
NOTE The interface IP MTU is carried in the Database Descriptor packet (not in the hello packet), so in case of an MTU mismatch the routers stop the database synchronization process and remain stuck in the ExStart state.
3. The routers then enter the Exchange state and exchange Database Descriptor packets (DDPs), which summarize each router's LSAs by listing the LSA headers.
NOTE The LSA header contains six main fields. The Type, Link State ID, and Advertising Router fields together identify a specific LSA. The Age, Sequence Number, and Checksum fields, together with the first three, identify a specific instance of that LSA, so that when multiple instances exist in a network the most recent can be determined. Two other fields also reside in the LSA header: Options and Length.
4. After exchanging the DDPs, each router knows which LSAs it still needs. Both routers enter the Loading state and request those LSAs with Link-state Requests (LSRs); the other router responds with Link-state Updates (LSUs), and finally Link-state Acknowledgments are used to acknowledge the updates.
NOTE Implicit acknowledgment is also possible, mainly when the local router has a more recent instance of the LSA than the one it received: it sends its own instance back to the neighbor, and that update implies the acknowledgment.
5. After database synchronization finishes, the routers enter the Full state and are fully adjacent. The link between them can now be used for forwarding packets, and both neighbors add the adjacency to their local database and advertise the relationship in a link-state update packet.
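The steps above can be sketched as a small transition table. This is a deliberately simplified model: the event names are borrowed from RFC 2328, the Init state (a hello has been heard but the router does not yet see itself in it) is added from the RFC, and timeouts, error events, and the DR/BDR decision are left out. The helper names are assumptions for this example.

```python
import ipaddress
from enum import Enum
from typing import Dict, List, Tuple


class NeighborState(Enum):
    DOWN = "Down"
    INIT = "Init"
    TWO_WAY = "2-Way"
    EXSTART = "ExStart"
    EXCHANGE = "Exchange"
    LOADING = "Loading"
    FULL = "Full"


# (current state, event) -> next state
TRANSITIONS: Dict[Tuple[NeighborState, str], NeighborState] = {
    (NeighborState.DOWN, "HelloReceived"): NeighborState.INIT,
    (NeighborState.INIT, "TwoWayReceived"): NeighborState.TWO_WAY,
    (NeighborState.TWO_WAY, "AdjOK"): NeighborState.EXSTART,
    (NeighborState.EXSTART, "NegotiationDone"): NeighborState.EXCHANGE,
    (NeighborState.EXCHANGE, "ExchangeDone"): NeighborState.LOADING,
    (NeighborState.LOADING, "LoadingDone"): NeighborState.FULL,
}


def next_state(state: NeighborState, event: str) -> NeighborState:
    """Return the new state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def walk(events: List[str]) -> List[NeighborState]:
    """Replay a list of events starting from Down and return every state visited."""
    state = NeighborState.DOWN
    path = [state]
    for event in events:
        state = next_state(state, event)
        path.append(state)
    return path


def is_master(local_rid: str, neighbor_rid: str) -> bool:
    """ExStart negotiation: the router with the numerically higher RID is the master."""
    return int(ipaddress.IPv4Address(local_rid)) > int(ipaddress.IPv4Address(neighbor_rid))


full_path = walk(["HelloReceived", "TwoWayReceived", "AdjOK",
                  "NegotiationDone", "ExchangeDone", "LoadingDone"])

# With an MTU mismatch the negotiation never completes, so the neighbor stays in ExStart.
stuck_path = walk(["HelloReceived", "TwoWayReceived", "AdjOK"])
```

Router IDs are compared as 32-bit numbers rather than as strings, so 10.0.0.1 correctly beats 9.255.255.255.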
IS-IS, on the other hand, uses a simpler process:
1. Initially the routers send hello PDUs. While a router hears a neighbor's hellos but does not yet see itself listed in them, communication is only one-way and the adjacency is in the Initializing state.
2. As soon as each router sees itself in the neighbor's hello, 2-way communication is established and the adjacency goes Up.
3. With the adjacency formed, the routers start database synchronization: each router sends a Complete Sequence Number PDU (CSNP) to its peer. A CSNP contains a complete summary listing of the link-state database, including the sequence number and the remaining lifetime of each LSP.
4. Each IS-IS router determines which information it is missing or holds only an old copy of, and sends a Partial Sequence Number PDU (PSNP) to request it from the other router.
5. The other router responds to this request with a link-state PDU (LSP) containing the requested information.
6. On a point-to-point link the requesting router sends a PSNP to acknowledge that the LSP was received. On a broadcast link there is no explicit acknowledgment; reliability comes instead from the CSNPs that the DIS sends periodically.
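The comparison a router performs when it receives a CSNP can be sketched as a dictionary diff. This is a simplification under stated assumptions: each database is modeled as a mapping from LSP ID to sequence number only (a real comparison also looks at remaining lifetime and checksum), and the LSP IDs and function names are hypothetical.

```python
from typing import Dict, List


def lsps_to_request(local_db: Dict[str, int], csnp: Dict[str, int]) -> List[str]:
    """LSP IDs to list in a PSNP: missing locally, or older than the neighbor's copy."""
    # Sequence numbers start at 1, so 0 stands for "not in the database".
    return sorted(lsp_id for lsp_id, seq in csnp.items() if local_db.get(lsp_id, 0) < seq)


def lsps_to_send(local_db: Dict[str, int], csnp: Dict[str, int]) -> List[str]:
    """LSP IDs the neighbor is missing or holds an older copy of."""
    return sorted(lsp_id for lsp_id, seq in local_db.items() if csnp.get(lsp_id, 0) < seq)


local_db = {
    "0000.0000.0001.00-00": 7,
    "0000.0000.0002.00-00": 3,
}
received_csnp = {
    "0000.0000.0001.00-00": 7,   # same on both sides
    "0000.0000.0002.00-00": 5,   # neighbor has a newer copy
    "0000.0000.0003.00-00": 1,   # unknown locally
}

to_request = lsps_to_request(local_db, received_csnp)
to_send = lsps_to_send(local_db, received_csnp)
```

Here the router would request the second and third LSPs and has nothing newer to offer in return.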
As we have seen, each protocol uses its own process. The two processes look similar from a helicopter view, but they differ in the details.
Finally, let's dive a little into how routing information is exchanged by each protocol and how it is stored in the link-state database. OSPF uses Link-state Update (LSU) packets to exchange routing information between neighbors; the routing information itself is carried as Link-state Advertisements (LSAs) inside the LSU packet, and the LSAs are the database entities. OSPF has 11 different LSA types to cover all the types of routing information within the OSPF domain. IS-IS, on the other hand, uses Link-state PDUs (LSPs) to exchange routing information between neighbors; the routing information itself is carried as TLVs (Type, Length, Value) inside the LSP, and the LSPs are the database entities. IS-IS has 17 different TLV types to cover all the types of information within the IS-IS domain.
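To make the difference in database entities concrete, here is a minimal Python sketch of the two databases. It is a model for illustration, not a real implementation: the class and variable names are assumptions, areas, levels, and aging are ignored, and the addresses and IDs are made up. The OSPF database is keyed per LSA by (type, link state ID, advertising router), while the IS-IS database is keyed per LSP, with the TLVs living inside each LSP.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LsaKey = Tuple[int, str, str]  # (LS type, link state ID, advertising router)


@dataclass
class Lsa:
    ls_type: int
    link_state_id: str
    advertising_router: str
    sequence: int

    @property
    def key(self) -> LsaKey:
        return (self.ls_type, self.link_state_id, self.advertising_router)


@dataclass
class Tlv:
    tlv_type: int
    value: bytes


@dataclass
class Lsp:
    lsp_id: str
    sequence: int
    tlvs: List[Tlv] = field(default_factory=list)


ospf_lsdb: Dict[LsaKey, Lsa] = {}
isis_lsdb: Dict[str, Lsp] = {}

# One OSPF router originates two LSAs: each one is its own database entry.
for lsa in (Lsa(1, "1.1.1.1", "1.1.1.1", 1), Lsa(5, "203.0.113.0", "1.1.1.1", 1)):
    ospf_lsdb[lsa.key] = lsa

# One IS-IS router originates one LSP: the TLVs are not separate database entries.
lsp = Lsp("0000.0000.0001.00-00", 1, [Tlv(128, b""), Tlv(130, b"")])
isis_lsdb[lsp.lsp_id] = lsp
```

The same router therefore contributes two entries to the OSPF database but a single entry to the IS-IS database.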
In an IS-IS network, an entire link-state PDU (LSP) is re-advertised upon a network change, since the LSP is the database entity for IS-IS. In other words, when anything inside an LSP changes, the whole LSP needs to be flooded, which costs resources (CPU, bandwidth, and time) and therefore affects convergence. A similar change in an OSPF network means that only the specific link-state advertisement (LSA) needs to be flooded, not the router's whole set of LSAs. The difference comes from the fact that the IS-IS database consists of LSPs (rather than TLVs) and is accessed on a per-LSP basis, while the OSPF database is accessed on a per-LSA basis. As a result, with IS-IS the LSPs are the flooding entities, while with OSPF you can flood all or only some of the LSAs via LSUs.
From the above we can say that the OSPF database is more optimized for incremental changes than the IS-IS database (IS-IS, however, is more extensible than OSPF, as we'll see in a later post). The OSPF database is clearly more granular, but this granularity can backfire, mainly when two routers first synchronize their databases: the many individual entries that must be synchronized require a complicated exchange of DBD/LSR/LSU/LSAck packets back and forth between the routers. IS-IS has no comparable finite state machine to go through for database synchronization, which simplifies the synchronization process and the DIS election, and allows the DIS to be preemptive, unlike the OSPF DR, as we'll see later.
Another important point is that a single large packet imposes less overhead on a router's control plane than many small packets. Imagine a router with 100 external routes, one of which flaps. With OSPF only the LSA for that route is re-advertised, so small changes result in small packets. With IS-IS the whole LSP that carries the route is re-advertised, and with it every other route in that LSP, so any change results in the whole LSP being flooded.
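The 100-route example can be put into numbers with a toy model. The sketch below only counts how many routing entries are re-advertised after a change; it assumes, for simplicity, that all 100 routes sit in a single LSP (in practice a large LSP is split into fragments and only the affected fragment is re-flooded). The route names and function names are hypothetical.

```python
from typing import Dict, Set


def ospf_entries_flooded(lsas: Dict[str, int], changed: Set[str]) -> int:
    """OSPF floods per LSA: only the LSAs that changed are re-advertised."""
    return len(lsas.keys() & changed)


def isis_entries_flooded(lsp_entries: Dict[str, int], changed: Set[str]) -> int:
    """IS-IS floods per LSP: any change re-advertises every entry in the LSP."""
    return len(lsp_entries) if lsp_entries.keys() & changed else 0


# 100 hypothetical external routes, each mapped to a metric.
routes = {f"ext-{i}": 20 for i in range(100)}
flapped = {"ext-42"}

ospf_count = ospf_entries_flooded(routes, flapped)
isis_count = isis_entries_flooded(routes, flapped)
```

The model says nothing about packet counts during initial synchronization, where the trade-off described above goes the other way.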
NOTE One thing to always remember: with OSPF, LSUs are rebuilt at each hop and LSAs are grouped into LSUs during flooding, whereas with IS-IS, LSPs are always flooded intact and unchanged from the originator to every hop in the network.
To conclude: an IS-IS PDU is analogous to an OSPF packet (which is carried in an IP packet), an IS-IS LSP is analogous to an OSPF Link-state Update packet (both are the respective protocol's routing update packet), and an IS-IS TLV is analogous to an OSPF LSA (both represent the routing information carried in the update packet).
The final thing to cover is extensibility: IS-IS is generally considered easier to extend than OSPF. New features are added to OSPF by defining new LSAs, while new features are added to IS-IS by defining new TLVs. Adding a new TLV to an LSP, without changing the structure or the behavior of the LSP itself, is simpler than defining a new LSA. The strongest practical evidence is historical: router vendors introduced support for new features (specifically MPLS TE and IPv6) in IS-IS around one year earlier than in OSPF.
Two main things contributed to this difference in extensibility (and, ironically, both were solved in OSPFv3). The first is that OSPF drops unknown LSA types, whereas IS-IS ignores unknown TLVs but still floods them. The second is that IPv4 addressing semantics are built into the OSPFv2 LSAs, which made it impractical for OSPFv2 to support IPv6 and is why OSPFv3 was created (OSPFv3 did not repeat those mistakes). IS-IS, on the other hand, is not itself an IP protocol: it carries IP addressing information in dedicated TLVs and has no dependence on IPv4 addressing semantics, so it easily supports protocols other than IPv4. Extending it for IPv6 was a piece of cake, and it has even been reused outside IP routing, as in TRILL.
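The "ignore but still flood" behavior follows naturally from the TLV encoding: every TLV starts with a one-byte type and a one-byte length, so a parser can step over a type it does not understand. The sketch below illustrates this under stated assumptions: the table of known TLVs is a small subset chosen for the example, type 250 stands in for some TLV the router does not understand, and the function names are made up.

```python
from typing import Dict, List, Tuple

# A small subset of well-known TLV codes, enough for the example.
KNOWN_TLVS: Dict[int, str] = {
    1: "Area Addresses",
    128: "IP Internal Reachability",
    232: "IPv6 Interface Address",
    236: "IPv6 Reachability",
}


def split_tlvs(payload: bytes) -> List[Tuple[int, bytes]]:
    """Split a TLV area into (type, value) pairs using only the length bytes."""
    tlvs = []
    offset = 0
    while offset < len(payload):
        if offset + 2 > len(payload):
            raise ValueError("truncated TLV header")
        tlv_type, length = payload[offset], payload[offset + 1]
        value = payload[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        offset += 2 + length
    return tlvs


def process_lsp_payload(payload: bytes) -> Tuple[List[Tuple[str, bytes]], bytes]:
    """Return (TLVs understood locally, bytes to flood onward).

    Unknown TLVs are skipped for local processing, but the payload is
    flooded unchanged so that other routers can still use them.
    """
    understood = [(KNOWN_TLVS[t], v) for t, v in split_tlvs(payload) if t in KNOWN_TLVS]
    return understood, payload


# Area Addresses TLV (area 49.0001) followed by a TLV of hypothetical unknown type 250.
payload = bytes([1, 4, 0x03, 0x49, 0x00, 0x01]) + bytes([250, 2, 0xAA, 0xBB])
understood, to_flood = process_lsp_payload(payload)
```

A router running older software can therefore sit in the middle of the network and pass along TLVs for features it has never heard of.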
I hope this was informative.