DMVPN Deep Dive – NHRP, mGRE, Routing Scenarios and IPsec

Recently, at Cisco Live 2024, I passed the Cisco ENCOR exam to re-earn the CCNP certification that I unfortunately let expire. A day later, I mentioned in conversation that there was a small chance I could pass the CCNP ENARSI exam based on my experience and the studying I had done recently, and that I might take it that week since exams booked at the conference are half-off. I was encouraged to do so, and I did. It turns out I didn’t do badly, but not well enough to pass. One of the topics that really stuck with me that I did poorly on was DMVPN. This blog post is my attempt to remedy that problem for good.

Both the CCNP ENARSI and CCIE Enterprise Infrastructure blueprints feature sections on DMVPN.

From the ENARSI Blueprint:

2.3 Configure and verify DMVPN (single hub)

  • 2.3.a GRE/mGRE

  • 2.3.b NHRP

  • 2.3.c IPsec

  • 2.3.d Dynamic neighbor

  • 2.3.e Spoke-to-spoke

And from the CCIE EI v1.1 Blueprint:

3.3 DMVPN

  • 3.3.a Troubleshoot DMVPN Phase 3 with dual hub

    • 3.3.a (i) NHRP

    • 3.3.a (ii) IPsec/IKEv2 using preshared key

This article dives deep into DMVPN as a technology and builds step-by-step from the very beginning all the way through a production-ready DMVPN design. I hope to explain every single piece of the puzzle in detail and provide lab scenarios, Wireshark captures, show command outputs and diagrams so that no command is a mystery.

Introduction

Let’s start at the surface: DMVPN (Dynamic Multipoint Virtual Private Network) is a hub-and-spoke site-to-site VPN technology. The pros of DMVPN are pretty great: it’s easy to configure, has low overhead, scales well, allows direct spoke-to-spoke communication, and can be encrypted with IPsec.

The heart of DMVPN is the GRE (Generic Routing Encapsulation) encapsulation, one of the simplest packet formats ever, coupled with NHRP (Next Hop Resolution Protocol) which adds the “dynamic” to DMVPN. In the case of DMVPN, we actually configure the tunnel interfaces as multipoint interfaces so that we can talk to multiple routers using the same tunnel interface, reducing the configuration and increasing the scale over point-to-point tunnels.

Let’s build up to DMVPN through its iterations: starting with a point-to-point GRE tunnel, moving to a multipoint GRE tunnel, then to DMVPN phase 1 tunnels, and finally changing those into phase 2 and then phase 3 DMVPN tunnels.

  1. Lab Setup

  2. GRE Point-to-Point

  3. Multipoint GRE

  4. Next Hop Resolution Protocol

  5. DMVPN Phase 1

  6. DMVPN Phase 2

  7. DMVPN Phase 3

  8. NHRP Authentication

  9. NHRP Multicast Mapping

  10. Modernizing the NHRP Commands

  11. EIGRP Over DMVPN

  12. OSPF over DMVPN

    1. DMVPN with OSPF Point-to-Multipoint

    2. DMVPN with OSPF Broadcast

  13. Multiple Tunnels - Single Hub

  14. DMVPN Multihub - Single Cloud

  15. DMVPN Multihub - Dual Cloud

  16. MTU and Maximum Segment Size

  17. DMVPN Encryption - IPsec

  18. IPv6 over DMVPN

  19. IPv6 over DMVPN with IPv6 Underlay


Lab Setup

The lab setup is quite simple, using an unmanaged switch for transport between our hub router (Router-1) and two spoke routers (Router-2 and Router-3). This transport network is configured with the IPv4 network 10.0.1.0/24, with the last octet matching the router number. The tunnel 1 interfaces are assigned addresses in the 10.55.0.0/24 network, again with the last octet matching the router number. The Loopback 0 interface on each router is configured with every octet matching the router number (1.1.1.1 for Router-1).

The lab was built in Cisco Modeling Labs (version 2.7.0+build.4); the routers are running the IOL image, which is IOS-XE version 17.12.1.

Don’t ask me how I landed on the addressing for the tunnel. It was pretty random, but I also acknowledge that it might have subconsciously come from an INE CCIE lab.


GRE Point-to-point

The default tunnel mode on Cisco routers is GRE point-to-point. GRE is about as simple as a protocol gets.

I created a tunnel interface on both Router-1 and Router-2, added an IP address (in this case from the 10.55.0.0/24 subnet), and then configured the tunnel destination IP address as well as the source interface that the tunnel traffic will be sent from.

Router-1

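The exact output isn't reproduced here, but based on the description above, Router-1's tunnel looks roughly like this (Ethernet0/0 is the transport-facing interface):

interface Tunnel1
 ip address 10.55.0.1 255.255.255.0
 tunnel source Ethernet0/0
 tunnel destination 10.0.1.2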

Router-2

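And roughly the mirror image on Router-2, pointing back at Router-1's transport address:

interface Tunnel1
 ip address 10.55.0.2 255.255.255.0
 tunnel source Ethernet0/0
 tunnel destination 10.0.1.1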

Once this configuration is completed, I’m able to ping across this tunnel. I could introduce static routes or a dynamic routing protocol to send traffic across this tunnel.


As you can see in the Wireshark output, the GRE header has a handful of flags and then a reference to the next protocol type, IP. The outer IP header is used for routing in the underlay (the 10.0.1.0/24 network), while the inner IP packet carries the source and destination of the ping. You can download and inspect this ping within its GRE encapsulation for yourself with the PCAP attached to the original post.

The problem with this configuration is scalability. To build a tunnel from router-3 to router-1, we run into a problem.

I added the same tunnel configuration that Router-2 has to Router-3:

Router-3 (the same configuration as Router-2, with its own tunnel address of 10.55.0.3 and the tunnel destination pointed at Router-1's 10.0.1.1)

Pings across this tunnel fail.


 To investigate why, I turned on “debug tunnel” and received this output from the pings:


 I can’t just build the tunnel on router-3 using the same information because the tunnel destination is hard-coded on router-1 as router-2’s address. To work around this, I created a new tunnel on router-1 with a new subnet, and set the destination as router-3’s Ethernet0/0 address.

Router-1

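A rough sketch of that second tunnel; the 10.55.3.1 hub address is my assumption, inferred from Router-3 ending up with 10.55.3.3:

interface Tunnel2
 ip address 10.55.3.1 255.255.255.0
 tunnel source Ethernet0/0
 tunnel destination 10.0.1.3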

 I also changed the IP address on the tunnel 1 interface configured on Router-3 to 10.55.3.3. Now I am able to ping from Router-3 to Router-1’s tunnel 2 IP address.


The scalability issue here is obvious: every new router that I want to add to this network requires its own dedicated tunnel interface and associated subnet for routing. Additionally, any traffic from Router-2 to Router-3 needs to traverse the tunnel to Router-1 first. It would be preferable if spoke-to-spoke traffic could go directly to the other spoke without being routed through Router-1 first. We could solve the spoke-to-spoke traffic issue by configuring additional P2P tunnels between the spokes, but that exacerbates our configuration problem even more.


Multipoint GRE

The scalability issues of point-to-point GRE are well established, so let's start building the pieces that get us to DMVPN. First up is solving the tunnel configuration issue. In steps multipoint GRE (mGRE).

From the P2P GRE configuration, I’ve removed the tunnel 2 interface from Router-1, and converted the tunnel 1 interface on Router-3 back to the 10.55.0.0/24 subnet.

I’m going to remove the tunnel destination command from tunnel 1 on Router-1 and set the tunnel mode to “gre multipoint”.

Router-1

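The resulting Tunnel 1 configuration on the hub looks roughly like this; note there is no tunnel destination anymore:

interface Tunnel1
 ip address 10.55.0.1 255.255.255.0
 tunnel source Ethernet0/0
 tunnel mode gre multipoint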

We have a problem with this configuration though: Router-1 has no idea how to reach either Router-2 or Router-3. A ping from Router-2 to Router-1 fails.


Output from the debug on Router-1 tells us the obvious answer: encapsulation of the echo-reply packets fails.


What we need is a mechanism to map the inside tunnel address to the outside tunnel destination. In this case, we’re going to use the Next Hop Resolution Protocol (NHRP).

On the Router-1 tunnel interface we're going to statically map the inside tunnel address to the outside tunnel destination with the "ip nhrp map…" command. The NHRP map command says that the inner tunnel address is reachable at the given "NBMA" address, which is the tunnel destination. NBMA (non-broadcast multi-access) isn't exactly accurate in our particular scenario because the transport is a single layer 2 broadcast domain, but in production these tunnels may be formed over the Internet and would not have the ability to broadcast traffic to one another.

 The static mapping looks like this:

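For example, mapping Router-2's tunnel address to its underlay address on the hub's tunnel interface:

interface Tunnel1
 ip nhrp map 10.55.0.2 10.0.1.2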

Additionally, with NHRP, we need to define a network-ID on this interface. This is a locally significant value that does not need to match across tunnel spokes, and it's not included in any packets. Think of it as similar to the OSPF process ID.

Our new tunnel 1 interface configuration looks like this:

Router-1

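Reconstructed from the description, with maps for both spokes and a network-ID of 1 (the value itself is arbitrary):

interface Tunnel1
 ip address 10.55.0.1 255.255.255.0
 ip nhrp map 10.55.0.2 10.0.1.2
 ip nhrp map 10.55.0.3 10.0.1.3
 ip nhrp network-id 1
 tunnel source Ethernet0/0
 tunnel mode gre multipoint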

We can now ping from Router-1 to Router-2 and Router-3 


We also have spoke-to-spoke connectivity as well via Router-1 


We can view these static NHRP mappings with the command “show dmvpn” or “show ip nhrp”. You’ll get largely the same information. 


We solved the interface scalability issue, but we still have other problems from the point-to-point design. We have no dynamic registration of spoke routers, so every new router will require a static mapping on the hub router. As well, spoke-to-spoke traffic is hairpinning through Router-1, which will quickly eat up available bandwidth on Router-1's Ethernet0/0 interface.

There's one other issue that we haven't discussed yet: the lack of multicast functionality. Since EIGRP and OSPF both use multicast for their hellos, we would have issues getting a neighborship to form.


Next Hop Resolution Protocol

NHRP is an IETF standard protocol originally defined in RFC 2332 for address resolution on NBMA (Non-Broadcast, Multi-Access) networks. It is ARP-ish in function. In ARP, we have a layer 3 IP address and we want to know what the corresponding layer 2 (MAC) address is so that we can forward a frame to the correct next hop. With NHRP, we know the next hop that we want to forward traffic to in our overlay (the mGRE tunnel), but we need to resolve what the next-hop address should be in our underlay NBMA network, in other words the tunnel destination.

Below are the NHRP packet types that are relevant to our discussion today. There are a couple more but I can’t identify if or when they are used in the context of DMVPN.

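For reference, the packet types defined in RFC 2332, plus the traffic indication used for Phase 3 redirects, are roughly:

  • Registration Request/Reply - a spoke registers its tunnel-to-NBMA mapping with its next hop server (the hub).

  • Resolution Request/Reply - a router asks for the NBMA address that corresponds to a given tunnel address; this is how spokes find each other.

  • Purge Request/Reply - removes NHRP mappings that are no longer valid.

  • Error Indication - signals a problem processing an NHRP packet.

  • Traffic Indication - the "redirect" a Phase 3 hub sends to tell a spoke that a more direct path exists.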

DMVPN Phase 1

To move to DMVPN Phase 1, we need to allow NHRP to dynamically learn these NHRP mappings. To do this, we need to enable NHRP on our spoke routers so that they can register their NBMA address with the hub router.

To move our mGRE configuration to an actual DMVPN configuration, I’m going to remove the static NHRP mappings from the hub, Router-1, leaving just the NHRP network-ID.

Router-1

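With the static maps removed, the hub's tunnel is left with roughly:

interface Tunnel1
 ip address 10.55.0.1 255.255.255.0
 ip nhrp network-id 1
 tunnel source Ethernet0/0
 tunnel mode gre multipoint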

On the spoke routers, I’m going to enable NHRP with the network-ID and configure the NHRP “next hop server” or NHS. In this case, the NHS for our network is Router-1’s tunnel 1 interface.

Router-2

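A minimal sketch for Router-2; in Phase 1 the spoke keeps a point-to-point GRE tunnel aimed at the hub:

interface Tunnel1
 ip address 10.55.0.2 255.255.255.0
 ip nhrp network-id 1
 ip nhrp nhs 10.55.0.1
 tunnel source Ethernet0/0
 tunnel destination 10.0.1.1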

Router-3 is configured the same way, using its own tunnel address (10.55.0.3).

We now have dynamic registration of the spoke routers with the hub router and reachability amongst all routers in the design.


We also have connectivity between spoke routers.


This is a working DMVPN phase 1 implementation. There’s no multicast at this point but we’ll get to that.

At this point, I'd recommend downloading the PCAP from the original post and checking it out in Wireshark to understand what's happening. When the spoke routers are enabled, they send an NHRP registration request to the hub router and get a successful reply back from the hub.

Additionally, the PCAP contains a series of pings from Router-2 to Router-3 to illustrate that with a DMVPN phase 1 design, all spoke-to-spoke traffic is routed through the hub router first. There are two copies of each ICMP packet because the capture was taken on Router-1's Ethernet0/0 interface.


DMVPN Phase 2

The change from DMVPN phase 1 to phase 2 fixes our spoke-to-spoke traffic issues. The hub configuration remains identical to DMVPN phase 1, but we change the spoke configurations.

The two main changes are moving from a GRE point-to-point configuration on the spokes to a GRE multipoint configuration, and adding a static NHRP mapping on the spoke routers to assist with hub registration.

As we showed above, mGRE allows communication with any other router in the network, but we can't configure a tunnel destination address in this mode. To get around this, we configure a single static NHRP mapping that tells the router how to reach the hub, and the router then sends its NHRP registrations and resolution requests to that next hop server.

Router-1 (the hub configuration is unchanged from Phase 1)

Router-2

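A sketch of the Phase 2 spoke configuration on Router-2, now multipoint with a static mapping for the hub:

interface Tunnel1
 ip address 10.55.0.2 255.255.255.0
 ip nhrp map 10.55.0.1 10.0.1.1
 ip nhrp network-id 1
 ip nhrp nhs 10.55.0.1
 tunnel source Ethernet0/0
 tunnel mode gre multipoint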

Router-3 is configured identically, with its own tunnel address (10.55.0.3).

The output of show DMVPN on the hub router shows both Router-2 and Router-3 as dynamically registered.


A ping from Router-2 to Router-3 is successful and the output of “show dmvpn” shows the static mapping of the hub at Router-1 and a dynamic registration of Router-3.


So DMVPN Phase 2 allows this direct spoke-to-spoke traffic by having each spoke learn about the other spokes through NHRP resolution requests relayed via the hub. That process looks like this:

  1. Spoke A registers its protocol address (tunnel inside IP address) to the NBMA address (tunnel outside IP address) with the hub

  2. Spoke B registers its protocol address (tunnel inside IP address) to the NBMA address (tunnel outside IP address) with the hub

  3. Spoke A sends traffic destined to spoke B to the hub router

  4. The hub router sends spoke A’s traffic to spoke B.

  5. Spoke A will initiate an NHRP resolution request to the hub router requesting the NBMA address of spoke B. The hub router will forward this request to spoke B.

  6. Spoke B will respond directly to the NHRP resolution request from spoke A to the NBMA address of spoke A.

  7. Spoke B will initiate an NHRP resolution request to the hub router requesting the NBMA address of spoke A. The hub router will forward this request to spoke A.

  8. Spoke A will respond directly to the NHRP resolution request from spoke B to the NBMA address of spoke B.


DMVPN Phase 3

The migration from DMVPN phase 2 to phase 3 is more about optimization of the NHRP lookup process rather than a dramatic change to traffic flow like the difference between phase 1 and phase 2.

Older guides to DMVPN phase 3 configurations will have you configure the command "ip nhrp shortcut" on spoke routers. This is unnecessary on later versions of IOS-XE because it's enabled by default, something I discovered while writing this article after being confused by the behavior of the routers in my lab.

On the hub router, we need to enable “ip nhrp redirect”.

Router-1

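The only addition on the hub's tunnel interface:

interface Tunnel1
 ip nhrp redirect

(On older code you would also add "ip nhrp shortcut" to the spokes' tunnel interfaces.)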

This is where I got very confused about the actual functional difference between DMVPN phase 2 and phase 3. At first, there appeared to be very little difference between the traffic flows while I was inspecting Wireshark captures. The behavior of phase 2 and phase 3 configurations was very similar, with the only exception being an additional NHRP traffic indication (redirect) packet sent from the hub router to the spoke router upon receipt of the first packet destined to another spoke. That didn't seem like much of a difference, but when I investigated the CEF table on the spoke router, the difference became apparent.

To illustrate the difference, here is how the CEF table on Router-2 changes before and after traffic is sent to a destination on Router-3 (Loopback0 - 3.3.3.3).


As you can see in the original output, the route to 3.3.3.3 is a default route through the hub router's VPN address (10.55.0.1). Upon sending traffic to 3.3.3.3, the hub router sends a redirect to Router-2 telling it that a better path exists through Router-3's tunnel 1 address. This information is now represented in the CEF table with a next hop of 10.55.0.3. Until this entry expires (10 minutes by default), traffic destined to 3.3.3.3 from Router-2 will take the direct path across the DMVPN tunnel to Router-3.

We can see this shortcut cache in the output of “show ip nhrp shortcut”:


What this means is that with DMVPN phase 3 configurations, we can summarize spoke routes with only a route to the hub router. In DMVPN phase 2 configurations, if we do not have specific routes to networks on other spoke routers with the correct next-hop address, traffic will be relayed through the hub router first.

To put this more succinctly, DMVPN Phase 3 configurations improve efficiency with summarized networks. Networks where all spokes have a complete routing table will see NO functional change to traffic flow between phase 2 and phase 3 configurations.


NHRP Authentication

NHRP does include an authentication option in the protocol; however, this information is sent in plain text and is best used for maintaining a distinction between tunnels that are operating on the same router. It is not intended to secure your DMVPN tunnels. If that's your need, IPsec is the clear choice for data privacy and data integrity.

The NHRP authentication string has a maximum length of 8 characters, further reinforcing that it's not meant for security purposes. It's worth noting that NOT configuring an NHRP authentication string on the hub router will allow it to accept ALL authentication strings sent by spoke routers.

Here is what happened when I deliberately configured mis-matching authentication strings between the hub router and one of the spokes.

Router-1

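The strings shown here are made up; anything up to 8 characters works:

interface Tunnel1
 ip nhrp authentication DMVPN1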

Router-2

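And a deliberately different string on the spoke:

interface Tunnel1
 ip nhrp authentication DMVPN2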

In the logs on my hub router, I receive the following message every time a packet comes in from the spoke with the mis-matched authentication string.


In the Wireshark capture, you can see that the authentication string is carried in plain text and is readable in the packet details.

The output of “show ip nhrp nhs detail” on the misconfigured spoke shows a couple of things indicating a problem with the next hop server.

  1. The neighbor code is listed as “E” for “Expecting replies”. This spoke has never seen a response from the NHS.

  2. The number of requests sent is “6”, however the number of replies received is 0.

  3. There is a pending registration request at the bottom.


NHRP Multicast Mapping

Up until now, I had not configured any dynamic routing protocols so that we could focus on the behavior of NHRP and its associated commands. You could absolutely run a small DMVPN instance without using a dynamic routing protocol, but in all likelihood, you’re going to want to run one in production to make your life easier.

For the purposes of this section, I’m going to go ahead and run EIGRP and OSPF at the same time over our DMVPN tunnels. For now, I’m going to simply enable them on the tunnel 1 interface. I made this same configuration on all three routers in the topology.

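A sketch of what that enablement could look like; the EIGRP AS number (100) and OSPF process ID (1) are my own choices since the originals aren't shown, and the loopback statement matches each router's own loopback (Router-1 shown here):

router eigrp 100
 network 10.55.0.0 0.0.0.255
 network 1.1.1.1 0.0.0.0
!
router ospf 1
 network 10.55.0.0 0.0.0.255 area 0
 network 1.1.1.1 0.0.0.0 area 0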

I purposely did not configure either routing protocol to form a neighborship across the Ethernet0/0 link or to advertise that network across the tunnel. I did not want routes to be passed across the transport network.

Next up, let’s check and see if we’re forming neighbors across the tunnels.


Interestingly, Router-1 does not show any neighbors for either EIGRP or OSPF. However, when we look at the neighbor table on one of the spokes, we do see the hub router.


We have an EIGRP neighbor of our hub router and a listing of our hub router (1.1.1.1) in the OSPF neighbor table, but it's showing as stuck in an INIT state. Additionally, the Q (queue) count column in the EIGRP neighbor table shows 1, which means the router is waiting to be acknowledged. This indicates that we have one-way multicast traffic flow rather than two-way. This is due to an NHRP command that has become a default configuration on the hub router: specifically, the command "ip nhrp map multicast dynamic" is enabled by default on Cisco router tunnel interfaces.

The “ip nhrp map multicast dynamic” command automatically maps dynamically learned NHRP neighbors for multicast addresses. We can see this mapping on the hub router with the command “show ip nhrp multicast”.


However, on the spoke routers we need to statically map multicast traffic to the hub's NBMA address in order to get two-way multicast traffic. I updated the spoke routers' tunnel 1 interfaces to the following.

Router-2

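The addition on Router-2 points multicast at the hub's NBMA address:

interface Tunnel1
 ip nhrp map multicast 10.0.1.1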

Router-3 gets the same static multicast mapping, pointed at the hub's NBMA address.

As soon as I enable this static multicast mapping, our dynamic routing protocols are able to exchange hello messages in both directions. With EIGRP, this is a bit more straightforward but OSPF comes with some additional complications.

Let’s first show the NHRP multicast table on the spoke router to validate our static mapping:


Modernizing the NHRP Commands

Throughout this article, I’ve been using separate NHRP commands on the spoke routers for clarity in discussion of the different aspects of the network design. You can absolutely go forward with this configuration in production, but there is a cleaner configuration approach that you may want to opt for.

Cisco introduced a single command to combine the next-hop-server command, the static NHRP mapping for the hub router, and the multicast mapping configuration. Let’s take a look at the configuration line:

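Using the lab addressing, the combined command on a spoke looks like this:

interface Tunnel1
 ip nhrp nhs 10.55.0.1 nbma 10.0.1.1 multicast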

As you can see, we don’t have to repeat the same IP addresses in the configuration, we can just do it once per tunnel. We define the NHS address, map it to the NBMA address, and tell it that we can use it for multicast all in one command.

The behavior of the underlying protocol is identical between these two command formats. It’s up to your preference which you’d like to run on your equipment.

The command-line will not let you configure the same NHRP mapping in both formats. You’ll need to remove the map commands if you want to convert to this all-in-one command.


EIGRP over DMVPN

Let’s check the EIGRP neighbor table on the hub router now that we have two-way multicast traffic:


There are two things to note in this output, first is the presence of both spoke routers, and the second is the Q count on both of them is at 0, indicating a healthy two-way relationship. Let’s investigate the neighbor table on Router-2:


We’re only seeing the hub router in the spoke neighbor table, but we’re not seeing Router-3 as a neighbor on Router-2.

Let’s take a look at the topology table on Router-2 as well:


It's interesting to note that we're not seeing Router-3's Loopback 0 network (3.3.3.3), which is advertised to the hub. If we think about it, each spoke router only forms a neighborship with the hub router, and only exchanges routes with it, not with the other spoke routers.

I’m going to borrow a quote directly from Cisco here, “Never advertise a route out of the interface through which you learned it.” This is called split-horizon and it’s meant as a loop-prevention mechanism. If another router forwarded a route to you, it doesn’t make much sense to turn around and tell that router the same information.

Now, this is where our phase 2 vs phase 3 configurations can make a difference. In a phase 2 tunnel, if we have a summary route (eg. a default route) originating from the hub, our traffic will always be relayed through the hub, despite there being a better path. With phase 3 tunnels, a shortcut entry is installed in the CEF table to improve the efficiency of the tunnel.

With EIGRP, if we want to have spokes receive all routes from their neighbors, we need to disable split horizon on the hub tunnel interface. This is a requirement for efficient traffic flows in DMVPN Phase 2 tunnels.

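On the hub's tunnel interface (using the assumed AS 100 from earlier):

interface Tunnel1
 no ip split-horizon eigrp 100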

After disabling split-horizon on our hub router, the EIGRP topology table shows a specific route to the 3.3.3.3 network, but the next hop to this destination is the address of the hub router. This is because EIGRP assumes that the next hop of any received route is the IP address of the neighbor that sent it. That would be perfectly fine for a DMVPN phase 1 tunnel where there is no spoke-to-spoke connectivity, but in phase 2 and phase 3 tunnels (especially phase 2), we want the next hop to be the correct spoke router.


For our phase 2 and phase 3 DMVPN tunnels, we want the hub router to relay the route received from a spoke router to the other spoke routers, but to have the next-hop address be the IP address of the spoke that originally advertised the route.

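The command that does this on the hub's tunnel interface:

interface Tunnel1
 no ip next-hop-self eigrp 100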

This causes the hub router to fill in the “next-hop” field in the advertised route, a field that is normally sent with all zeros. We can see this in the wireshark capture. In the output of “show ip eigrp topology” and in the routing table, we can see the correct next-hop address and that we received it from the hub router.


In a DMVPN phase 3 tunnel, you can get away with just sending a default or summary route. I accomplished this in the lab by creating a summary-address on the hub's tunnel 1 interface. You could accomplish the same thing upstream of the hub router in a production network; it does not need to be done on the hub.

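One way to do it, advertising a default summary from the hub's tunnel (the exact summary used in the original isn't shown):

interface Tunnel1
 ip summary-address eigrp 100 0.0.0.0 0.0.0.0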

Now, once routes are resolved and redirected via NHRP, the spoke's routing table shows an NHRP next-hop override pointing directly at the other spoke.


OSPF over DMVPN

I did lie a little earlier when I told you I configured OSPF over the tunnel interfaces. As soon as the static multicast mapping was created on the spoke routers, the console of the hub router immediately filled with OSPF adjacency log messages, and console performance was pretty spotty.


The logging was constant until I could shut down the tunnel interface. So what's the issue? OSPF has different rules for neighbor relationships depending on the network type. In this particular case, without manually configuring the OSPF network type, OSPF defaults tunnel interfaces to the point-to-point network type. If you've been reading this entire time, you'll know that our DMVPN configuration is absolutely NOT a point-to-point network.

So let’s go through the OSPF network types to see which one would work best.

  • Point-to-point - Like I just said, our DMVPN configuration is not point-to-point. We could possibly configure the spoke routers with the point-to-point network type and the hub router as point-to-multipoint, but the issue is that the timers do not match between those two network types. Point-to-point defaults to a hello timer of 10 seconds and a dead timer of 40 seconds; point-to-multipoint defaults to a hello timer of 30 seconds and a dead timer of 120 seconds. We would need to adjust the timers to match, or we would constantly have churn in the network.

  • Point-to-multipoint - Point-to-multipoint seems like a good idea across all of our routers. The configuration is simple and the timers will match.

  • Broadcast - Broadcast could also work across our routers, however we need to be sure that our hub router is the designated router so that it can form a relationship with all spokes.

  • Point-to-multipoint non-broadcast - Point-to-multipoint non-broadcast would have similar perks to point-to-multipoint, with the obvious disadvantage of requiring static neighbor configuration. We want to keep the "dynamic" in DMVPN, so let's avoid this one.

  • Non-broadcast - Non-broadcast network types require static neighbor configuration. As with P2PNB, let’s keep this dynamic.


DMVPN with OSPF Point-to-Multipoint

The change is the same on Router-1, Router-2, and Router-3; each tunnel 1 interface gets the point-to-multipoint network type.
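A minimal sketch of that change:

interface Tunnel1
 ip ospf network point-to-multipoint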

Here is the output of “show ip ospf neighbor” on the hub router


And here is the output of “show ip ospf interface tunnel1” on the hub router as well. Notice the change to the network type, the hello intervals, and the neighbor adjacencies on the interface.


DMVPN with OSPF Broadcast

Let’s explore what a DMVPN configuration with an OSPF network type of broadcast looks like.

The first thing we need to know is that the broadcast and point-to-multipoint network types are fundamentally different in how information is spread. Point-to-multipoint simply floods LSAs to all neighbors on the link. Broadcast network types introduce a mechanism known as the Designated Router (DR) and Backup Designated Router (BDR) for the purpose of scaling the link. In a broadcast network segment, if every router needed to flood its entire LSA database to every other router on the segment, we would quickly overwhelm links and could see instability in the network when a reconvergence event occurs.

The DR and BDR assist with the scaling issue by limiting the amount of flooding needed on a segment. Routers on a segment send their LSA database only to the DR and BDR, and the DR then floods its database to the other routers on the segment. This drastically cuts down on the amount of OSPF traffic needed.

So let’s quickly change the network type on our tunnel interfaces to broadcast and see what the results are.

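The change on each router's tunnel interface:

interface Tunnel1
 ip ospf network broadcast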

After changing the network type on all three routers in my topology, let’s look at the output of “show ip ospf interface tunnel1”


You can see here that the hub router believes that the DR is 3.3.3.3 (Router-3) and that it is the BDR for the network. It has an adjacency with Router-2 and Router-3. Let’s next take a look at the same output from Router-2’s perspective:


As you can see here, the hub router and Router-2 disagree on which router is the designated router. As well, Router-2 only has an adjacency with Router-1.

You may already know why we’re getting a disagreement between the routers as to what the designated router is. Let’s look at the election criteria for the DR:

  1. The router with the highest priority is elected as the designated router. By default, routers are set with a priority of 1.

  2. If all router priorities are equal, the router with the highest router ID is elected.

In our configuration, we have not manually configured the router ID for our OSPF instances, instead letting the OSPF process choose its router ID based on the loopback address. In our DMVPN network, that means the hub router, with a loopback address of 1.1.1.1, is almost always going to lose an election to the spoke routers with router IDs 2.2.2.2 and 3.3.3.3. And because OSPF elections aren't preemptive, only with careful timing of the election could the hub become the DR and keep that role.

Let’s explore what effect this mismatch of DR has on the routing table on Router-2:


We are getting routes from Router-3 on Router-2, but notice that the next hop for network 3.3.3.3 is Router-1 rather than Router-3, as should be expected. This would be a huge problem in a DMVPN phase 2 tunnel: all traffic would have to traverse up through the hub router. With DMVPN phase 3 we would be a little more efficient, with the next-hop override being installed in the CEF table, but it's still not ideal by a long stretch.

Let's resolve the OSPF DR issue so that we can be a little more deterministic in our network. We can accomplish this in one of two ways: raise the priority of the hub router above the default of 1 (we can go as high as 255), or alternatively reduce the priority of the spoke routers to 0 so they never participate in the election. Given that changing the priority on only one router is easier to manage and less prone to misconfiguration, let's raise the priority on the hub router.

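Any value higher than the spokes' default of 1 works; here I'll use the maximum on the hub:

interface Tunnel1
 ip ospf priority 255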

With this change, we can see that our hub router is now the DR for the network segment:


Checking the routing table on one of the spoke routers shows the correct next-hop address for the loopback of the other spoke router.


The takeaway from this is, while you could run a DMVPN network with the OSPF network type of broadcast, choosing point-to-multipoint is a significantly better choice.


Multiple Tunnels - Single Hub

There may be a situation where you require a hub router to have two separate DMVPN clouds hosted on the same physical interface, say from an Internet or WAN connection. You may want spokes from one DMVPN tunnel to not form tunnels with spokes in the other tunnel. This may be for multi-tenancy, security, separate VRFs, etc. If we build these two tunnels on the hub router, the router will not have the ability to tell the tunnels apart, and there may be issues with NHRP registration and resolutions not functioning correctly.

What’s the solution to this? There is a tunnel key feature that we haven’t discussed so far that can mark the GRE packet header with a unique value so that the receiving router can tell which tunnel a GRE packet belongs to. That configuration looks like this:

Router-1

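The addition is a single line, applied to Tunnel 1 on the hub and both spokes:

interface Tunnel1
 tunnel key 1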

I added the tunnel key to each of the existing routers in our DMVPN cloud. Every router in a DMVPN tunnel needs to have a matching tunnel key in order to function. You can see in the Wireshark output that the GRE header is now marked with a key value of 1.

I changed the lab here to introduce two new spoke routers as well. I also configured a tunnel 2 interface on the hub router to support connectivity to both of these new spoke routers. That configuration looks like this:

Router-1

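A sketch of the second cloud's hub tunnel; the 10.56.0.0/24 subnet, network-ID 2, and key 2 are my own placeholder values since the originals aren't shown:

interface Tunnel2
 ip address 10.56.0.1 255.255.255.0
 ip nhrp network-id 2
 ip nhrp redirect
 tunnel source Ethernet0/0
 tunnel mode gre multipoint
 tunnel key 2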

Router-4 and Router-5 are configured as spokes of this second cloud: the same spoke template as before, but pointed at the hub's Tunnel 2 NHS/NBMA addresses and using the matching tunnel key.

The effect of this change can be seen in the output of “show dmvpn” on the hub router.


Traffic routing between spokes in DMVPN cloud 1 can traverse directly to the other spoke in that cloud, while traffic from cloud 1 to a spoke in DMVPN cloud 2 will need to traverse through the hub router first.


DMVPN Multihub - Single Cloud

Up until this point, all of the DMVPN scenarios have used a single hub router for simplicity's sake. Of course, in a production network, redundancy is extremely important. Backhoes are going to seek out their favorite food source (fiber) at the most inopportune time, and you don't want to sit on an outage call when everything is out of your hands.

For DMVPN, we have essentially two options for adding redundant hub routers. The simplest, and what we're going to discuss first, is DMVPN multihub with a single cloud. It uses essentially the same configuration as our single hub; we just need to add an additional next-hop server to each spoke router's tunnel configuration. That looks like this:

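On the spoke, that's just a second NHS line; the second hub's tunnel address (10.55.0.4) shows up later in this section, while its NBMA address (10.0.1.4) is my assumption based on the lab addressing convention:

interface Tunnel1
 ip nhrp nhs 10.55.0.1 nbma 10.0.1.1 multicast
 ip nhrp nhs 10.55.0.4 nbma 10.0.1.4 multicast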

On our Router-2 spoke, the output of "show dmvpn" now shows two static NHRP mappings, and "show ip nhrp nhs" shows our next hop servers responding.


If we want to treat one of our hub routers as the preferred next hop server, we can adjust the priority. However, I should say that I think the priority command here is a little odd. The default priority value is 0 unless otherwise specified; however, 0 is the most preferred priority value. So you cannot configure an NHS to be more preferred than the default, only configure certain servers to be less preferred than the default.

At least, that's the theory. In my testing, the version of IOS-XE that I'm running seems to not care what the priority of an NHS is, but rather what order it was configured in. In the example below, even though the 10.55.0.4 server has a less preferred priority than the 10.55.0.1 server, resolution requests go to it first because it was configured first in the list of servers.

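The priority values here are illustrative; 10.55.0.4 is configured first with a numerically higher (less preferred) priority:

interface Tunnel1
 ip nhrp nhs 10.55.0.4 nbma 10.0.1.4 multicast priority 10
 ip nhrp nhs 10.55.0.1 nbma 10.0.1.1 multicast priority 1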

DMVPN Multihub - Dual Cloud

The second option for DMVPN multihub is to create a separate DMVPN tunnel and use our dynamic routing protocol to determine the best path. This involves more configuration than the single cloud option discussed above, but I personally think it's the preferable option because most people are more comfortable using routing protocols to choose the best path rather than NHRP. Additionally, with tuning, a routing protocol can identify a failure quicker than NHRP.

Router-2

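A sketch of Router-2's second tunnel toward the second hub; the Tunnel 2 subnet, key, and hub addresses are placeholders since the originals aren't shown:

interface Tunnel2
 ip address 10.56.0.2 255.255.255.0
 ip nhrp network-id 2
 ip nhrp nhs 10.56.0.4 nbma 10.0.1.4 multicast
 tunnel source Ethernet0/0
 tunnel mode gre multipoint
 tunnel key 2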

Router-3 gets the same second tunnel, with its own addressing.

As I said above, the perk here is that we can treat these as two different paths from a routing point of view, and therefore we can adjust either the bandwidth, delay or both to influence the metric of our routing protocol. Before we start changing either of those values, let’s look at the Tunnel interface defaults:

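For reference, a GRE tunnel interface on IOS-XE typically defaults to a bandwidth of 100 Kbit/sec and a delay of 50000 microseconds.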

With OSPF, the cost of an interface is dynamically determined by dividing the reference bandwidth by the interface bandwidth. By default, the reference bandwidth is 100Mbps and the default bandwidth of a tunnel interface is 100Kbps. This means that, by default, tunnel interfaces have an OSPF cost of 1000.

This is where I would have a think about your network design to determine your best path forward. If your spoke networks can only reach the hub network or other spoke networks through the DMVPN network, then adjusting the bandwidth on the tunnel interface is probably inconsequential to the chosen path. If there is an alternate path (say a private WAN connection or similar), you'll probably want to adjust the interface bandwidth to allow the DMVPN path to be chosen.

With OSPF, you can adjust the bandwidth command or set the interface cost directly. I’ve shown both here:

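Both options shown on a tunnel interface; the cost value of 10 is just an example:

interface Tunnel1
 bandwidth 100000
 ip ospf cost 10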

In this case, the cost calculated from the bandwidth command is overridden by the static OSPF cost value, but you can see the effect. The bandwidth command is expressed in kilobits per second, so the value expressed here is 100Mbps, equal to a cost of 1 with the default OSPF reference bandwidth.

With EIGRP, setting the bandwidth is important in the sense that the tunnel will almost certainly be the lowest bandwidth value in the EIGRP path. I would start by setting the bandwidth of the tunnel interface to match that of the underlay network circuit. The other value important to the EIGRP metric is delay. EIGRP uses the cumulative delay in its composite metric, and this is most often the value changed for path engineering.

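For example, the values described below on Hub-1's tunnel (the delay command is configured in tens of microseconds):

interface Tunnel1
 bandwidth 100000
 delay 100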

There is something to note about EIGRP here. Because bandwidth and delay are taken from the interface a route is received on (ingress), not egress, we get an interesting effect for spoke-to-spoke traffic. Although the hub router is not "in the path" of traffic between spokes in a DMVPN phase 2 or phase 3 tunnel (based on the next-hop address in the RIB), the hub relayed the route from one spoke to the other, so the hub's tunnel bandwidth and delay are included in the route calculation. You can see this in the output of "show ip eigrp topology" from Router-2. In this scenario, the Hub-1 tunnel delay was set to 100 while the Hub-2 tunnel delay was 20; the tunnel interface bandwidth on both was identical, 100Mbps. The delay of the hub routers is included in the composite metric even though the next hops are the other spoke's Tunnel 1 and Tunnel 2 addresses. The bandwidth and delay values are left at their defaults on the spoke routers.


MTU and Maximum Segment Size

All throughout this article I've been discussing the actions of NHRP and routing protocols, but I'd like to step back to the beginning and go over the behavior of GRE tunnels. Let's state the obvious here: tunnels encapsulate IP traffic within other IP traffic. That encapsulation adds to the overall size of the packet. When a packet is too large for a link, a router needs to break it apart in a process known as fragmentation. (IPv6 does not allow routers to fragment packets, but that's a different subject.)

We have a couple of problems now: fragmentation can increase load on a router, increase load on the destination endpoint, increase overhead on the link, create traffic black holes (particularly in IPv6 networks, or anywhere PMTUD is broken), and in general is just not preferred.

The MTU (maximum transmission unit) of a link is the largest Layer 3 payload allowed over the link. The MTU is measured from the start of the Layer 3 header to the end of the packet, which means both the IP header and the TCP header are included in that measurement. Because we're going to package the IP packet into another IP packet, we need to reduce the size of the packet allowed into the tunnel so that the tunnel packets themselves are not fragmented.

The TCP MSS (maximum segment size) is a TCP value that is exchanged for TCP connections. The MSS value is included in the TCP SYN, SYN/ACK headers and the two endpoints will use the lower of the two exchanged values.

When a packet is too large for a given link MTU and cannot be fragmented, an ICMP Fragmentation Needed (IPv4) or Packet Too Big (ICMPv6) message is returned to the source address.

Within Cisco routers, we actually have the ability to rewrite the MSS value in the SYN/ACK packet of a TCP 3-way handshake. By setting the “ip tcp adjust-mss [value]”, we can change this value in the SYN/ACK packet and avoid some unnecessary fragmentation. This only works for TCP connections, but we can adjust both IPv4 and IPv6 packets.

In my capture, I set the MSS value (1260) much lower than would be used in production, but note that it was rewritten by the router rather than by the destination endpoint.

If we were not going to run any encryption over this tunnel, the additional IP and GRE headers would only be an additional 24 bytes. Because we are going to be configuring IPsec encryption, there will be additional overhead we need to consider. Cisco recommends the following values for MTU and MSS:

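The commonly cited numbers for GRE over IPsec are an IP MTU of 1400 bytes and a TCP MSS of 1360 for IPv4 (1340 for IPv6), applied on the tunnel interface:

interface Tunnel1
 ip mtu 1400
 ip tcp adjust-mss 1360
 ipv6 mtu 1400
 ipv6 tcp adjust-mss 1340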

The IPv4 header is generally 20 bytes, but can be larger if options are present. The IPv6 header is fixed at 40 bytes. Because of this increase in header size, the IPv6 TCP MSS is adjusted down by an additional 20 bytes.


DMVPN Encryption - IPsec

I’m not going to pretend like this is a complete guide on IPSec, but I’m going to quickly go over the basics to get us to an encrypted DMVPN tunnel.

We’re going to configure an IPsec profile that calls an IKEv2 profile that uses a preshared key for authentication. I’m not going to get into the details of the various IKEv2 and IPsec hashing and encryption algorithms because for our purposes here today, the default profile values are good enough. A couple of years ago, Cisco phased out older algorithms and set the default values to be pretty good choices.

What we do need to do is start by creating an IKEv2 keyring. This keyring is going to contain the preshared key that will authenticate our spokes. In a real world environment, you might choose to use certificates to do this authentication but I’ve chosen to go with a PSK because it’s what is specified in the CCIE Enterprise Infrastructure blueprint.

Note that you specify the address that is valid for this key. We’ve specified 0.0.0.0 0.0.0.0 for the lab here but in a real world example, you’d want to be more specific.

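A sketch of the keyring; the names and key string here are placeholders:

crypto ikev2 keyring DMVPN-KEYRING
 peer ANY
  address 0.0.0.0 0.0.0.0
  pre-shared-key MySecretKey123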

Next we need to configure the IKEv2 profile. The profile specifies that we use the PSK for both local and remote authentication, and we again match remote addresses against 0.0.0.0 0.0.0.0. We also reference the keyring that was configured in the previous step.

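A sketch of the IKEv2 profile (the profile name is a placeholder):

crypto ikev2 profile DMVPN-IKEV2-PROFILE
 match identity remote address 0.0.0.0 0.0.0.0
 authentication remote pre-share
 authentication local pre-share
 keyring local DMVPN-KEYRING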

Next, we define the IPsec profile and reference the IKEv2 profile that was just created. We could specify an IPsec transform set here, but the default encryption settings are sufficient.

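The IPsec profile ties the two together; "DMVPN-PROFILE" is the name that shows up in the show output later:

crypto ipsec profile DMVPN-PROFILE
 set ikev2-profile DMVPN-IKEV2-PROFILE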

The last step is now to apply our IPsec profile to our tunnel interfaces.

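Applied to the tunnel interface on every router:

interface Tunnel1
 tunnel protection ipsec profile DMVPN-PROFILE

If you protect multiple tunnels that share the same tunnel source, you may also need the shared keyword at the end of that command.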

After applying to all of the routers in the lab, I’m able to confirm traffic is flowing across the tunnels with a ping.

We can view some show commands for our IPsec tunnel. First let’s look at our configured IKEv2 profile with “show crypto ikev2 profile”. Here we can see the authentication methods, the keyring, and the identity addresses.


Next would be to check that the IKEv2 profiles are negotiating SAs with our DMVPN neighbors. “show crypto ikev2 sa” will give us the other tunnel endpoints that we’ve negotiated with. Note that the 10.0.1.3 SA is only created after I’ve sent a ping to that router.


Let’s next look at the IPsec profiles. Notice that there is a default configured profile available with different settings from our “DMVPN-PROFILE”.


And lastly let’s look at the output of “show crypto ipsec sa”


There’s a lot of information present there, but importantly, we’re seeing a complete security association for our hub router as well as the other spoke router.


IPv6 over DMVPN

As IPv6 is becoming more and more mainstream, there’s likely to be a scenario where you want to run IPv6 over a DMVPN tunnel. I’m going to first show you IPv6 NHRP commands over an IPv4 underlay; in the next section, I’ll show IPv6 over an IPv6 underlay.

The migration to IPv6 from an NHRP point of view is fairly easy; it's mostly a matter of moving from the "ip nhrp …" commands to the "ipv6 nhrp …" commands. Of note, with an IPv4 underlay we specify that the NBMA address of the NHS is an IPv4 address. I used a separate tunnel interface for IPv6 in this config; you could technically run IPv6 over the same tunnel interface as IPv4. I prefer separate IPv4 and IPv6 tunnels to make troubleshooting a little easier, and as you'll see in the next section, the tunnel mode has to match the underlay. A single tunnel can't do IPv4 inside IPv4 and IPv6 inside IPv6 at the same time; you would need two tunnels to make that happen.

Also, note that because I have multiple tunnels running on the same physical interface, that I have the tunnel key command configured.

Router 1

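A sketch of the hub's IPv6 tunnel; the overlay prefix (2001:db8:55::/64), tunnel number, network-ID, and key are placeholders, while the NBMA transport remains the IPv4 10.0.1.0/24 network:

interface Tunnel3
 ipv6 address 2001:db8:55::1/64
 ipv6 nhrp network-id 6
 ipv6 nhrp redirect
 tunnel source Ethernet0/0
 tunnel mode gre multipoint
 tunnel key 6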

Router 2

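And the matching spoke sketch on Router 2, with the NHS mapped to the hub's IPv4 NBMA address:

interface Tunnel3
 ipv6 address 2001:db8:55::2/64
 ipv6 nhrp network-id 6
 ipv6 nhrp nhs 2001:db8:55::1 nbma 10.0.1.1 multicast
 tunnel source Ethernet0/0
 tunnel mode gre multipoint
 tunnel key 6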

Router 3 mirrors Router 2, with its own addresses.

Here is the output of the “show dmvpn” command on router-1:


IPv6 over DMVPN with IPv6 Underlay

To migrate DMVPN to using an IPv6 NBMA network, we need to do two things, change the NBMA address in the next-hop-server command and change the tunnel mode to encapsulate using IPv6.

When we use the command “tunnel mode gre multipoint” the router will encapsulate using IPv4. Configuring the NHS with an IPv6 NBMA address will result in the following errors on the spoke routers:


To fix the above errors and allow the DMVPN tunnels to form over an IPv6 underlay network, change the tunnel mode to “tunnel mode gre multipoint ipv6”, and encapsulation will occur with IPv6 rather than IPv4.

As setup for the configuration below, I've configured the network 2001:db8::/64 as the underlay network, with 2001:db8::1 as the Ethernet0/0 address of the hub router.

Router-1

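A sketch of the hub tunnel; the overlay prefix, tunnel number, network-ID, and key are placeholders, and the important part is the IPv6 tunnel mode:

interface Tunnel4
 ipv6 address 2001:db8:66::1/64
 ipv6 nhrp network-id 7
 ipv6 nhrp redirect
 tunnel source Ethernet0/0
 tunnel mode gre multipoint ipv6
 tunnel key 7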

Router-2

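And the spoke sketch on Router-2, with the NHS now mapped to the hub's IPv6 underlay address (2001:db8::1):

interface Tunnel4
 ipv6 address 2001:db8:66::2/64
 ipv6 nhrp network-id 7
 ipv6 nhrp nhs 2001:db8:66::1 nbma 2001:db8::1 multicast
 tunnel source Ethernet0/0
 tunnel mode gre multipoint ipv6
 tunnel key 7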

Router-3 mirrors Router-2, with its own addresses.

As I mentioned in the section above, because we specify the encapsulation protocol (GRE over IPv4 or IPv6) with the tunnel mode command rather than it being an automatic decision based on the NHRP destination mapping, I would recommend maintaining two separate tunnels, one for IPv4 and one for IPv6. Using two tunnels is a more flexible design that allows for better redundancy, easier decommissioning of the IPv4 networks in the future, and parity between the underlay and the overlay.