Internet-Draft | EVPN-IRB Extended Mobility | December 2024 |
Malhotra, et al. | Expires 5 June 2025 |
This document specifies extensions to Ethernet VPN (EVPN) Integrated Routing and Bridging (IRB) procedures specified in RFC7432 and RFC9135 to enhance the mobility mechanisms for EVPN-IRB based networks. The proposed extensions improve the handling of host mobility and duplicate address detection in EVPN-IRB networks to cover a broader set of scenarios where a host's unicast IP address to MAC bindings may change across moves. These enhancements address limitations in the existing EVPN-IRB mobility procedures by providing more efficient and scalable solutions. The extensions are backward compatible with existing EVPN-IRB implementations and aim to optimize network performance in scenarios involving frequent IP address mobility.¶
NOTE TO IESG (TO BE DELETED BEFORE PUBLISHING): This draft lists six authors, which is above the required limit of five. Given significant and active contributions to the draft from all six authors over the course of six years, we would like to request that the IESG allow publication with six authors. Specifically, the three Cisco authors are the original inventors of these procedures and contributed heavily to the rev 0 draft, most of which is still intact. AT&T is also a key contributor towards defining the use cases that this document addresses as well as the proposed solution. Authors from Nokia and Juniper have further contributed to revisions and discussions steadily over the last six years to enable respective implementations and wider adoption.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 5 June 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
EVPN-IRB facilitates the advertisement of both MAC and IP routes via a single MAC+IP Route Type 2 (RT-2) advertisement. The MAC address is integrated into the local MAC-VRF bridge table, enabling Layer 2 (L2) bridged traffic across the network overlay. The IP address is incorporated into the local ARP/NDP table in an asymmetric IRB design, or into the IP-VRF routing table in a symmetric IRB design, facilitating routed traffic across the network overlay. For additional context on EVPN-IRB forwarding modes, refer to [RFC9135].¶
To support the EVPN mobility procedure, a single sequence number mobility attribute is advertised with the combined MAC+IP route. This approach, which resolves both MAC and IP reachability with a single sequence number, inherently assumes a fixed 1:1 mapping between IP and MAC. While this fixed 1:1 mapping is a common use case and is addressed via the existing MAC mobility procedure defined in [RFC7432], there are additional IRB scenarios that do not adhere to this assumption. Such scenarios are prevalent in virtualized host environments where hosts connected to an EVPN network are virtual machines (VMs) or containerized workloads. The following IRB mobility scenarios are considered:¶
A VM move results in the VM's IP and MAC moving together.¶
A VM move results in the VM's IP moving to a new MAC association.¶
A VM move results in the VM's MAC moving to a new IP association.¶
While the existing MAC mobility procedure can manage the MAC+IP move in the first scenario, the subsequent scenarios lead to new MAC-IP associations. Therefore, a single sequence number assigned independently per-{MAC, IP} is insufficient to determine the most recent reachability for both MAC and IP unless the sequence number assignment algorithm allows for changing MAC-IP bindings across moves.¶
This document updates the sequence number assignment procedures defined in [RFC7432] to adequately address mobility support across EVPN-IRB overlay use cases that permit MAC-IP bindings to change across VM moves and support mobility for both MAC and IP components carried in an EVPN RT-2 for these use cases.¶
Additionally, for hosts on an ESI multi-homed to multiple PE devices, additional procedures are specified to ensure synchronized sequence number assignments across the multi-homing devices.¶
This document addresses mobility for the following cases, independent of the overlay encapsulation (e.g., MPLS, SRv6, NVO Tunnel):¶
The following sections of this document are informative:¶
Section 3 provides the necessary background and the problem statement addressed in this document.¶
Section 4 lists the resulting design considerations for the document.¶
Section 5 lists the main solution components that are foundational for the specifications that follow in subsequent sections.¶
The following sections of this document are normative:¶
Section 6 describes the mobility and sequence number assignment procedures in an EVPN-IRB overlay required to address the scenarios described in section 4.¶
Section 7 describes the mobility procedures for a routed overlay network as opposed to an IRB overlay.¶
Section 8 describes the corresponding duplicate detection procedures for EVPN-IRB and routed overlays.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
EVPN-IRB: A BGP-EVPN distributed control plane based integrated routing and bridging fabric overlay discussed in [RFC9135].¶
Underlay: IP, MPLS, or SRv6 fabric core network that provides routed reachability between EVPN PEs.¶
Overlay: L3 and L2 Virtual Private Network (VPN) enabled via NVO3, VXLAN, SRv6, or MPLS service layer encapsulation.¶
SRv6: Segment Routing IPv6 protocol as specified in [RFC8986].¶
NVO3: Network Virtualization Overlays as specified in [RFC8926].¶
VXLAN: Virtual eXtensible Local Area Network as specified in [RFC7348].¶
MPLS: Multi-Protocol Label Switching as specified in [RFC3031].¶
EVPN PE: A PE switch-router in a data-center fabric that runs overlay BGP-EVPN control plane and connects to overlay CE host devices. An EVPN PE may also be the first-hop layer-3 gateway for CE/host devices. This document refers to EVPN PE as a logical function in a data-center fabric. This EVPN PE function may be physically hosted on a top-of-rack switching device (ToR) or at layer(s) above the ToR in the Clos fabric. An EVPN PE is typically also an IP or MPLS tunnel end-point for overlay VPN flows.¶
Symmetric EVPN-IRB: A design approach used in EVPN-based networks [RFC9135] to handle both Layer 2 (L2) and Layer 3 (L3) forwarding within the same network infrastructure. The key characteristic of symmetric EVPN-IRB is that both ingress and egress PE routers perform routing for inter-subnet traffic.¶
Asymmetric EVPN-IRB: A design approach used in EVPN-based networks [RFC9135] to handle Layer 2 (L2) and Layer 3 (L3) forwarding. In this approach, only the ingress Provider Edge (PE) router performs routing for inter-subnet traffic, while the egress PE router performs bridging.¶
ARP: Address Resolution Protocol [RFC826]. ARP references in this document are equally applicable to both ARP and NDP.¶
NDP: IPv6 Neighbor Discovery Protocol [RFC4861].¶
Ethernet-Segment: Physical ethernet or LAG port that connects an access device to an EVPN PE, as defined in [RFC7432].¶
EVPN all-active multi-homing: A redundancy and load-sharing mechanism used in EVPN networks. This method allows multiple PE devices to simultaneously provide Layer 2 and Layer 3 connectivity to a single CE device or network segment.¶
RT-2: EVPN route type 2 carrying both MAC and IP reachability as specified in [RFC7432].¶
RT-5: EVPN route type 5 carrying IP prefix reachability as specified in [RFC9136].¶
MAC-IP: IPv4 and/or IPv6 address and MAC binding for an overlay host.¶
SYNC MAC route: In the context of EVPN multi-homing, this refers to a local MAC route SYNCed from another PE sharing the same ESI.¶
SYNC MAC-IP route: In the context of EVPN multi-homing, this refers to a local MAC-IP route SYNCed from another PE sharing the same ESI.¶
SYNC MAC sequence number: In the context of EVPN multi-homing, this refers to the sequence number received with a SYNC MAC route.¶
SYNC MAC-IP sequence number: In the context of EVPN multi-homing, this refers to the sequence number received with a SYNC MAC-IP route.¶
VM: Virtual Machine or containerized workloads. VM in this document generically refers to any host or endpoint attached to an EVPN-IRB network.¶
In an EVPN-IRB scenario, where a single MAC+IP RT-2 advertisement carries both IP and MAC routes, a MAC-only RT-2 advertisement becomes redundant for host MACs already advertised via MAC+IP RT-2. Consequently, the advertisement of a local MAC-only RT-2 is optional at an EVPN PE. This consideration is important for mobility scenarios discussed in subsequent sections. It is noteworthy that a local MAC and its assigned sequence number are still maintained locally on a PE, and only the advertisement of this route to other PEs is optional.¶
MAC-only RT-2 advertisements may still be issued for non-IP host MACs that are not included in MAC+IP RT-2 advertisements.¶
This section outlines the IRB mobility use cases addressed in this document. Detailed procedures to handle these scenarios are provided in Sections 6 and 7.¶
A host move results in both the host's IP and MAC addresses moving together.¶
A host move results in the host's IP address moving to a new MAC address association.¶
A host move results in the host's MAC address moving to a new IP address association.¶
This is the baseline scenario where a host move results in both the host's MAC and IP addresses moving together without altering the MAC-IP binding. The existing MAC mobility procedures defined in [RFC7432] can be leveraged to support this MAC+IP mobility scenario.¶
This scenario involves a host move where the host's IP address is reassigned to a new MAC address.¶
A host reload or orchestrated move may cause a host to be re-spawned at a new location, resulting in a new MAC assignment while retaining the existing IP address. This results in the host's IP moving to a new MAC binding, as shown below:¶
IP-a, MAC-a ---> IP-a, MAC-b¶
This scenario considers cases where multiple hosts, each with a unique IP address, share a common MAC address. A host move results in a new MAC binding for the host IP. For example, hosts running on a single physical server might share the same MAC. Alternatively, an L2 access network behind a firewall may have all host IP addresses learned with a common firewall MAC. In these "shared MAC" scenarios, multiple local MAC-IP ARP/NDP entries may be learned with the same MAC. A host IP address move to a new physical server could result in a new MAC association for the host IP.¶
In the aforementioned scenarios, a combined MAC+IP EVPN RT-2 advertised with a single sequence number attribute assumes a fixed IP-to-MAC mapping. A host IP address move to a new MAC breaks this assumption and results in a new MAC+IP route. If this new route is independently assigned a new sequence number, the sequence number can no longer determine the most recent host IP reachability in a symmetric EVPN-IRB design or the most recent IP-to-MAC binding in an asymmetric EVPN-IRB design.¶
Figure 1 illustrates a topology with host VMs sharing the physical server MAC. In steady state, the IP1-M1 route is learned at PE1 and PE2 and advertised to remote PEs with a sequence number N. If VM-IP1 moves to Server-M2, ARP or NDP-based local learning at PE3 and PE4 would result in a new IP1-M2 route. If this new route is assigned a sequence number of 0, the mobility procedure for VM-IP1 will not trigger across the overlay network.¶
A sequence number assignment procedure must be defined to unambiguously determine the most recent IP reachability, IP-to-MAC binding, and MAC reachability for such MAC sharing scenarios.¶
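The failure described above can be illustrated with a minimal sketch. The `MacIpRoute` type, the `most_recent_for_ip` helper, and the value chosen for N are hypothetical and exist only for illustration; they are not defined by this document. The sketch assumes a receiver that tries to pick the most recent route for an IP purely by comparing sequence numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical, simplified view of a MAC+IP RT-2."""
    ip: str
    mac: str
    next_hop: str
    seq: int


def most_recent_for_ip(routes, ip):
    """Return the route for `ip` carrying the highest sequence number."""
    return max((r for r in routes if r.ip == ip), key=lambda r: r.seq)


N = 5  # arbitrary steady-state sequence number for IP1-M1
old_route = MacIpRoute("IP1", "M1", "PE1", seq=N)

# New binding advertised with an independently assigned sequence number of 0:
# the comparison still favors the stale IP1-M1 route.
independent = MacIpRoute("IP1", "M2", "PE3", seq=0)
stale_pick = most_recent_for_ip([old_route, independent], "IP1")

# New binding advertised with a sequence number higher than the old one:
# the comparison now identifies IP1-M2 as the most recent binding.
bumped = MacIpRoute("IP1", "M2", "PE3", seq=N + 1)
fresh_pick = most_recent_for_ip([old_route, bumped], "IP1")
```

With independent assignment the stale binding wins the comparison, which is why the assignment procedure itself has to account for changing MAC-IP bindings.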
This is a scenario where a host move or re-provisioning behind a new gateway location may result in the host getting a new IP address assigned, while keeping the same MAC.¶
The complication in this scenario arises because MAC reachability can be carried via a combined MAC+IP route, whereas a MAC-only route may not be advertised. Associating a single sequence number with the MAC+IP route implicitly assumes a fixed MAC-to-IP mapping. A MAC move that results in a new IP association breaks this assumption and creates a new MAC+IP route. If this new route independently receives a new sequence number, the sequence number can no longer reliably indicate the most recent host MAC reachability.¶
For instance, consider host VM IP1-M1 learned locally at PE1 and PE2 and advertised to remote PEs with sequence number N. If this VM with MAC M1 is re-provisioned at Server2 and assigned a different IP address (e.g., IP7), the new IP7-M1 route learned at PE3 and PE4 would be advertised with sequence number 0. Consequently, L3 reachability to IP7 would be established across the overlay, but the MAC mobility procedure for M1 would not trigger due to the new MAC-IP route advertisement. Advertising an optional MAC-only route with its sequence number would trigger MAC mobility per [RFC7432]. However, without this additional advertisement, a single sequence number associated with a combined MAC+IP route may be insufficient to update MAC reachability across the overlay.¶
A MAC-IP sequence number assignment procedure is required to unambiguously determine the most recent MAC reachability in such scenarios without advertising a MAC-only route.¶
Furthermore, PE1 and PE2, upon learning new reachability for IP7-M1 via PE3 and PE4, must probe and delete any local IPs associated with MAC M1, such as IP1-M1.¶
It could be argued that the MAC mobility sequence number defined in [RFC7432] applies only to the MAC part of a MAC-IP route, thus covering this scenario. This interpretation could serve as a clarification to [RFC7432] and supports the need for a common sequence number assignment procedure across all MAC-IP mobility scenarios detailed in this document.¶
Consider an EVPN-IRB overlay network illustrated in Figure 3, where hosts are multi-homed to two or more PE devices via an all-active multi-homed ES. MAC and ARP/NDP entries learned on a local ES may also be synchronized across the multi-homing PE devices sharing this ES. This synchronization enables local switching of intra- and inter-subnet ECMP traffic flows from remote hosts. Thus, local MAC and ARP/NDP entries on a given ES may be learned through local learning and/or synchronization from another PE device sharing the same ES.¶
For a host that is multi-homed to multiple PE devices via an all-active ES interface, the local learning of host MAC and MAC-IP at each PE device is an independent asynchronous event, dependent on traffic flow or ARP/NDP response from the host hashing to a directly connected PE on the MC-LAG interface. Consequently, the sequence number mobility attribute value assigned to a locally learned MAC or MAC-IP route at each device may not always be the same, depending on transient states on the device at the time of local learning.¶
For example, consider a host VM that is deleted from ESI-2 and moved to ESI-1. It is possible for the host to be learned on PE1 following the deletion of the remote route from PE3 and PE4, while being learned on PE2 prior to the deletion of the remote route from PE3 and PE4. In this case, PE1 would process local host route learning as a new route and assign a sequence number of 0, while PE2 would process local host route learning as a remote-to-local move and assign a sequence number of N+1, where N is the existing sequence number assigned at PE3 and PE4.¶
Inconsistent sequence numbers advertised from multi-homing devices:¶
Creates ambiguity regarding how remote PEs should handle paths with the same ESI but different sequence numbers. A remote PE might not program ECMP paths if it receives routes with different sequence numbers from a set of multi-homing PEs sharing the same ESI.¶
Breaks consistent route versioning across the network overlay that is needed for EVPN mobility procedures to work.¶
For instance, in this inconsistent state, PE2 would drop a remote route received for the same host with sequence number N (since its local sequence number is N+1), while PE1 would install it as the best route (since its local sequence number is 0).¶
To support mobility for multi-homed hosts using the sequence number mobility attribute, local MAC and MAC-IP routes learned on a multi-homed ES must be advertised with the same sequence number by all PE devices to which the ES is multi-homed. There is a need for a mechanism to ensure the consistency of sequence numbers assigned across these PEs.¶
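The inconsistency in the example above can be reproduced with a small sketch. The function names and the value of N are hypothetical. The sketch assumes the baseline behavior described above (a new route gets 0, a remote-to-local move gets N+1) and ignores tie-breaking on equal sequence numbers.

```python
def seq_on_local_learn(known_remote_seq):
    """Baseline assignment: 0 for a new route, N+1 for a remote-to-local move.

    `known_remote_seq` is None when the PE holds no remote route for the host.
    """
    return 0 if known_remote_seq is None else known_remote_seq + 1


def remote_wins(local_seq, remote_seq):
    """A remote route replaces the local one only with a strictly higher
    sequence number (tie-breaking on equal values is out of scope here)."""
    return remote_seq > local_seq


N = 4  # arbitrary sequence number previously assigned at PE3 and PE4

# PE1 learns the host after the remote route was deleted; PE2 learns it before.
pe1_seq = seq_on_local_learn(None)
pe2_seq = seq_on_local_learn(N)

# A remote route for the same host arriving with sequence number N is then
# treated differently by the two PEs sharing the ES.
pe1_installs_remote = remote_wins(pe1_seq, N)
pe2_installs_remote = remote_wins(pe2_seq, N)
```

The two PEs end up advertising 0 and N+1 for the same host and disagree on whether the remote route is newer, which is the state that the synchronization requirement is meant to prevent.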
To summarize, the sequence number assignment scheme and implementation must consider the following:¶
Synchronization Across Multi-Homing PE Devices: MAC+IP may be learned on an ES multi-homed to multiple PE devices, requiring synchronized sequence numbers across these devices.¶
Optional MAC-Only RT-2: In an IRB scenario, MAC-only RT-2 is optional and may not be advertised alongside MAC+IP RT-2.¶
Multiple IPs Associated with a Single MAC: A single MAC may be linked to multiple IP addresses, indicating multiple host IPs sharing a common MAC.¶
Host IP Movement: A host IP address move may result in a new MAC association, necessitating a new IP to MAC association and a new MAC+IP route.¶
Host MAC Movement: A host MAC move may result in a new IP association, requiring a new MAC to IP association and a new MAC+IP route.¶
Local MAC-IP Learning via ARP/NDP: Local MAC-IP learning via ARP/NDP always accompanies a local MAC learning event resulting from the ARP/NDP packet. However, MAC and MAC-IP learning can occur in any order.¶
Separate Sequence Numbers for MAC and IP: Use cases that do not maintain a constant 1:1 MAC-IP mapping across moves could potentially be addressed by using separate sequence numbers for MAC and IP components of the MAC+IP route. However, maintaining two separate sequence numbers adds significant complexity, debugging challenges, and backward compatibility issues. Therefore, this document addresses these requirements using a single sequence number attribute.¶
This section outlines the main components of the EVPN-IRB mobility solution specified in this document. Subsequent sections will detail the exact sequence number assignment procedures based on the concepts described here.¶
The key concept presented here is to treat a local MAC-IP route as a child of the corresponding local MAC route within the local context of a PE. This ensures that the local MAC-IP route inherits the sequence number attribute from the parent local MAC-only route. In terms of object dependencies, this could be represented as MAC-IP route being a dependent child of the parent MAC:¶
Mx-IPx -----> Mx (seq# = N)¶
Thus, both the parent MAC and child MAC-IP routes share a common sequence number associated with the parent MAC route. This ensures that a single sequence number attribute carried in a combined MAC+IP route represents the sequence number for both a MAC-only route and a MAC+IP route, making the advertisement of the MAC-only route truly optional. This enables a MAC to assume a different IP address upon moving and still establish the most recent reachability to the MAC across the overlay network via the mobility attribute associated with the MAC+IP route advertisement. For instance, when Mx moves to a new location, it would be assigned a higher sequence number at its new location per [RFC7432]. If this move results in Mx assuming a different IP address, IPz, the local Mx+IPz route would inherit the new sequence number from Mx.¶
Local MAC and local MAC-IP routes are typically sourced from data plane learning and ARP/NDP learning, respectively, and can be learned in the control plane in any order. An implementation can either replicate the inherited sequence number in each MAC-IP entry or maintain a single attribute in the parent MAC by creating a forward reference local MAC object for cases where a local MAC-IP is learned before the local MAC.¶
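One possible shape for the second option (a single attribute held by the parent MAC, with a forward reference object) is sketched below. The `LocalMac` and `MacTable` names, their fields, and the choice of 0 as the initial sequence number are assumptions made for illustration, not structures defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class LocalMac:
    """Parent MAC object; the sequence number lives only here."""
    mac: str
    seq: int = 0
    learned: bool = False  # False means forward-reference placeholder
    child_ips: set = field(default_factory=set)


class MacTable:
    def __init__(self):
        self.macs = {}

    def _get_or_create(self, mac):
        if mac not in self.macs:
            self.macs[mac] = LocalMac(mac)
        return self.macs[mac]

    def learn_mac(self, mac):
        """Data plane MAC learning; upgrades a placeholder if one exists."""
        entry = self._get_or_create(mac)
        entry.learned = True
        return entry

    def learn_mac_ip(self, mac, ip):
        """ARP/NDP learning; creates a placeholder parent if the MAC is
        not yet known, so the child always has a parent to inherit from."""
        entry = self._get_or_create(mac)
        entry.child_ips.add(ip)
        return entry

    def seq_for(self, mac, ip):
        """A MAC-IP route has no sequence number of its own."""
        entry = self.macs[mac]
        if ip not in entry.child_ips:
            raise KeyError((mac, ip))
        return entry.seq


table = MacTable()
table.learn_mac_ip("Mx", "IPx")   # MAC-IP arrives first
placeholder_before = table.macs["Mx"].learned
table.learn_mac("Mx")             # MAC arrives later
table.macs["Mx"].seq = 7          # any later update on the parent...
inherited = table.seq_for("Mx", "IPx")  # ...is what the child reports
```

Keeping the value only on the parent means a sequence number update is a single write, at the cost of a lookup through the parent whenever a MAC-IP route is advertised.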
For the shared MAC scenario, multiple local MAC-IP siblings inherit the sequence number attribute from the common parent MAC route:¶
In such cases, a host-IP move to a different physical server results in the IP moving to a new MAC binding. A new MAC-IP route resulting from this move must be advertised with a sequence number higher than the previous MAC-IP route for this IP, advertised from the prior location. For example, consider a route Mx-IPx currently advertised with sequence number N from PE1. If IPx moves to a new physical server behind PE2 and is associated with MAC Mz, the new local Mz-IPx route must be advertised with a sequence number higher than N and the previous Mz sequence number M. This allows PE devices, including PE1, PE2, and other remote PE devices, to determine and program the most recent MAC binding and reachability for the IP. PE1, upon receiving this new Mz-IPx route with sequence number N+1, would update IPx reachability via PE2 for symmetric IRB and update IPx's ARP/NDP binding to Mz for asymmetric IRB, while clearing and withdrawing the stale Mx-IPx route with the lower sequence number.¶
This implies that the sequence number associated with local MAC Mz and all local MAC-IP children of Mz at PE2 must be incremented to N+1, or to M+1 if the previous Mz sequence number M is greater than N, and re-advertised across the overlay. While this re-advertisement of all local MAC-IP children routes affected by the parent MAC route adds overhead, it avoids the need for maintaining and advertising two separate sequence number attributes for IP and MAC components of MAC+IP RT-2. An implementation must be able to look up MAC-IP routes for a given IP and update the sequence number for its parent MAC and its MAC-IP children.¶
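The arithmetic above reduces to taking the larger of N and M and adding one, then re-advertising every child of Mz with the result. A minimal sketch follows; the function names and the tuple representation of a route are hypothetical.

```python
def seq_after_ip_move(prev_binding_seq, new_parent_seq):
    """New sequence number for Mz when IPx moves to it.

    prev_binding_seq: N, the sequence number of the previous Mx-IPx route.
    new_parent_seq:   M, the sequence number Mz carried before the move.
    """
    return max(prev_binding_seq, new_parent_seq) + 1


def readvertise_children(mac, child_ips, seq):
    """All MAC-IP children of the parent MAC are re-advertised with the
    parent's updated sequence number (routes modeled as plain tuples)."""
    return [(mac, ip, seq) for ip in sorted(child_ips)]


# N = 4 and M = 1: the result is N+1.
seq_n_dominates = seq_after_ip_move(4, 1)
# N = 4 and M = 9: the result is M+1.
seq_m_dominates = seq_after_ip_move(4, 9)

# Mz already had children IPa and IPb; IPx joins them after the move.
updates = readvertise_children("Mz", {"IPa", "IPb", "IPx"}, seq_n_dominates)
```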
To support mobility for multi-homed hosts, local MAC and MAC-IP routes learned on a shared ES must be advertised with the same sequence number by all PE devices to which the ES is multi-homed. This applies to local MAC-only routes as well. Local MAC and MAC-IP may be learned natively via data plane and ARP/NDP respectively, as well as via SYNC from another multi-homing PE to achieve local switching. Local and SYNC route learning can occur in any order. Local MAC-IP routes advertised by all multi-homing PE devices sharing the ES must carry the same sequence number, independent of the order in which they are learned. This implies:¶
On local or SYNC MAC-IP route learning, the sequence number for the local MAC-IP route must be compared and updated to the higher value.¶
On local or SYNC MAC route learning, the sequence number for the local MAC route must be compared and updated to the higher value.¶
If an update to the local MAC-IP sequence number is required as a result of the comparison with the SYNC MAC-IP route, it essentially amounts to a sequence number update on the parent local MAC, resulting in an inherited sequence number update on the MAC-IP route.¶
The following sections specify the sequence number assignment procedures required for local and SYNC MAC and MAC-IP route learning events to achieve the objectives outlined.¶
A local Mx-IPx learning via ARP or NDP should result in the computation or re-computation of the parent MAC Mx's sequence number, following which the MAC-IP route Mx-IPx inherits the parent MAC's sequence number. The parent MAC Mx sequence number MUST be computed as follows:¶
MUST be higher than any existing remote MAC route for Mx, as per [RFC7432].¶
MUST be at least equal to the corresponding SYNC MAC sequence number, if present.¶
If the IP is also associated with a different remote MAC "Mz," it MUST be higher than the "Mz" sequence number.¶
Once the new sequence number for MAC route Mx is computed as per the above criteria, all local MAC-IPs associated with MAC Mx MUST inherit the updated sequence number.¶
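The three criteria above can be combined into one computation, sketched below. The function and parameter names are hypothetical. Two points are assumptions of this sketch rather than statements from this document: a MAC with no constraints starts at 0, and an existing local sequence number is never lowered.

```python
def parent_seq_on_local_mac_ip(current_local=None, remote_mx=None,
                               sync_mx=None, remote_mz=None):
    """Sequence number for parent MAC Mx after local Mx-IPx learning.

    current_local: existing local Mx sequence number, if any (assumed floor).
    remote_mx:     sequence number of an existing remote route for Mx.
    sync_mx:       SYNC MAC sequence number for Mx.
    remote_mz:     sequence number of a different remote MAC Mz that the
                   same IP is also associated with.
    Any argument may be None when the corresponding route is absent.
    """
    seq = 0 if current_local is None else current_local
    if remote_mx is not None:
        seq = max(seq, remote_mx + 1)  # strictly higher than remote Mx
    if sync_mx is not None:
        seq = max(seq, sync_mx)        # at least equal to SYNC MAC
    if remote_mz is not None:
        seq = max(seq, remote_mz + 1)  # strictly higher than remote Mz
    return seq


first_learn = parent_seq_on_local_mac_ip()
mac_moved_in = parent_seq_on_local_mac_ip(remote_mx=3)
sync_is_higher = parent_seq_on_local_mac_ip(remote_mx=3, sync_mx=6)
ip_changed_mac = parent_seq_on_local_mac_ip(sync_mx=2, remote_mz=5)
```

Every local MAC-IP child of Mx then reports the returned value, since the children carry no sequence number of their own.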
The local MAC Mx sequence number MUST be computed as follows:¶
MUST be higher than any existing remote MAC route for Mx, as per [RFC7432].¶
MUST be at least equal to the corresponding SYNC MAC sequence number if one is present. If the existing local MAC sequence number is less than the SYNC MAC sequence number, the PE MUST update the local MAC sequence number to be equal to the SYNC MAC sequence number. If the existing local MAC sequence number is equal to or greater than the SYNC MAC sequence number, no update is required to the local MAC sequence number.¶
Once the new sequence number for MAC route Mx is computed as per the above criteria, all local MAC-IPs associated with MAC Mx MUST inherit the updated sequence number. Note that the local MAC sequence number might already be present if there was a local MAC-IP learned prior to the local MAC, in which case the above may not result in any change in the local MAC's sequence number.¶
Upon receiving a remote MAC or MAC-IP route update associated with a MAC Mx with a sequence number that is:¶
Either higher than the sequence number assigned to a local route for MAC Mx,¶
Or equal to the sequence number assigned to a local route for MAC Mx, but the remote route is selected as best due to a lower VTEP IP as per [RFC7432],¶
the following actions are REQUIRED on the receiving PE:¶
Upon receiving a REMOTE SYNC, the corresponding local MAC Mx (if present) sequence number should be re-computed as follows:¶
If the current sequence number is less than the received SYNC MAC sequence number, it MUST be increased to be equal to the received SYNC MAC sequence number.¶
If a local MAC sequence number is updated as a result of the above, all local MAC-IPs associated with MAC Mx MUST inherit the updated sequence number.¶
Receiving a SYNC MAC-IP for a locally attached host results in a derived SYNC MAC Mx route entry, as the MAC-only RT-2 advertisement is optional. The corresponding local MAC Mx (if present) sequence number should be re-computed as follows:¶
If the current sequence number is less than the received SYNC MAC sequence number, it MUST be increased to be equal to the received SYNC MAC sequence number.¶
If a local MAC sequence number is updated as a result of the above, all local MAC-IPs associated with MAC Mx MUST inherit the updated sequence number.¶
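Both SYNC cases above apply the same rule: raise the local MAC sequence number to the SYNC value if it is lower, then let the children inherit it. The sketch below is illustrative only; the `LocalMac` class and `apply_sync` method are hypothetical names, and the starting values reuse the earlier multi-homing example in which one PE assigned 0 and its peer assigned N+1.

```python
from dataclasses import dataclass, field


@dataclass
class LocalMac:
    """Local MAC with its MAC-IP children, which carry no own sequence number."""
    seq: int
    child_ips: set = field(default_factory=set)

    def apply_sync(self, sync_seq):
        """Apply a SYNC MAC sequence number (received directly, or derived
        from a SYNC MAC-IP route).

        Returns the (ip, seq) pairs that need re-advertisement, or an
        empty list when the local value was already high enough.
        """
        if self.seq >= sync_seq:
            return []
        self.seq = sync_seq
        return [(ip, self.seq) for ip in sorted(self.child_ips)]


N = 4
pe1 = LocalMac(seq=0, child_ips={"IPx"})       # learned as a new route
pe2 = LocalMac(seq=N + 1, child_ips={"IPx"})   # learned as a move

pe1_updates = pe1.apply_sync(pe2.seq)  # PE1 is raised to N+1
pe2_updates = pe2.apply_sync(pe1.seq)  # PE2 already holds N+1, no change
```

Because the rule only ever raises the value, the outcome does not depend on which PE processes the SYNC route first.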
Generally, if all PE nodes in the overlay network follow the above sequence number assignment procedures and the PE is advertising both MAC+IP and MAC routes, the sequence numbers advertised with the MAC and MAC+IP routes with the same MAC would always be the same. However, an interoperability scenario with a different implementation could arise, where a non-compliant PE implementation assigns and advertises independent sequence numbers to MAC and MAC+IP routes. To handle this case, if different sequence numbers are received for remote MAC+IP and corresponding remote MAC routes from a remote PE, the sequence number associated with the remote MAC route MUST be computed and interpreted as:¶
The highest of all received sequence numbers with remote MAC+IP and MAC routes with the same MAC.¶
The MAC sequence number would be re-computed on a MAC or MAC+IP route withdraw as per the above.¶
A MAC and/or IP move to the local PE would then result in the MAC (and hence all MAC-IP) sequence numbers being incremented from the above computed remote MAC sequence number.¶
If MAC-only routes are not advertised at all, and different sequence numbers are received with multiple MAC+IP routes for a given MAC, the sequence number associated with the derived remote MAC route should still be computed as the highest of all received MAC+IP sequence numbers with the same MAC.¶
Note that it is not required for a PE to maintain explicit knowledge of a remote PE being compliant or non-compliant with this specification as long as it implements the above logic to handle remote sequence numbers that are not synchronized between MAC route and MAC-IP route(s) for the same remote MAC.¶
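A minimal sketch of this interoperability handling is shown below. The `RemoteMacView` class is hypothetical and, as a simplification, tracks routes from a single remote PE; MAC-only routes are keyed with `ip=None`.

```python
class RemoteMacView:
    """Routes received from one remote PE, keyed by (mac, ip)."""

    def __init__(self):
        self._routes = {}

    def update(self, mac, ip, seq):
        self._routes[(mac, ip)] = seq

    def withdraw(self, mac, ip):
        self._routes.pop((mac, ip), None)

    def mac_seq(self, mac):
        """Highest sequence number over all MAC and MAC+IP routes for `mac`,
        or None if no route for this MAC remains."""
        seqs = [s for (m, _), s in self._routes.items() if m == mac]
        return max(seqs) if seqs else None


view = RemoteMacView()
view.update("M1", None, 2)    # MAC-only route
view.update("M1", "IP1", 5)   # MAC+IP routes with unsynchronized values
view.update("M1", "IP2", 3)
seq_all = view.mac_seq("M1")

view.withdraw("M1", "IP1")    # re-computed on withdraw
seq_after_withdraw = view.mac_seq("M1")

# A move of M1 to the local PE increments from the computed remote value.
local_seq_after_move = seq_after_withdraw + 1
```

The receiver never needs to know whether the sender is compliant; taking the maximum yields the same result in both cases.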
In the MAC sharing use case described in section 5.2, a race condition is possible with simultaneous host moves between a pair of PEs. The example scenario below illustrates this race condition and its remediation:¶
PE1 with locally attached host IPs I1 and I2 that share MAC M1. PE1 as a result has local MAC-IP routes I1-M1 and I2-M1.¶
PE2 with locally attached host IPs I3 and I4 that share MAC M2. PE2 as a result has local MAC-IP routes I3-M2 and I4-M2.¶
A simultaneous move of I1 from PE1 to PE2 and of I3 from PE2 to PE1 will cause I1's MAC to change from M1 to M2 and cause I3's MAC to change from M2 to M1.¶
Route I3-M1 may be learnt on PE1 before I1's local entry I1-M1 has been probed out on PE1 and/or route I1-M2 may be learnt on PE2 before I3's local entry I3-M2 has been probed out on PE2.¶
In such a scenario, the MAC sequence number assignment rules defined in section 6.1 will cause the new MAC-IP routes I1-M2 and I3-M1 to bounce between PE1 and PE2 with sequence number increments until the stale entries I1-M1 and I3-M2 have been probed out from PE1 and PE2, respectively.¶
An implementation MUST ensure proper probing procedures to remove stale ARP, NDP, and local MAC entries, following a move, on learning remote routes as defined in section 6.3 (and as per [RFC9135]) to minimize exposure to this race condition.¶
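The bounce can be pictured with a deliberately simplified model, sketched below. It is an assumption of this sketch, not a statement of the procedures, that the PEs react in strict alternation and that each reaction raises the local MAC sequence number just above the other MAC's value; `bounce_history` and `rounds_until_probe_out` are hypothetical names.

```python
def bounce_history(seq_m1, seq_m2, rounds_until_probe_out):
    """Model sequence number growth while stale entries I1-M1 and I3-M2 persist.

    In each round, PE1 raises M1 above M2 (an IP under M1 is also seen
    under remote M2), then PE2 raises M2 above M1 for the mirror reason.
    The loop ends when the stale entries are probed out.
    """
    history = []
    for _ in range(rounds_until_probe_out):
        seq_m1 = max(seq_m1, seq_m2 + 1)
        seq_m2 = max(seq_m2, seq_m1 + 1)
        history.append((seq_m1, seq_m2))
    return history


slow_probe = bounce_history(0, 0, rounds_until_probe_out=3)
fast_probe = bounce_history(0, 0, rounds_until_probe_out=1)
```

In this model the number of re-advertisements grows with the time taken to probe out the stale entries, which is why prompt probing limits exposure to the race.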
This section is optional and details ARP and NDP probing procedures that MAY be implemented to achieve faster host re-learning and convergence on mobility events. PE1 and PE2 are used as two example PEs in the network to illustrate the mobility convergence scenarios in this section.¶
Following a host move from PE1 to PE2, the host's MAC is discovered at PE2 as a local MAC via data frames received from the host. If PE2 has a prior remote MAC-IP host route for this MAC from PE1, an ARP/NDP probe MAY be triggered at PE2 to learn the MAC-IP as a local adjacency and trigger EVPN RT-2 advertisement for this MAC-IP across the overlay with new reachability via PE2. This results in a reliable "event-based" host IP learning triggered by a "MAC learning event" across the overlay, and hence faster convergence of overlay routed flows to the host.¶
Following a host move from PE1 to PE2, once PE1 receives a MAC or MAC-IP route from PE2 with a higher sequence number, an ARP/NDP probe MAY be triggered at PE1 to clear the stale local MAC-IP neighbor adjacency or to re-learn the local MAC-IP in case the host has moved back or is duplicated.¶
Following a local MAC age-out, if there is a local IP adjacency with this MAC, an ARP/NDP probe MAY be triggered for this IP to either re-learn the local MAC and maintain local L3 and L2 reachability to this host or to clear the ARP/NDP entry if the host is no longer local. This accomplishes the clearance of stale ARP/NDP entries triggered by a MAC age-out event even when the ARP/NDP refresh timer is longer than the MAC age-out timer. Clearing stale IP neighbor entries facilitates traffic convergence if the host was silent and not discovered at its new location. Once the stale neighbor entry for the host is cleared, routed traffic flow destined for the host can re-trigger ARP/NDP discovery for this host at the new location.¶
The above probing logic may be generalized as probing for an IP neighbor anytime a resolving parent MAC route is inconsistent with the MAC-IP neighbor route, where inconsistency is defined as being not present or conflicting in terms of the route source being local or remote. The MAC-IP to MAC parent relationship described in section 5.1 MAY be used to achieve this logic.¶
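The generalized probing rule above can be illustrated with a minimal, non-normative sketch. The type and function names (Source, MacRoute, MacIpNeighbor, should_probe) are hypothetical and chosen only for illustration; an implementation is free to model its tables differently.¶

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    LOCAL = "local"
    REMOTE = "remote"


@dataclass(frozen=True)
class MacRoute:
    """Hypothetical model of the resolving parent MAC route."""
    mac: str
    source: Source


@dataclass(frozen=True)
class MacIpNeighbor:
    """Hypothetical model of a MAC-IP neighbor route."""
    ip: str
    mac: str
    source: Source


def should_probe(neighbor: MacIpNeighbor, parent: Optional[MacRoute]) -> bool:
    """Return True when the parent MAC route is inconsistent with the neighbor."""
    if parent is None:
        # Parent MAC route is not present, for example after a local MAC age-out.
        return True
    # Conflict: one route is locally learned and the other is remote.
    return parent.source is not neighbor.source
```

Under these assumptions, the three probing triggers described in this section map to a locally learned MAC with a remote MAC-IP route, a remote MAC with a local MAC-IP adjacency, and a missing parent MAC, respectively.¶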
An additional use case involves traffic to an end host in the overlay being entirely IP routed. In such a purely routed overlay:¶
A host MAC is never advertised in the EVPN overlay control plane.¶
Host /32 or /128 IP reachability is distributed across the overlay via EVPN Route Type 5 (RT-5) along with a zero or non-zero ESI.¶
An overlay IP subnet may still be stretched across the underlay fabric. However, intra-subnet traffic across the stretched overlay is never bridged.¶
Both inter-subnet and intra-subnet traffic in the overlay is IP routed at the EVPN PE.¶
Please refer to [RFC7814] for more details.¶
Host mobility within the stretched subnet still needs support. In the absence of host MAC routes, the sequence number mobility Extended Community specified in [RFC7432] section 7.7 MAY be associated with a /32 or /128 host IP prefix advertised via EVPN Route Type 5. MAC mobility procedures defined in [RFC7432] can be applied to host IP prefixes as follows:¶
On local learning of a host IP on a new ESI, the host IP MUST be advertised with a sequence number higher than what is currently advertised with the old ESI.¶
On receiving a host IP route advertisement with a higher sequence number, a PE MUST trigger ARP/NDP probe and deletion procedures on any local route for that IP with a lower sequence number. The PE will update the forwarding entry to point to the remote route with a higher sequence number and send an ARP/NDP probe for the local IP route. If the IP has moved, the probe will time out, and the local IP host route will be deleted.¶
Note that there is only one sequence number associated with a host route at any time. For previous use cases where a host MAC is advertised along with the host IP, a sequence number is only associated with the MAC. If the MAC is not advertised, as in this use case, a sequence number is associated with the host IP.¶
This mobility procedure does not apply to "anycast IPv6" hosts advertised via NA messages with the Override Flag (O Flag) set to 0. Refer to [RFC9161] for more details.¶
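As a non-normative illustration, the host IP mobility steps for a purely routed overlay might be sketched as follows. The HostIpRoute type, the function names, and the action strings are hypothetical. The starting sequence number of 0 and the choice of "old value plus one" as the higher sequence number are assumptions of this sketch. Handling of equal or lower sequence numbers is intentionally left out.¶

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostIpRoute:
    """Hypothetical model of a /32 or /128 host route advertised via RT-5."""
    ip: str
    esi: str
    seq: int


def seq_for_local_learn(old: Optional[HostIpRoute]) -> int:
    """Sequence number to advertise when a host IP is learned on a new ESI."""
    # Assumption: a host IP with no prior route starts at sequence number 0.
    return 0 if old is None else old.seq + 1


def on_remote_route(local: Optional[HostIpRoute], remote: HostIpRoute) -> List[str]:
    """Actions taken by a PE on receiving a remote host IP route."""
    if local is None:
        return ["install_remote"]
    if remote.seq > local.seq:
        # Point forwarding at the remote route and probe the local entry.
        # If the probe times out, the local host route is deleted.
        return ["install_remote", "probe_local"]
    # Equal or lower sequence numbers are outside the scope of this sketch.
    return []
```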
Duplicate host detection scenarios across EVPN-IRB can be classified as follows:¶
Scenario A: Two hosts have the same MAC address (host IPs may or may not be duplicates).¶
Scenario B: Two hosts have the same IP address but different MAC addresses.¶
Scenario C: Two hosts have the same IP address, and the host MAC is not advertised.¶
As specified in [RFC9161], duplicate detection procedures for Scenarios B and C do not apply to "anycast IPv6" hosts advertised via NA messages with the Override Flag (O Flag) set to 0.¶
In cases where duplicate hosts share the same MAC address, the MAC is detected as duplicate using the duplicate MAC detection procedure described in [RFC7432]. Corresponding MAC-IP routes with the same MAC do not require separate duplicate detection and MUST inherit the duplicate property from the MAC route. If a MAC route is marked as duplicate, all associated MAC-IP routes MUST also be treated as duplicates. Duplicate detection procedures need only be applied to MAC routes.¶
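The inheritance rule for Scenario A can be illustrated with a short, non-normative sketch. The MacRoute and MacIpRoute types and the mac_ip_is_duplicate function are hypothetical names. The sketch assumes that the duplicate flag on a MAC route is set elsewhere by the [RFC7432] procedure.¶

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MacRoute:
    """Hypothetical MAC route; `duplicate` is set by the RFC 7432 procedure."""
    mac: str
    duplicate: bool = False


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical MAC-IP route that resolves through a parent MAC route."""
    mac: str
    ip: str


def mac_ip_is_duplicate(route: MacIpRoute, mac_routes: Dict[str, MacRoute]) -> bool:
    """A MAC-IP route carries no duplicate state of its own in this scenario;
    it inherits the duplicate property from its parent MAC route."""
    parent = mac_routes.get(route.mac)
    return parent is not None and parent.duplicate
```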
Misconfigurations may lead to different MAC addresses being assigned the same IP address. This scenario is not detected by [RFC7432] duplicate MAC detection procedures and can result in incorrect routing of traffic destined for the IP address.¶
Such situations, when detected locally, are identified as a move scenario through the local MAC sequence number computation procedure described in section 6.1:¶
If the IP is associated with a different remote MAC "Mz," the sequence number MUST be higher than the "Mz" sequence number.¶
This move increments the sequence number of the local MAC because a remote MAC-IP route associates the same IP with a different MAC. It counts as an "IP move" against the IP, independent of the MAC. The duplicate detection procedure described in [RFC7432] can then be applied to the IP entity independent of the MAC. Once an IP is detected as duplicate, the corresponding MAC-IP route should be treated as duplicate. Associated MAC routes and any other MAC-IP routes related to this MAC should not be affected.¶
The duplicate IP detection procedure for this scenario is specified in [RFC9161]. An "IP move" is further clarified as follows:¶
Upon learning a local MAC-IP route Mx-IPx, check for existing remote or local routes for IPx with a different MAC association (Mz-IPx). If found, count this as an "IP move" for IPx, independent of the MAC.¶
Upon learning a remote MAC-IP route Mz-IPx, check for existing local routes for IPx with a different MAC association (Mx-IPx). If found, count this as an "IP move" for IPx, independent of the MAC.¶
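The two "IP move" checks above can be sketched as follows. This is a non-normative illustration; the MacIpRoute type and the is_ip_move function are hypothetical names, and the route table is modeled as a simple iterable.¶

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical MAC-IP route; `local` is True for locally learned routes."""
    mac: str
    ip: str
    local: bool


def is_ip_move(learned: MacIpRoute, existing: Iterable[MacIpRoute]) -> bool:
    """Return True if learning `learned` counts as an IP move for its IP."""
    for route in existing:
        if route.ip != learned.ip or route.mac == learned.mac:
            continue  # Different IP, or same MAC association: not an IP move.
        # A locally learned route is checked against remote and local routes.
        # A remotely learned route is checked against local routes only.
        if learned.local or route.local:
            return True
    return False
```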
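A MAC-IP route SHOULD be treated as duplicate if either the corresponding MAC route is marked as duplicate, as in Scenario A, or the corresponding IP is marked as duplicate through the "IP move" counting described above.¶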
In a purely routed overlay scenario, as described in section 7, where only a host IP is advertised via EVPN RT-5 with a sequence number mobility attribute, the duplicate MAC detection procedures specified in [RFC7432] can be applied analogously to IP-only host routes for duplicate IP detection.¶
Upon learning a local host IP route IPx, check for existing remote or local routes for IPx with a different ESI association. If found, count this as an "IP move" for IPx.¶
Upon learning a remote host IP route IPx, check for existing local routes for IPx with a different ESI association. If found, count this as an "IP move" for IPx.¶
Using configurable parameters "N" and "M," if "N" IP moves are detected within "M" seconds for IPx, IPx should be treated as duplicate.¶
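The "N" moves within "M" seconds check can be illustrated with a small, non-normative sketch. The IpMoveDetector class and its method names are hypothetical. The sketch assumes a sliding window over move timestamps, which is one possible reading of the timer; an implementation may instead start a fixed timer on the first move.¶

```python
from collections import deque
from typing import Deque, Dict


class IpMoveDetector:
    """Hypothetical per-IP move counter using a sliding time window."""

    def __init__(self, n_moves: int, m_seconds: float) -> None:
        self.n_moves = n_moves      # "N" in the text above
        self.m_seconds = m_seconds  # "M" in the text above
        self._moves: Dict[str, Deque[float]] = {}

    def record_move(self, ip: str, now: float) -> bool:
        """Record one IP move at time `now` (seconds).

        Returns True if the IP should now be treated as duplicate.
        """
        window = self._moves.setdefault(ip, deque())
        window.append(now)
        # Discard moves that are older than M seconds.
        while window and now - window[0] > self.m_seconds:
            window.popleft()
        return len(window) >= self.n_moves
```

With the illustrative values N = 3 and M = 10, three moves recorded at 0, 1, and 2 seconds mark the IP as duplicate on the third move, whereas moves spaced 20 seconds apart never do.¶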
Once a MAC or IP is marked as duplicate and frozen, corrective action must be taken to un-provision one of the duplicate MAC or IP addresses. Un-provisioning refers to corrective action taken on the host side. Following this correction, normal operation will not resume until the duplicate MAC or IP ages out unless additional action is taken to expedite recovery.¶
Possible additional corrective actions for faster recovery include:¶
A CLI that unfreezes the duplicate and frozen MAC or IP can be used to recover from the duplicate and frozen state once the duplicate MAC or IP has been un-provisioned. Unfreezing the MAC or IP should result in advertising it with a sequence number higher than that advertised from the other location.¶
Two scenarios exist:¶
Scenario A: The duplicate MAC or IP is un-provisioned at the location where it was not marked as duplicate.¶
Scenario B: The duplicate MAC or IP is un-provisioned at the location where it was marked as duplicate.¶
Unfreezing the duplicate and frozen MAC or IP will result in recovery to a steady state as follows:¶
Scenario A: If the duplicate MAC or IP is un-provisioned at the non-duplicate location, unfreezing the route at the frozen location results in advertising with a higher sequence number, leading to automatic clearing of the local route at the un-provisioned location via ARP/NDP PROBE and DELETE procedures.¶
Scenario B: If the duplicate host is un-provisioned at the duplicate location, unfreezing the route triggers an advertisement with a higher sequence number to the other location, prompting re-learning and clearing of the local route at the original location upon receiving the remote route advertisement.¶
Probes referred to in these scenarios are event-driven probes resulting from receiving a route with a higher sequence number. Periodic probes resulting from refresh timers may also occur independently.¶
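The sequence number handling on unfreeze can be sketched as follows. This is a non-normative illustration; the FrozenRoute type and the unfreeze function are hypothetical names. Taking the maximum of the local and remote sequence numbers before incrementing is an assumption of this sketch, made so that the local sequence number never decreases.¶

```python
from dataclasses import dataclass


@dataclass
class FrozenRoute:
    """Hypothetical state kept for a duplicate and frozen MAC or IP."""
    key: str         # MAC or IP address
    local_seq: int   # sequence number last advertised locally
    remote_seq: int  # sequence number advertised from the other location
    frozen: bool = True


def unfreeze(route: FrozenRoute) -> int:
    """Clear the frozen state and return the sequence number to advertise."""
    route.frozen = False
    # The new sequence number must be higher than the one advertised from
    # the other location, which triggers probing and clearing there.
    route.local_seq = max(route.local_seq, route.remote_seq) + 1
    return route.local_seq
```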
In addition to the above, route clearing CLIs may be used to clear the local MAC or IP route after the duplicate host is un-provisioned:¶
Clear MAC CLI: Used to clear a duplicate MAC route.¶
Clear ARP/NDP: Used to clear a duplicate IP route.¶
The route unfreeze CLI may still need to be executed if the route was un-provisioned and cleared from the non-duplicate location. Given that unfreezing the route via the CLI would result in auto-clearing from the un-provisioned location, as explained earlier, using a route clearing CLI for recovery from the duplicate state is optional.¶
Security considerations discussed in [RFC7432] and [RFC9135] apply to this document. Methods described in this document further extend the consumption of sequence numbers for IRB deployments. They are hence subject to the same considerations if the control plane or data plane were to be compromised. As an example, if the host-facing data plane is compromised, spoofing attempts could result in a legitimate host being perceived as moved, eventually resulting in the host being marked as duplicate. Considerations for protecting the control and data planes described in [RFC7432] are equally applicable to such mobility spoofing use cases.¶
None.¶
The authors would like to thank Gunter van de Velde for significant contributions to improving the readability of this document. The authors would also like to thank Sonal Agarwal and Larry Kreeger for multiple contributions through the implementation process, and Vibov Bhan and Patrice Brissette for early feedback during implementation and testing of several procedures defined in this document. The authors thank Wen Lin for a detailed review and valuable comments related to MAC sharing race conditions, and Saumya Dikshit for a detailed review and valuable comments across the document.¶