As enterprises build data centers at different locations for disaster recovery and traffic distribution, there is a need to interconnect them transparently. Stretching Layer 2 across a WAN poses some challenges:
1) Workload mobility, that is, VM migration from one DC to another.
2) Fast convergence in a multihomed environment.
3) Load balancing across multiple active paths between data centers.
Workload mobility suffers from the trombone effect when VMs migrate across a WAN.
When VM1 is moved from a hypervisor in DC1 to a hypervisor in DC2, the default gateway (GW) for VM1 still resides in DC1. When VM1 sends traffic to VM2, the traffic must first traverse the core to reach the gateway in DC1 before tromboning back to DC2.
EVPN solves this. EVPN is similar to VPLS, except that MAC addresses are learned and exchanged in the control plane, using BGP as the transport protocol. A new BGP address family called EVPN is introduced:
bgp {
    group IBGP {
        type internal;
        local-address 1.1.1.1;
        family evpn {
            signaling;
        }
        neighbor 2.2.2.2;
    }
}
First, an overview of how EVPN works.
In a multi-tenant environment, each tenant corresponds to an EVPN instance (EVI). Route distinguishers keep the routes of each EVI distinct, and route targets control which EVIs share learned MAC addresses with each other.
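As a rough illustration of the route target logic described above, here is a minimal Python sketch. The class names, RD values and RT values are all assumptions made up for this example; a real PE does this inside its BGP implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MacRoute:
    """Simplified stand-in for an EVPN MAC advertisement (hypothetical model)."""
    rd: str                    # route distinguisher of the originating EVI
    mac: str
    route_targets: frozenset   # route targets attached on export


@dataclass
class Evi:
    """One tenant's EVPN instance on a PE (illustrative only)."""
    name: str
    rd: str
    export_rts: frozenset
    import_rts: frozenset
    mac_table: dict = field(default_factory=dict)

    def export(self, mac):
        # The RD keeps the route unique per EVI; the RTs say who may import it.
        return MacRoute(self.rd, mac, self.export_rts)

    def maybe_import(self, route):
        # Accept the route only if it carries at least one RT this EVI imports.
        if route.route_targets & self.import_rts:
            self.mac_table[route.mac] = route.rd
            return True
        return False


RT_A = frozenset({"target:65000:100"})   # assumed RT for tenant A
RT_B = frozenset({"target:65000:200"})   # assumed RT for tenant B

tenant_a_dc1 = Evi("A-DC1", "1.1.1.1:100", RT_A, RT_A)
tenant_a_dc2 = Evi("A-DC2", "2.2.2.2:100", RT_A, RT_A)
tenant_b_dc2 = Evi("B-DC2", "2.2.2.2:200", RT_B, RT_B)

route = tenant_a_dc1.export("00:00:5e:00:53:01")
imported_by_a = tenant_a_dc2.maybe_import(route)   # same tenant: accepted
imported_by_b = tenant_b_dc2.maybe_import(route)   # other tenant: ignored
```

Tenant A's instance in DC2 installs the MAC because the route targets match, while tenant B's instance on the same PE never sees it.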
For MAC learning, each PE router snoops DHCP and/or ARP (IPv4) / ND (IPv6) packets for a particular EVI. The PE then advertises the locally learned MAC addresses to remote PE nodes through MP-iBGP. The route format carries a MAC address length field, intended to allow a MAC prefix to be advertised rather than every single MAC address and so scale to thousands of MAC addresses; note that RFC 7432 sets this length to 48 bits, so in practice each MAC address is advertised individually. When a remote PE receives the BGP update, it extracts the MAC address and installs it in a table with the next hop pointing to the LSP of the advertising PE. Because this is BGP, policies can be created to filter routes and influence forwarding decisions.
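The learn-and-advertise flow above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Pe` class, the interface name and the direct method call standing in for a BGP update are all invented for illustration.

```python
class Pe:
    """Toy PE router: learns MACs locally and hears about remote ones via 'BGP'."""

    def __init__(self, name):
        self.name = name
        self.peers = []          # other PEs in the same EVI
        self.local_macs = {}     # mac -> local interface
        self.remote_macs = {}    # mac -> advertising PE (stands in for its LSP)

    def learn_local(self, mac, interface):
        # Learned from snooped host traffic, then advertised to every peer.
        self.local_macs[mac] = interface
        for peer in self.peers:
            peer.receive_advertisement(mac, next_hop=self.name)

    def receive_advertisement(self, mac, next_hop):
        # A remote PE installs the MAC with the advertising PE as next hop.
        self.remote_macs[mac] = next_hop

    def lookup(self, mac):
        if mac in self.local_macs:
            return ("local", self.local_macs[mac])
        if mac in self.remote_macs:
            return ("remote", self.remote_macs[mac])
        return ("flood", None)   # unknown unicast


pe1, pe2 = Pe("PE1"), Pe("PE2")
pe1.peers.append(pe2)
pe2.peers.append(pe1)
pe1.learn_local("00:00:5e:00:53:01", "ge-0/0/1.100")   # hypothetical interface
```

After the single `learn_local` call, PE2 already knows to send frames for that MAC toward PE1 without having flooded anything.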
When a local PE router sees an ARP request for an IP address and already holds the MAC binding for that IP address from across the WAN, it performs proxy ARP: it answers the ARP request itself and can make the forwarding decision locally. This reduces BUM (broadcast, unknown unicast and multicast) flooding across WAN links.
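The proxy ARP decision boils down to a table lookup. A minimal sketch, assuming a hypothetical `remote_bindings` dictionary that holds the IP-to-MAC bindings learned from remote PEs:

```python
def handle_arp_request(target_ip, remote_bindings):
    """Decide what a PE does with an ARP request arriving from a local host.

    remote_bindings maps IP -> MAC for hosts learned from remote PEs via BGP.
    """
    mac = remote_bindings.get(target_ip)
    if mac is not None:
        # Binding already known: answer on behalf of the remote host.
        return ("proxy-reply", mac)
    # Unknown target: the request still has to be flooded as broadcast.
    return ("flood", None)


bindings = {"192.0.2.20": "00:00:5e:00:53:02"}   # illustrative addresses
known = handle_arp_request("192.0.2.20", bindings)
unknown = handle_arp_request("192.0.2.99", bindings)
```

Only the request for the unknown address has to cross the WAN as a broadcast.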
Synchronizing gateway IP and MAC addresses in EVPN allows a host to use the nearest gateway to route traffic. You do this by creating IRB interfaces on both PEs, each with a different GW IP address. The IRB addresses (IP + MAC) are then advertised to the other PEs using a BGP extended community. When VM1 migrates to DC2, it still sends packets to the MAC address associated with the GW IP address of DC1. The IRB in DC2 recognizes that the destination MAC address of these packets belongs to the gateway across the WAN, so it routes them locally. When the ARP entry for the GW expires on VM1, the VM ARPs again and the IRB in DC2 replies to VM1 with its updated MAC address.
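The effect of gateway synchronization on a migrated VM can be sketched as a simple decision function. The function name and the MAC values are assumptions for illustration only.

```python
def irb_decision(dst_mac, local_gw_mac, synced_gw_macs):
    """Return how the local IRB treats a frame, given gateway MACs synced via BGP."""
    if dst_mac == local_gw_mac:
        return "route-locally"           # normal first-hop routing
    if dst_mac in synced_gw_macs:
        # Frame is addressed to a remote DC's gateway: route it here instead
        # of bridging it across the WAN.
        return "route-locally"
    return "bridge"                      # ordinary Layer 2 forwarding


DC1_GW_MAC = "00:00:5e:00:53:a1"   # assumed gateway MAC in DC1
DC2_GW_MAC = "00:00:5e:00:53:a2"   # assumed gateway MAC in DC2

# VM1 has moved to DC2 but still has DC1's gateway MAC in its ARP cache.
after_move = irb_decision(DC1_GW_MAC, DC2_GW_MAC, {DC1_GW_MAC})
host_to_host = irb_decision("00:00:5e:00:53:02", DC2_GW_MAC, {DC1_GW_MAC})
```

The migrated VM's traffic is routed in DC2 even before its ARP cache is refreshed, which is what removes the trombone.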
VM migration in an EVPN network also triggers a MAC move: the MAC address of the VM is now advertised from DC2, the PE in DC2 updates its MAC table, and the PE in DC1 withdraws the entry.
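RFC 7432 attaches a MAC Mobility extended community with a sequence number so that PEs can tell which advertisement of a moved MAC is the newest. The following Python sketch models that idea with a plain dictionary; the function and variable names are hypothetical and the real procedure has more cases than shown here.

```python
def advertise_mac(best_routes, mac, pe):
    """Record that `pe` now sees `mac` locally.

    best_routes maps mac -> (owning PE, sequence number). Returns the PE
    that should withdraw its stale route, or None if nothing moved.
    """
    current = best_routes.get(mac)
    if current is None:
        best_routes[mac] = (pe, 0)       # first time the MAC is seen
        return None
    owner, seq = current
    if owner == pe:
        return None                      # no move, nothing to do
    # MAC moved: the new PE advertises with a higher sequence number,
    # so every PE prefers it and the old owner withdraws its route.
    best_routes[mac] = (pe, seq + 1)
    return owner


routes = {}
vm1 = "00:00:5e:00:53:01"
first = advertise_mac(routes, vm1, "PE-DC1")    # VM1 boots in DC1
stale = advertise_mac(routes, vm1, "PE-DC2")    # VM1 migrates to DC2
```

After the second call the table points at the DC2 PE and the return value names the DC1 PE as the one that must withdraw.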
To address fast convergence in a multihomed environment, EVPN introduces the concept of an Ethernet segment. The set of links connecting a site to two or more local PE routers is called an Ethernet segment. Each segment has a unique identifier called an ESI (Ethernet Segment Identifier). An Ethernet tag is also used to identify each broadcast domain, such as a VLAN. When an Ethernet segment fails, the local PE withdraws the corresponding Ethernet "route" from BGP, which triggers all remote PE routers to update their forwarding tables so that the corresponding next hop points to the backup PE.
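Why a single withdrawal converges quickly can be seen in a small model of the remote PE's state. This is a sketch under assumptions: the `RemotePe` class, its method names and the ESI value are invented, and MACs are resolved through the segment rather than pinned to one PE.

```python
class RemotePe:
    """Toy view of a remote PE's forwarding state for multihomed segments."""

    def __init__(self):
        self.segment_pes = {}   # ESI -> set of PEs currently attached
        self.mac_segment = {}   # MAC -> ESI it was learned behind

    def on_segment_route(self, esi, pe):
        self.segment_pes.setdefault(esi, set()).add(pe)

    def on_segment_withdraw(self, esi, pe):
        # One withdrawal repairs every MAC behind the segment at once.
        self.segment_pes.get(esi, set()).discard(pe)

    def on_mac_route(self, mac, esi):
        self.mac_segment[mac] = esi

    def next_hops(self, mac):
        esi = self.mac_segment[mac]
        return sorted(self.segment_pes.get(esi, set()))


ESI = "00:11:22:33:44:55:66:77:88:99"   # made-up identifier
remote = RemotePe()
remote.on_segment_route(ESI, "PE1")
remote.on_segment_route(ESI, "PE2")
remote.on_mac_route("00:00:5e:00:53:01", ESI)
before = remote.next_hops("00:00:5e:00:53:01")
remote.on_segment_withdraw(ESI, "PE1")         # PE1 loses its link to the segment
after = remote.next_hops("00:00:5e:00:53:01")
```

Because the MAC entry refers to the segment, removing PE1 from the segment updates the next hop for every MAC behind it without touching the individual MAC routes.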
EVPN also introduces split horizon. BUM (broadcast, unknown unicast and multicast) traffic is encapsulated in an MPLS packet tagged with the Ethernet Segment Identifier of the segment it came from. This lets the egress PE make a forwarding decision and prevents loops, because the PEs know where the packet originated.
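The split horizon check on the egress PE amounts to filtering out the originating segment. A minimal sketch, with hypothetical port names and ESI strings:

```python
def bum_egress_ports(source_esi, local_ports):
    """Pick the local ports an egress PE may flood a BUM frame to.

    local_ports maps port name -> ESI of the attached segment, or None for
    a single-homed port. source_esi is the segment the frame entered on.
    """
    return [
        port
        for port, esi in local_ports.items()
        # Never send the frame back onto the segment it came from.
        if esi is None or esi != source_esi
    ]


ports = {"ae0": "ESI-1", "ae1": "ESI-2", "ge-0/0/5": None}   # illustrative
flooded = bum_egress_ports("ESI-1", ports)
```

A frame that entered the network on `ESI-1` through the other PE is flooded everywhere except `ae0`, which attaches to that same segment.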
This in turn makes it possible to forward traffic over multiple active links through the WAN and to load balance across them.
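One common way to spread traffic over several active PEs is per-flow hashing, so that packets of one flow stay in order on one path. The sketch below assumes that approach; the hash choice (CRC32) and the flow tuple layout are illustrative, and real hardware uses its own hash inputs.

```python
import zlib


def pick_next_hop(flow, active_pes):
    """Map a flow onto one of the PEs attached to the destination segment."""
    candidates = sorted(active_pes)                 # stable order across calls
    key = "|".join(str(part) for part in flow).encode()
    return candidates[zlib.crc32(key) % len(candidates)]


pes = {"PE1", "PE2"}
flow_a = ("192.0.2.10", "192.0.2.20", 6, 49152, 443)   # illustrative 5-tuple
choice_1 = pick_next_hop(flow_a, pes)
choice_2 = pick_next_hop(flow_a, pes)
after_failure = pick_next_hop(flow_a, {"PE2"})         # PE1 withdrawn
```

The same flow always lands on the same PE, and when one PE is withdrawn the flow simply falls onto the remaining one.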
These advantages make EVPN a viable choice for interconnecting data centers.