
220 - EVPN VXLAN

ccie notes
Author: Hugo, DevOps Engineer in London

VXLAN extends Layer 2 networks across a Layer 3 underlay.

There are challenges in bringing Layer 2 and Layer 3 networks together. BGP EVPN route types help solve these problems:

  1. Distributing L2 and L3 Information
  • BGP EVPN advertises both L2 and L3 information in a single control plane.
  • It scales better and gives more control than traditional flood-and-learn methods.
  2. Reducing ARP Broadcasts
  • Route Type 2 carries L2 information (MAC addresses, optionally with their IP addresses) in BGP.
  • Each VTEP can answer ARP requests locally from the MAC-IP bindings it learned through BGP, which reduces ARP broadcast traffic.
  3. Managing Multicast Traffic
  • Route Type 3 handles VTEP discovery and BUM traffic.
  • With ingress replication, it removes the need for multicast protocols such as PIM in the underlay.
  4. Preventing Loops and Supporting Multi-Homing
  • Route Types 1 and 4 provide split-horizon filtering and Designated Forwarder election, which removes the need for STP in the overlay.

Route Types
#

Type 2: MAC/IP Advertisement
#

MAC-VTEP mapping
#

Contains: MAC address, VTEP IP, VNI

  • Router resolves the destination IP to a MAC address in its ARP table
  • Router then looks up this MAC in the EVPN table to find the associated VTEP and VNI
  • Router encapsulates the packet with the correct VXLAN header and sends to the appropriate VTEP

IP-MAC-VTEP mapping
#

Contains: IP address, MAC address, VTEP IP, VNI

  • Router looks up the destination IP directly in the EVPN table
  • Router gets the associated MAC, VTEP, and VNI in one lookup
  • Router can create the packet with the correct MAC address and VXLAN encapsulation without needing a separate ARP process
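
The difference between the two lookups can be sketched in a few lines of Python. This is only a toy model: the `Type2Route` class, the helper names, and the sample addresses are made up for illustration and do not mirror any real switch data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Type2Route:
    """Hypothetical, simplified view of a Route Type 2 entry."""
    mac: str
    vtep: str
    vni: int
    ip: Optional[str] = None  # None for a MAC-only advertisement


def resolve_via_mac(dst_ip, arp_table, evpn_routes):
    """MAC-VTEP case: ARP table gives the MAC, EVPN table gives VTEP and VNI."""
    mac = arp_table.get(dst_ip)
    if mac is None:
        return None  # the router would have to run ARP first
    for route in evpn_routes:
        if route.mac == mac:
            return route
    return None


def resolve_via_ip(dst_ip, evpn_routes):
    """IP-MAC-VTEP case: one lookup returns MAC, VTEP and VNI together."""
    for route in evpn_routes:
        if route.ip == dst_ip:
            return route
    return None


# Sample data (assumed values, not taken from a real fabric)
routes = [Type2Route(mac="0000.1111.2222", vtep="10.0.0.2", vni=10001, ip="10.1.1.20")]
arp_table = {"10.1.1.20": "0000.1111.2222"}
```

With the MAC+IP form of the route, `resolve_via_ip` needs no ARP table at all, which is the point of the second variant.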

Type 3: Inclusive Multicast Ethernet Tag
#

Contains: VTEP IP + VNI

  • Used for BUM (Broadcast, Unknown Unicast, Multicast) traffic
  • Identifies which VTEPs are interested in receiving BUM traffic for a specific VNI
  • Allows efficient distribution of BUM traffic without unnecessary flooding
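
As a rough illustration of how Type 3 routes turn into a per-VNI flood list for ingress replication, here is a small Python sketch. The tuple layout `(vtep_ip, vni)` and the function name are assumptions made for this example.

```python
def build_flood_lists(type3_routes, local_vtep):
    """Group remote VTEPs by VNI from (vtep_ip, vni) Type 3 advertisements."""
    flood = {}
    for vtep_ip, vni in type3_routes:
        if vtep_ip == local_vtep:
            continue  # never replicate BUM traffic to ourselves
        flood.setdefault(vni, set()).add(vtep_ip)
    return flood


# Sample advertisements (assumed values)
type3_routes = [
    ("10.0.0.1", 10001),
    ("10.0.0.2", 10001),
    ("10.0.0.3", 10001),
    ("10.0.0.3", 10002),
]
flood = build_flood_lists(type3_routes, local_vtep="10.0.0.1")
```

A BUM frame in VNI 10001 is then copied only to the VTEPs in `flood[10001]`, not to every VTEP in the fabric.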

Type 5: IP Prefix
#

Contains: IP Prefix + VTEP IP + VNI

  • Used for inter-subnet routing in EVPN
  • Advertises IP prefixes across the EVPN domain
  • Enables IP routing between different VNIs or between EVPN and external networks
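
Forwarding on Type 5 routes is ordinary longest-prefix matching, just with a VTEP and VNI as the result. A minimal sketch, assuming each route is a `(prefix, vtep, vni)` tuple (a made-up layout for illustration):

```python
import ipaddress


def lookup_prefix(dst_ip, type5_routes):
    """Return the (network, vtep, vni) of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, vtep, vni in type5_routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, vtep, vni)
    return best


# Sample routes (assumed values)
type5_routes = [
    ("10.2.0.0/16", "10.0.0.2", 50001),
    ("10.2.3.0/24", "10.0.0.3", 50001),
    ("0.0.0.0/0", "10.0.0.9", 50001),
]
```

Here `lookup_prefix("10.2.3.7", type5_routes)` picks the /24 behind 10.0.0.3, while an external destination falls through to the default route, which could represent a border leaf.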

Type 1: Ethernet Auto-Discovery
#

There are two kinds of Type 1 route. They look very similar but serve different purposes:

Per-EVI Route
#

  • Used for aliasing and load balancing in multi-homing scenarios.
  • Per-EVI means per EVPN Instance. These notes treat an EVI as a VRF, so the route is advertised once per VRF and each VRF can have its own multi-homing settings.

Why is it needed per VRF? Different VRFs may have different multi-homing configurations, so each VRF needs its own settings.

Example:

Configure the Ethernet Segment on the physical interface and use subinterfaces for the different VRFs (illustrative configuration; exact syntax varies by platform):

interface Ethernet1/1
  no switchport
  ethernet-segment 1 identifier 00:01:02:03:04:05:06:07:08:09
  lacp system-id 0000.0000.0001

interface Ethernet1/1.100
  encapsulation dot1q 100
  vrf member VRF_A

interface Ethernet1/1.200  
  encapsulation dot1q 200
  vrf member VRF_B

Configure different Ethernet Segment settings per VRF:

evpn
  ethernet-segment 1
    vrf VRF_A
      redundancy mode all-active
    vrf VRF_B  
      redundancy mode single-active

How is it used for load balancing (aliasing)?

  • PE1 and PE2 are multi-homed to a server (ES1).
  • PE3 learns the server’s MAC address from PE1’s MAC advertisement (RT-2).
  • PE3 also receives Per-EVI routes from both PE1 and PE2 for the same ethernet segment (ES1).
  • PE3, although it learned the MAC from PE1, can now load-balance traffic to the server across both PE1 and PE2.
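
The aliasing logic on PE3 can be sketched as follows. This is a simplification under stated assumptions: routes are plain dicts and tuples invented for the example, and the check that each PE also advertised a Per-ESI route is left out.

```python
# An all-zero ESI means the host is single-homed (no Ethernet Segment to alias over).
SINGLE_HOMED_ESI = "00:00:00:00:00:00:00:00:00:00"


def next_hops_for_mac(mac, mac_routes, per_evi_ad_routes):
    """Combine the RT-2 advertiser with every PE that sent a Per-EVI route for the same ESI."""
    learned = mac_routes[mac]
    hops = {learned["pe"]}
    if learned["esi"] != SINGLE_HOMED_ESI:
        hops.update(pe for pe, esi in per_evi_ad_routes if esi == learned["esi"])
    return sorted(hops)


# Sample state on PE3 (values borrowed from the example outputs in these notes)
ES1 = "00:00:00:00:00:aa:bb:cc:dd:ee"
mac_routes = {"11:22:33:44:55:66": {"pe": "PE1", "esi": ES1}}
per_evi_ad_routes = [("PE1", ES1), ("PE2", ES1)]
```

Even though only PE1 advertised the MAC, the result for `11:22:33:44:55:66` contains both PE1 and PE2, so PE3 can hash flows across them.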

PE1

BGP routing table for L2VPN EVPN
Route Distinguisher: 192.168.1.1:100 (VRF1)
*>i[1]:[0]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.1 (PE1)
     Ethernet Tag ID: 0
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 315168

Route Distinguisher: 192.168.1.1:200 (VRF2)
*>i[1]:[0]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.1 (PE1)
     Ethernet Tag ID: 0
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 315169

PE2

BGP routing table for L2VPN EVPN
Route Distinguisher: 192.168.1.2:100 (VRF1)
*>i[1]:[0]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.2 (PE2)
     Ethernet Tag ID: 0
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 315170

Route Distinguisher: 192.168.1.2:200 (VRF2)
*>i[1]:[0]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.2 (PE2)
     Ethernet Tag ID: 0
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 315171

PE1 advertises the MAC/IP Advertisement (RT-2) route:

Route Distinguisher: 192.168.1.1:100
*>i[2]:[0]:[48]:[11:22:33:44:55:66]:[0]/104
     192.168.1.1 (PE1)           100      0 i
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Ethernet Tag ID: 0
     MAC: 11:22:33:44:55:66
     Label: 315168

Per-ESI Route
#

  • Identifies which VTEPs belong to the same Ethernet Segment.
  • Enables fast convergence through mass withdrawal of the MAC addresses associated with an Ethernet Segment (ES).
  • Used for split-horizon filtering to prevent network loops.

Example:

From PE1:
BGP routing table for L2VPN EVPN
Route Distinguisher: 192.168.1.1:100
*>i[1]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.1 (PE1)
     Ethernet Tag ID: 0xFFFFFFFF
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 0

From PE2:
BGP routing table for L2VPN EVPN
Route Distinguisher: 192.168.1.2:100
*>i[1]:[0]:[48]:[aa:bb:cc:dd:ee:ff]:[0]/104
     192.168.1.2 (PE2)
     Ethernet Tag ID: 0xFFFFFFFF
     ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
     Label: 0

  • ESI: 00:00:00:00:00:aa:bb:cc:dd:ee
  • An MPLS label of 0 and an Ethernet Tag ID of 0xFFFFFFFF identify these as Per-ESI routes

Fast Convergence
#

  • When a PE-CE link fails, the affected PE immediately withdraws its Per-ESI route. If the PE itself fails, its routes are withdrawn once its BGP session goes down.
  • This triggers fast convergence, allowing remote PEs to update their forwarding tables quickly.
  • Traffic is then redirected to the remaining active links or PEs within the Ethernet Segment.

Mass MAC Address Withdrawal
#

  • This is the mechanism behind fast convergence: one Per-ESI route withdrawal stands in for many individual Route Type 2 withdrawals.
  • The Per-ESI route withdrawal signals to remote PEs that every MAC address learned for that Ethernet Segment is no longer reachable through the withdrawing PE.
  • Remote PEs update all affected MAC entries at once, without waiting for each Route Type 2 to be withdrawn.
  • All-active multi-homing: Only the failed PE is removed as a next hop. The MAC addresses stay reachable through the other PEs on the same ES.
  • Single-active multi-homing: Traffic for all MAC addresses on the ES moves to the backup PE. If no other PE advertises the ES, the MAC entries are invalidated.
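
A small Python sketch of what a remote PE might do when it sees a Per-ESI route withdrawn. The table layout (`mac -> {"esi", "next_hops"}`) and the function name are assumptions for illustration only; real implementations differ.

```python
def withdraw_per_esi(mac_table, esi, failed_pe):
    """Apply one Per-ESI withdrawal to every MAC on that Ethernet Segment."""
    for mac in list(mac_table):
        entry = mac_table[mac]
        if entry["esi"] != esi:
            continue  # MACs on other segments are untouched
        entry["next_hops"].discard(failed_pe)
        if not entry["next_hops"]:
            del mac_table[mac]  # no PE left that can reach this MAC
    return mac_table
```

The whole update is driven by a single BGP withdrawal, which is why convergence does not depend on how many MAC addresses sit behind the segment.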

Loop prevention
#

Route Type 4:

  • Used for DF election
  • Helps identify switches in the same ESI group

Route Type 1:

  • Carries the ESI label used for split-horizon filtering

Route Types 1 and 4: Work together to prevent packet duplication and loops for BUM traffic.

Duplication Prevention:

  • In an active-active multi-homing setup, only the DF forwards decapsulated BUM traffic onto the Ethernet Segment, so the multi-homed destination does not receive duplicate packets.

Looping in Traffic without Split Horizon:

  • Both DF and non-DF switches encapsulate and forward BUM (Broadcast, Unknown Unicast, Multicast) traffic.
  • A non-DF switch might forward encapsulated traffic to the DF, which decapsulates it and forwards it back onto the same Ethernet Segment it came from, potentially causing a loop.
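
The two checks (DF role and split horizon) can be combined into one egress decision. A minimal sketch, assuming the receiving PE can tell which Ethernet Segment a BUM frame originated from (for example from the ESI label); all names here are hypothetical.

```python
def bum_egress_segments(local_segments, origin_esi, df_segments):
    """Return the local Ethernet Segments a decapsulated BUM frame may be sent to."""
    return [
        es
        for es in local_segments
        if es in df_segments  # only the DF forwards BUM traffic onto a segment
        and es != origin_esi  # split horizon: never send back to the origin segment
    ]
```

For a PE attached to ES1 and ES2 and elected DF for both, a frame that entered the fabric on ES1 is delivered to ES2 only.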

Type 4: Ethernet Segment
#

Contains: VTEP IP + ESI

  • Used for Designated Forwarder (DF) election among PEs connected to the same Ethernet Segment
  • Enables PEs to discover other PEs connected to the same Ethernet Segment
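
The default DF election described in RFC 7432 (service carving) is simple enough to sketch: order the PEs discovered through Type 4 routes by IP address and pick index `V mod N`, where `V` is the VLAN (Ethernet Tag) and `N` the number of PEs. The function name is made up, and platforms may implement other election modes.

```python
import ipaddress


def elect_df(pe_ips, vlan_id):
    """Modulo-based DF election over the PEs attached to one Ethernet Segment."""
    ordered = sorted(pe_ips, key=ipaddress.ip_address)  # numeric, not string, order
    return ordered[vlan_id % len(ordered)]
```

With PEs 192.168.1.1 and 192.168.1.2 on the segment, even VLANs land on the first PE and odd VLANs on the second, which spreads the DF role across both.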

L3VNI
#

Anycast Gateway MAC: The same MAC address is configured across all leaf switches in the fabric to enable consistent routing.

fabric forwarding anycast-gateway-mac AAAA.AAAA.AAAA

VLAN of L2VNI: Configures the anycast gateway for routing. This gateway will be used as the default gateway for end devices.

int Vlan20
    no shutdown
    vrf member CUSTOMER1
    ip address 10.0.20.254/24
    fabric forwarding mode anycast-gateway

VLAN of L3VNI: Using ip forward allows traffic to be passed between subnets without requiring IP address configuration on the VLAN.

int Vlan100
    no shutdown
    vrf member CUSTOMER1
    ip forward

When an SVI is configured on a VTEP with an anycast gateway, routing occurs locally on the VTEP, from the source L2VNI to the destination L2VNI.

With L3VNI:

  • Route-Type 5 (RT-5) advertisements are generated. These routes contain IP prefixes, L3VNI, and the associated VRF, helping VTEPs identify which other VTEPs are interested in a specific VRF.

Without L3VNI:

  • VTEPs cannot advertise their locally connected subnets to other VTEPs in the fabric.
  • There is no way to determine which VTEPs should receive traffic for a specific VRF.

As a result, routing between subnets cannot extend across the VXLAN fabric, and communication across the fabric is restricted to Layer 2.

Asymmetric IRB
#

  • Bridge and Route at Source VTEP
  • Bridge at Destination VTEP
  • Uses the destination L2VNI for encapsulation.

Packet Flow

  1. Source VTEP receives the packet from the host.
  2. Performs a routing lookup to find the destination subnet.
  3. Routes the packet from the source VNI to the destination VNI.
  4. Changes the source MAC to the gateway MAC for the destination VLAN.
  5. Sets the destination MAC to the destination host’s MAC.
  6. Encapsulates the packet with VXLAN headers carrying the destination L2VNI.
  7. Sends the encapsulated packet to the destination VTEP.

At Destination VTEP

  1. Receives and decapsulates the VXLAN packet.
  2. Performs a Layer 2 lookup (bridging).
  3. Forwards the packet to the destination host.
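
The asymmetric flow at the source VTEP can be modelled roughly like this. Packets and table entries are plain dicts with invented field names; the sketch only shows which MACs and which VNI end up in the frame.

```python
def asymmetric_irb_ingress(packet, host_table, gateway_mac):
    """Route at the source VTEP and encapsulate with the destination L2VNI.

    host_table maps dst_ip -> {"mac", "vtep", "l2vni"} (assumed to come from RT-2 routes).
    """
    dst = host_table[packet["dst_ip"]]
    # Routing step: rewrite the inner Ethernet header
    inner = dict(packet, src_mac=gateway_mac, dst_mac=dst["mac"])
    # VXLAN encapsulation uses the destination host's L2VNI
    return {"outer_dst": dst["vtep"], "vni": dst["l2vni"], "inner": inner}
```

Because the frame already carries the final host MAC and the destination L2VNI, the remote VTEP only has to bridge it. The cost is that the source VTEP must know every destination VNI locally.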

Symmetric IRB
#

  • Bridge and Route at Both VTEPs
  • Uses the L3VNI for encapsulation in inter-subnet traffic

Packet Flow

  1. Source VTEP receives the packet from the host.
  2. Routes the packet from the source VNI to the L3VNI for inter-subnet routing.
  3. Changes the source MAC to the source VTEP’s router MAC.
  4. Sets the destination MAC to the destination VTEP’s router MAC.
  5. Encapsulates the packet with VXLAN headers using the L3VNI.
  6. Sends the encapsulated packet to the destination VTEP.

At Destination VTEP

  1. Receives and decapsulates the VXLAN packet.
  2. Routes the packet from the L3VNI to the destination VNI.
  3. Updates the source MAC to the destination VTEP’s gateway MAC and the destination MAC to the host’s MAC.
  4. Forwards the packet to the destination host.
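
For comparison, here is the same kind of toy model for symmetric IRB, with one function per VTEP. Field names, the `route` and `local_hosts` layouts, and the sample values used to exercise it are assumptions for illustration.

```python
def symmetric_irb_ingress(packet, route, local_router_mac, l3vni):
    """Source VTEP: route into the L3VNI and address the frame to the remote router MAC.

    route is {"vtep", "router_mac"} for the destination, as learned via EVPN.
    """
    inner = dict(packet, src_mac=local_router_mac, dst_mac=route["router_mac"])
    return {"outer_dst": route["vtep"], "vni": l3vni, "inner": inner}


def symmetric_irb_egress(frame, local_hosts, gateway_mac):
    """Destination VTEP: route from the L3VNI into the local destination VNI.

    local_hosts maps dst_ip -> {"mac", "l2vni"} for directly attached hosts.
    """
    inner = frame["inner"]
    host = local_hosts[inner["dst_ip"]]
    delivered = dict(inner, src_mac=gateway_mac, dst_mac=host["mac"])
    return host["l2vni"], delivered
```

Both VTEPs route, so the source VTEP only needs the L3VNI and the remote router MAC, not the destination L2VNI.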

Configuration
#

  • Set up an IGP (like OSPF) for underlay routing between all routers.
  • Spine routers connect to all leaf routers, but do not connect to each other.
  • Spines run the underlay IGP and act as BGP route reflectors for the L2VPN EVPN address family; they need no VLAN, VNI, or VTEP configuration.
  • Leaf routers manage VLANs, VNIs, VTEPs, and BGP configuration.
  • Set up access ports on the interfaces between leaf routers and end devices.

Leaf Switch
#

  1. VNI
  • Configure Route Target (RT) and Route Distinguisher (RD)
evpn
  vni 10001 l2
    rd auto
    route-target import auto
    route-target export auto

Auto RT & RD:

  • RD: Use RouterID:Number
  • RT: Use ASN:VNI
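
As far as I recall from Cisco's NX-OS VXLAN documentation, the auto-derived RD for an L2VNI is the BGP router ID followed by the VLAN ID plus 32767, and the auto-derived RT is the ASN followed by the VNI. Treat the sketch below as an assumption to verify on your release; the function names are invented.

```python
def auto_rd(router_id, vlan_id):
    """Auto-derived RD for an L2VNI: RouterID:(32767 + VLAN ID), assumed NX-OS behavior."""
    return f"{router_id}:{32767 + vlan_id}"


def auto_rt(asn, vni):
    """Auto-derived RT: ASN:VNI, assumed NX-OS behavior."""
    return f"{asn}:{vni}"
```

Because the RT depends only on the ASN and VNI, every leaf in the same AS derives the same RT for a given VNI, so `import auto` and `export auto` match across the fabric. The RD stays unique per leaf because it embeds the router ID.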

The RT controls whether the route gets imported into the

  • MAC-VRF (for L2VNI) or
  • IP-VRF (for L3VNI).
  2. NVE (VTEP)
  • Configure the source interface (the loopback that provides the VTEP IP address)
  • Configure peers (static or dynamic via BGP)
  • Enable host-reachability protocol and ARP suppression
  • Associate with VNI
interface nve1
  source-interface loopback1
  host-reachability protocol bgp
  member vni 10001
    suppress-arp
    ingress-replication protocol bgp

ingress-replication protocol bgp (Dynamic): Advertise Route Type 3 routes for identifying which VTEPs are interested in receiving BUM traffic for a specific VNI

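host-reachability protocol bgp: Uses BGP EVPN as the control plane for host reachability, so the VTEP advertises and learns MAC and MAC-to-IP bindings through Route Type 2 instead of flood-and-learn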

suppress-arp:

  • VTEP intercepts ARP requests
  • Checks its local ARP suppression cache (populated from Route Type 2 routes)
  • If a match is found, VTEP responds on behalf of the destination host
  • If no match, ARP request is flooded
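
The suppress-arp decision boils down to a cache lookup. A minimal sketch, where the cache is a plain dict of IP to MAC (assumed to be filled from Route Type 2 routes) and the return values are invented labels:

```python
def handle_arp_request(target_ip, suppression_cache):
    """Decide whether the VTEP answers an ARP request itself or floods it."""
    mac = suppression_cache.get(target_ip)
    if mac is not None:
        return ("reply", mac)  # VTEP answers on behalf of the destination host
    return ("flood", None)  # unknown host: fall back to BUM flooding
```

Only requests for hosts the fabric has not learned yet are flooded, so ARP broadcast volume drops as the cache fills.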
  3. VLAN
vlan 10
  vn-segment 10001
  4. BGP
  • Configure peering with spines
  • Activate address-family l2vpn evpn
  • Send extended community
router bgp 65000
  template peer SPINE
    remote-as 65000
    update-source loopback0
    address-family ipv4 unicast
      send-community
      send-community extended
    address-family l2vpn evpn
      send-community
      send-community extended
  neighbor 1.1.1.1
    inherit peer SPINE

L3VNI Configuration:

  1. VRF
  • Enable MAC for anycast gateway
  • Create VRF
  • Configure RT and RD
  • Associate VNI to VRF
fabric forwarding anycast-gateway-mac AAAA.AAAA.AAAA
vrf context CUSTOMER1
  vni 10100
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
  2. L2VNI Vlan

Configure IP and enable anycast gateway forwarding

interface Vlan10
  vrf member CUSTOMER1
  no shutdown
  ip address 10.1.1.1/24
  fabric forwarding mode anycast-gateway
  3. L3VNI Vlan
  • Map the VLAN to the L3VNI
  • Enable ip forward (no IP address is needed on this SVI)
vlan 100
  vn-segment 10100
interface Vlan100
  vrf member CUSTOMER1
  no shutdown
  ip forward
  4. NVE
interface nve1
  member vni 10100 associate-vrf
  5. BGP
router bgp 65000
  vrf CUSTOMER1
    address-family ipv4 unicast
      advertise l2vpn evpn

Spine Switch
#

  1. Configure BGP
router bgp 65000
  template peer LEAF
    remote-as 65000
    update-source loopback0
    address-family ipv4 unicast
      send-community
      send-community extended
      route-reflector-client
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 3.3.3.3
    inherit peer LEAF