Border Gateway Protocol (BGP)
Contents
- Border Gateway Protocol (BGP)
- Subpages
- About BGP
- Abbreviations
- Path attributes (PA)
- Path Selection
- Neighbors
- Messages
- Timers
- Network advertisement
- Synchronization (with IGP)
- Graceful restart (GR)
- Graceful shutdown procedure
- Route aggregation
- eBGP multihop
- iBGP
- Route-map
- Enable IPv6 unicast routing (Cisco)
- Multiprotocol BGP (IPv4/IPv6)
- Equal cost multipath (ECMP)
- Security
Subpages
About BGP
Wiki EN - Path-Vector routing protocol
- In general, path-vector routing protocols (RPs) have higher convergence times than link-state RPs (like OSPF).
- Two flavors
eBGP External BGP Wiki EN - Exterior Gateway Protocol (EGP)
- The only other EGP is the equally named Exterior Gateway Protocol (EGP)
iBGP Internal BGP Wiki EN - Interior Gateway Protocol (IGP)
Youtube - Kevin Wallace Training, LLC - BGP Deep Dive
- Many thanks for the introduction!
- Web-Tools
- Interesting reads
- Forms neighborships
- Explicitly configured neighbors
Neighborships communicate over TCP-Session tcp/179
- No dynamic neighbor discovery using multicast (like in e. g. OSPF)
- Advertises address prefix and prefix length in NLRI
- Advertises a collection of path attributes used for path selection
- Does not consider bandwidth or cost for path selection
- Scales well
- OSPF may not scale to the size required in large enterprises.
Standards
- Best Current Practices (BCP)
- IANA
Implementations
- Licenses:
- Source code: various Licenses
- Composite binary: GPLv2 or later
- Evolved from
Wiki EN - Quagga, GPLv2
Evolved from GNU Zebra, GPL
Abbreviations
- ACL
- Access Control List
- AF
- Address Family
- AFI
- Address Family Identifier
- AIGP
- Accumulated IGP Metric
- AS
- Autonomous System
- ASN
- Autonomous System Number
- BGP
- Border Gateway Protocol
- CE
- Customer Edge
- DD
- Database Description packets
- DUAL
- Diffusing Update ALgorithm
- eBGP
- External BGP
- EGP
- Exterior Gateway Protocol
- EIGRP
- Enhanced Interior Gateway Routing Protocol
- EOR
- End Of RIB
- EVPN
- Ethernet Virtual Private Network
- FIB
- Forwarding Information Base
- GR
- Graceful Restart
- HA
- High Availability
- iBGP
- Internal BGP
- IGP
- Interior Gateway Protocol
- IGRP
- Interior Gateway Routing Protocol
- IRR
- Internet Routing Registry
- IXP
- Internet Exchange Point
- L2VPN
- Layer 2 Virtual Private Network
- LIR
- Local Internet Registry
- LSA
- Link State Advertisement
- LSDB
- Link State Database
- LSR
- Link State Request
- LSU
- Link State Update
- Packet that may contain multiple LSAs
- MBGP
- Multicast BGP
- MP-BGP
- Multi-Protocol BGP
- NLRI
- Network Layer Reachability Information
- NSF
- Non Stop Forwarding
- NSR
- Non Stop Routing
- OSPF
- Open Shortest Path First
- PA
- Path Attribute
- PE
- Provider Edge
- PMTUD
- Path MTU Discovery
- RIB
- Routing Information Base
- RIR
- Regional Internet Registry
- RR
- Route Reflector
- RS
- Route-Server
- RTP
- Reliable Transport Protocol
- SAFI
- Subsequent Address Family Identifier
- SIA
- Stuck In Active
- SNM
- SubNet Mask
- Tier 1 transit provider
- An IP transit provider that can reach any network on the Internet without purchasing transit services.
- SSO
- Stateful Switchover
- uRPF
- Unicast Reverse Path Forwarding
- VXLAN
- Virtual eXtensible LAN
- VTEP
- Virtual Tunnel Endpoint
- VLSM
- Variable Length Subnet Mask
Path attributes (PA)
# | Scope | PA | Default | Mnemonic | BGP field | Preferred | Comment
1 | Local to router | (Local) Weight | "Off" | We | | Higher | Influences outbound routing decisions
2 | Internal to AS | Local preference | "Off" | Love | LOCAL_PREF | Higher | Influences outbound routing decisions
3 | Internal to AS | Originate / Accumulated IGP Metric (AIGP) | "Off" | Oranges | AIGP | Lowest | rfc7311
4 | External to AS | AS path length | "On" | AS | AS_PATH | Lowest | Influences outbound routing decisions
5 | External to AS | Origin Type | IGP | Oranges | ORIGIN | Lowest | 0 = IGP (i) (maybe a network command)
6 | External to AS | Multi-Exit Discriminator (MED) | "On" | Mean | MULTI_EXIT_DISC | Lowest | By default, MED is only compared between routes received from the same neighboring autonomous system (AS). Can be configured to compare routes from different ASes as well.
7 | Local to router | eBGP over iBGP paths | "On" | Pure | | | Directly connected over indirectly connected
8 | Local to router | IGP metric to BGP next hop | "On", imported from IGP | | | Lowest | Continue, even if bestpath is already selected. Prefer the route with the lowest interior cost to the next hop, according to the main routing table. If two neighbors advertised the same route, but one neighbor is reachable via a low-bitrate link and the other via a high-bitrate link, and the interior routing protocol calculates the lowest cost based on the highest bitrate, the route through the high-bitrate link is preferred and the other routes are dropped.
9 | Local to router | Path that was received first | "On" | | | Oldest | Used to ignore changes in steps 10+
10 | Local to router | Router ID | "On" | Refreshment | | Lowest |
11 | Local to router | Cluster list length | "On" | | | Lowest |
12 | Local to router | Neighbor address | "On" | | | Lowest |
Path Selection
Cost or bandwidth are not considered for path selection.
During path selection the PAs are compared in order. If the values of an attribute differ, the routing decision is made at that step; if they tie, the next attribute is compared.
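The ordered tie-breaking can be sketched as a lexicographic comparison. This is a minimal illustration in Python, not a full implementation: it covers only a subset of the steps, compares MED unconditionally, and assumes a default local preference of 100. The `Path` class, its field names and `best_path` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical, reduced view of a BGP path (not a real library type)."""
    name: str
    weight: int = 0
    local_pref: int = 100      # assumed default
    as_path: tuple = ()
    origin: int = 0            # 0 = IGP, 1 = EGP, 2 = incomplete
    med: int = 0
    router_id: int = 0


def preference_key(path):
    # Higher weight and local preference win, so they are negated;
    # for all other attributes the lowest value wins.
    return (
        -path.weight,
        -path.local_pref,
        len(path.as_path),
        path.origin,
        path.med,
        path.router_id,
    )


def best_path(paths):
    # Tuples compare element by element, so a later attribute is only
    # consulted when all earlier attributes tie.
    return min(paths, key=preference_key)


a = Path("via-isp1", as_path=(65001, 65010))
b = Path("via-isp2", as_path=(65002, 65020, 65010))
c = Path("via-isp3", local_pref=200, as_path=(65003, 65030, 65040, 65010))
print(best_path([a, b]).name)     # shorter AS path wins
print(best_path([a, b, c]).name)  # higher local preference wins earlier
```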
Neighbors
- There is no dynamic BGP neighbor discovery
- Neighbors must be configured explicitly
- eBGP neighbors have different AS numbers
iBGP neighbors simply have the same remote-as AS-NUMBER during configuration
Show neighbors
Neighbor formation states
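State Idle: Doing nothing, waiting for a start event
State Connect: Trying to establish the TCP session
State Active: The TCP session could not be established; listening for and retrying the connection
State Open Sent: TCP session established and OPEN message sent; waiting for the peer's OPEN message
State Open Confirm: OPEN message received; waiting for a KEEPALIVE to confirm the neighborship
State Established: established neighborship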
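The state sequence can be pictured as a small transition table. The following Python sketch is heavily simplified compared to the state machine in RFC 4271: the event names are hypothetical, and every event not listed simply falls back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state; event names are illustrative only
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state, event):
    # Simplification: anything unexpected (errors, NOTIFICATION,
    # hold timer expiry, ...) resets the session to Idle.
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state=State.IDLE):
    for event in events:
        state = next_state(state, event)
    return state


happy_path = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(happy_path).value)
```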
Peer group
- Simplify configuration by grouping of neighbors
- Various configuration (like policies, route-maps, filters, …) can be applied to the peer group
Example: apply #Prefix filtering to peer group
- Removing a peer group
Messages
Open: BGP version number, Local AS Number, Hold Time, BGP router ID, optional parameters
Keepalive: Keeps holdtime timer from expiring
- Defaults:
- Keepalive: 60s
- Holdtime: 180s (3 * default Keepalive)
Update: Can contain withdrawn routes, path attributes and NLRI
Notification: Contains an error code, error sub-code, and information about the error
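All four message types share the fixed 19-byte header defined in RFC 4271 (16-byte marker of all ones, 2-byte length, 1-byte type). A minimal Python sketch of building and parsing that header follows; the function names are hypothetical, and message bodies are not handled.

```python
import struct

# Type codes from RFC 4271
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

# Network byte order: 16-byte marker, 16-bit length, 8-bit type
HEADER = struct.Struct("!16sHB")
MARKER = b"\xff" * 16


def build_keepalive():
    # A KEEPALIVE consists of the header only, so its length is 19.
    return HEADER.pack(MARKER, HEADER.size, 4)


def parse_header(data):
    if len(data) < HEADER.size:
        raise ValueError("truncated BGP header")
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("invalid marker")
    if not HEADER.size <= length <= 4096:
        raise ValueError("invalid message length")
    return length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")


print(parse_header(build_keepalive()))
```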
Timers
IETF RFC4271 - A Border Gateway Protocol 4 (BGP-4) #10. BGP Timers
networkstraining.com - The Most Important Border Gateway Protocol (BGP) Timers Explained
Defaults
- Keepalive: 60s
- Holdtime: 180s (3 * keepalive)
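Per RFC 4271 both speakers use the smaller of the two hold times exchanged in the OPEN messages, and a hold time must be either 0 or at least 3 seconds. The Python sketch below illustrates this; deriving the keepalive interval as one third of the hold time follows the defaults above, and the function name is hypothetical.

```python
def negotiate_timers(local_hold, peer_hold):
    """Return (hold_time, keepalive_interval) in seconds for a session."""
    for value in (local_hold, peer_hold):
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    hold = min(local_hold, peer_hold)
    # A hold time of 0 means that no keepalives are sent at all.
    keepalive = hold // 3
    return hold, keepalive


print(negotiate_timers(180, 180))
print(negotiate_timers(180, 90))
```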
Other timers
- update-delay: 0s (disabled)
- BGP stays in read-only mode until update-delay has expired
Network advertisement
- Do not advertise networks learned from peers in other ASes unless you want to become a transit AS and route their traffic through your devices
- Only advertise the AS-internal networks that you want to propagate throughout the Internet
Synchronization (with IGP)
- To avoid unintentional black holes
- Only advertises a route learned from an iBGP peer to an eBGP peer when there is an exact match of that route learned from an IGP in the routing table
- Disabled by default since Cisco IOS 12.2(8)T
- Not really necessary
- There may be no IGP on the eBGP router
Hard restart
Cisco: In EXEC mode
clear ip bgp *
Graceful restart (GR)
FRR Docs - BGP #Send Hard Reset CEASE Notification for Administrative Reset
FRR Docs - BGP #Graceful Restart Graceful restart is a mechanism for BGP that would help minimize the negative effects on routing caused by BGP restart. An End-of-RIB marker is specified and can be used to convey routing convergence information. A new BGP capability, termed "Graceful Restart Capability", is defined that would allow a BGP speaker to express its ability to preserve forwarding state during BGP restart.
- Also available for OSPF, EIGRP, IS-IS, …
- GR towards RR is especially useful
- GR is preferred to Non-Stop-Routing
- Timers
- Restart timer:
- Default: 120 s
- Speaker must come back in this time or is considered dead.
- Selection_Deferral_Timer:
- Default: ? s
- Stale-path timer:
- Default: ? s
- For each routing protocol, check that the peer is capable and aware of GR
- Usage of EoR marker is beneficial for convergence time, since computation can start on reception
Process:
- Capability to perform graceful restart (Capability: 64) is announced to neighbors in OPEN message.
- May be enabled on a per neighbor basis
- Peers are notified that a GR takes place using an OPEN message with restart bit set.
- On a peer
- the routes in the routing table are marked as stale
- traffic is still forwarded to the restarting speaker
- the restart timer is started
- On the restarting speaker the Data-plane continues to forward traffic, while the control-plane restarts.
- When a new BGP session is established
- On the peer
- the restart timer is stopped (heard back)
- the stale-path timer is started
- On the restarting router
- initial updates are exchanged with neighbors
- performs Path-Selection
- on End-Of-RIB (EoR) by all peers with GR capability set or
- Selection_Deferral_Timer has expired
- send updates and EoR to neighbors
- On the peer
- the stale-path timer is stopped
- the stale prefixes are flushed
- new prefixes are injected
- Convergence
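The peer-side handling in the process above (mark stale, keep forwarding, flush after End-of-RIB) can be sketched as follows. This is a toy model in Python: `HelperRib` and its methods are hypothetical names, timers are left out, and `flush_stale` stands for both the EoR case and the expiry of the stale-path timer.

```python
class HelperRib:
    """Routes that a GR-aware peer learned from one neighbor (toy model)."""

    def __init__(self):
        self.routes = {}  # prefix -> {"next_hop": str, "stale": bool}

    def update(self, prefix, next_hop):
        # A fresh UPDATE (re)installs the route and clears the stale flag.
        self.routes[prefix] = {"next_hop": next_hop, "stale": False}

    def neighbor_restarting(self):
        # Routes stay in place, so traffic is still forwarded.
        for route in self.routes.values():
            route["stale"] = True

    def flush_stale(self):
        # Called on End-of-RIB or when the stale-path timer expires.
        self.routes = {p: r for p, r in self.routes.items() if not r["stale"]}


rib = HelperRib()
rib.update("192.0.2.0/24", "198.51.100.1")
rib.update("203.0.113.0/24", "198.51.100.1")
rib.neighbor_restarting()
stale_during_restart = sorted(rib.routes)
rib.update("192.0.2.0/24", "198.51.100.1")  # re-advertised after the restart
rib.flush_stale()
print(stale_during_restart, sorted(rib.routes))
```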
Graceful shutdown procedure
Youtube.com - NLNOG day 2017 - Job Snijders (NTT) - BGP Graceful Shutdown
- Simple procedure to reduce negative impact of shutting down BGP sessions (e.g. during maintenance)
- Can be combined with
- Can be part of the operational procedure as outlined in
IETF BCP214 - Mitigating the Negative Impact of Maintenance through BGP Session Culling
- Graceful shutdown is a "Make Before Break" mechanism
- Does not help against unplanned outages
- Not to be confused with "BGP Graceful Restart"
- Only useful for multi-homed AS
Scenario topology
Normal shutdown
Router A (RA) is shut down
- BGP session to ASBR2 is torn down
- ASBR2 withdraws the routes by sending an UPDATE to the route reflector (RR)
- ASBR2 has no more routes to prefixes announced by RA and blackholes legitimate traffic
- RR selects a new best path and updates ASBR2
- ASBR2 starts forwarding traffic using the new alternative path
Graceful shutdown
- Before maintenance Router A (RA)
Attaches the GRACEFUL_SHUTDOWN well-known community 65535:0 (as per IANA) to the prefixes
- Waits for a reasonable time (some minutes)
- ASBR2 receives updates from RA
Finds the community GRACEFUL_SHUTDOWN in the update
- Lowers the LOCAL_PREF (AS-wide) to 0
- Updates RR
RR sends new best path to ASBR2 and other peers
Community GRACEFUL_SHUTDOWN propagates through AS
- Other intermediate AS may select a new path prior to shutdown
- ASBR2 selects new best path received from RR
- Alternative paths are already in place
- RA is drained
- Minimal packet loss
- RA can now be shut down without impact
- BGP session to ASBR2 is torn down without impact
- …
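The receiving side of the procedure (what ASBR2 does with a tagged route) boils down to a single inbound policy rule. A minimal Python sketch follows, assuming routes are plain dictionaries with hypothetical `communities` and `local_pref` keys.

```python
# Well-known community GRACEFUL_SHUTDOWN, 65535:0
GRACEFUL_SHUTDOWN = (65535, 0)


def apply_inbound_policy(route):
    """Return the route with LOCAL_PREF lowered to 0 if it is tagged."""
    if GRACEFUL_SHUTDOWN in route["communities"]:
        return {**route, "local_pref": 0}
    return route


tagged = {"prefix": "192.0.2.0/24", "communities": {(65535, 0)}, "local_pref": 100}
untagged = {"prefix": "192.0.2.0/24", "communities": set(), "local_pref": 100}
print(apply_inbound_policy(tagged)["local_pref"])
print(apply_inbound_policy(untagged)["local_pref"])
```

Because LOCAL_PREF is compared early in path selection, any untagged alternative path wins over the drained one.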
Graceful shutdown configuration
Allow peers to graceful shutdown
Configure this on each and every BGP session
- Example Cisco IOS XR
conf term
route-policy AS-ASN_NEIGH-ebgp-inbound
  if community matches-any (65535:0) then
    set local-preference 0
  endif
end-policy
!
router bgp ASN_OWN
 neighbor IP_NEIGH remote-as ASN_NEIGH
  address-family ipv6 unicast
   send-community-ebgp
   route-policy AS-ASN_NEIGH-ebgp-inbound in
end
- Example ARISTA/Brocade/IOS/Quagga/FRR
Graceful shutdown your systems
Cisco IOS XR (automatically shuts down BGP session)
Route aggregation
- Reduces the size and therefore increases the efficiency of the routing table.
- Algorithm
- Compare the network addresses bit by bit, starting from the most significant bit, to determine the common leading bits.
- The number of common leading bits is the prefix length of the summary (at most 32 for IPv4 or 128 for IPv6); all remaining bits of the network address are set to 0.
- Summarized network address and prefix length are concatenated to the summarized network prefix.
- Summarized address
- Be careful: the summary also covers holes (address ranges that are not actually present), which may cause issues.
- Should work because advertisements of networks falling into that route are more specific.
- The lowest cost among the component networks is used as the cost of the summarized route
- Blueprint
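The summary calculation itself (this is not a router configuration) can be sketched in Python with the standard `ipaddress` module: the differing bits between the lowest and the highest covered address determine how many leading bits are common. The function name `summarize` is hypothetical.

```python
import ipaddress


def summarize(prefixes):
    """Return the smallest single prefix that covers all given prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    if len({n.version for n in nets}) != 1:
        raise ValueError("cannot mix IPv4 and IPv6 prefixes")
    max_len = nets[0].max_prefixlen  # 32 or 128
    lowest = min(int(n.network_address) for n in nets)
    highest = max(int(n.broadcast_address) for n in nets)
    # Bits that differ between the lowest and highest address are not common.
    host_bits = (lowest ^ highest).bit_length()
    prefix_len = max_len - host_bits
    base = (lowest >> host_bits) << host_bits  # clear the non-common bits
    cls = ipaddress.IPv4Network if nets[0].version == 4 else ipaddress.IPv6Network
    return cls((base, prefix_len))


print(summarize(["192.168.0.0/24", "192.168.1.0/24"]))
print(summarize(["192.168.0.0/24", "192.168.3.0/24"]))
```

The second call illustrates the hole problem: the summary also covers 192.168.1.0/24 and 192.168.2.0/24, although neither was part of the input.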
eBGP multihop
- Form a neighborship with a router that is not directly connected
iBGP
- Assumptions
- Full Mesh - Every router running iBGP in the AS is peered with every other one
- The number of neighborships grows quadratically, n * (n - 1) / 2, with the number n of routers participating in iBGP in the same AS.
- Solutions
- BGP Confederation using Sub Autonomous Systems (Sub-AS from private AS address space)
- Sub-Autonomous Systems require a full mesh …
- BGP Route reflectors
- Mirrors received advertisements to other routers
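A quick calculation shows why the full mesh does not scale. The Python sketch below compares it with a single route reflector that every other router peers with; the function names are hypothetical and redundancy (multiple reflectors) is ignored.

```python
def full_mesh_sessions(routers):
    # Every router peers with every other router: n * (n - 1) / 2
    return routers * (routers - 1) // 2


def single_reflector_sessions(routers):
    # One route reflector, all other routers are its clients.
    return max(routers - 1, 0)


for n in (5, 10, 100):
    print(n, full_mesh_sessions(n), single_reflector_sessions(n))
```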
iBGP neighbors simply have the same remote-as AS-NUMBER during configuration
iBGP next hop self
- An iBGP router has learned a route that its border router received from an eBGP peer. The route is shown in show ip bgp, but its next hop is still the address of that eBGP peer, to which the iBGP router has no route
- The route is therefore not in the routing table (show ip route)
- The eBGP router does not adjust the attribute "next hop"
- On the eBGP router set the attribute next hop for a specific neighbor
iBGP route reflector
- To overcome the iBGP requirement of a full mesh, an iBGP router can be configured as a route reflector
- Therefore the iBGP neighbors of this [designated] router that should receive reflected route advertisements are configured as route-reflector-client
iBGP Confederation
- Makes sense in bigger topologies
Route-map
Outbound path selection (LOCAL_PREF)
- Create a route-map that sets LOCAL_PREF
- Apply route-map to ISP-routes to influence outbound path selection (AS-wide).
Inbound path selection (AS-path prepending)
- Create a route-map that prepends additional instances of your own AS-number
- Apply route-map to outbound routes to ISP1 to influence inbound path selection.
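The effect of prepending on a remote AS can be illustrated with a few lines of Python. This is only a sketch: the AS numbers are from the private range, `prepend` is a hypothetical helper, and the remote side is assumed to decide purely on AS-path length.

```python
def prepend(as_path, own_asn, count):
    """Return the AS path with `count` extra copies of own_asn in front."""
    return (own_asn,) * count + tuple(as_path)


OWN_ASN = 65010
via_isp1 = prepend((OWN_ASN,), OWN_ASN, 2)  # announcement towards ISP1, prepended
via_isp2 = (OWN_ASN,)                       # announcement towards ISP2, unchanged

# A remote AS comparing only path length now prefers the path via ISP2.
preferred = min([("ISP1", via_isp1), ("ISP2", via_isp2)], key=lambda p: len(p[1]))
print(via_isp1, preferred[0])
```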
Enable IPv6 unicast routing (Cisco)
Multiprotocol BGP (IPv4/IPv6)
- Using address-families for IPv4/IPv6
- Configure IPv6 addresses on the interfaces
- Create route-map
- Setup BGP router
conf term
router bgp AS-NUMBER
 neighbor IP-ADDRESS_ISP2 remote-as AS-ISP2
 !ENABLE BOTH ADDRESS FAMILIES
 address-family ipv4
  network IPV4-PREFIX mask SNM
 exit
 address-family ipv6
  ! IPv6 networks are given as prefix/length, without a mask keyword
  network IPV6-PREFIX/PREFIX-LENGTH
  ! ACTIVATE IPV6 PEER
  neighbor IP-ADDRESS_ISP2 activate
  neighbor IP-ADDRESS_ISP2 route-map IPV6-NEXT-HOP out
 exit
- Test with
Equal cost multipath (ECMP)
notes.networklessons.com - BGP Equal Cost Multipath AS_Path Attribute
First make sure to have equal costs on the routes.
- There are multiple ways to achieve ECMP
- Inject multiple paths in routing table
- Configure loopback with address, set static host routes to neighbor loopbacks, enable ebgp-multihop and run BGP-session from loopback to loopback
conf term
! assumption: one static host route per parallel link; REMOTE-IFACE-ADDRESS1/2 are the neighbor's interface addresses
ip route REMOTE-IP-ADDRESS1 255.255.255.255 REMOTE-IFACE-ADDRESS1
ip route REMOTE-IP-ADDRESS1 255.255.255.255 REMOTE-IFACE-ADDRESS2
router bgp AS-NUMBER
 neighbor REMOTE-IP-ADDRESS1 remote-as PEER-AS-NUMBER
 neighbor REMOTE-IP-ADDRESS1 update-source loopback0
 neighbor REMOTE-IP-ADDRESS1 ebgp-multihop 2
- Use LACP/802.3ad, … and configure a BGP session over this joint link.
Security
Filter tcp/179
- ACL filters input on tcp/179 for devices not eligible to become neighbors.
- Protection from DoS by resource exhaustion (bandwidth, CPU, …)
- Spoofed TCP RST can bring down BGP sessions
- On control-plane or interface level
- Please also see:
- Rate-limit tcp/179 to reserve platform resources
Filter inbound own source-addresses
- Block spoofed packets (packets with a source IP address belonging to your own IP address space) at all edges of your network
- Protects the TCP session used by internal BGP (iBGP) from attackers outside the autonomous system
Prefix filtering
- in- or outbound
- match IPs, ASNs, or any other attribute of a prefix (e.g. communities)
- Filters
- Drop IPv4/6 special purpose prefixes
- prefixes of non-global scope
- Drop IPv4/6 unallocated prefixes
- RIR-Allocated Prefix Filters
Internet Routing Registry ToolSet
github.com irrtoolset/irrtoolset
- IRRToolSet is not actively maintained
- Instead use BGPQ4
github.com bgp/bgpq4 - bgp filtering automation tool
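Dropping special purpose prefixes can be sketched in Python with the standard `ipaddress` module. The list below is a small, illustrative subset (private, loopback, link-local and documentation ranges), not a complete bogon list, and the function names are hypothetical; production filters are usually generated with tools such as BGPQ4.

```python
import ipaddress

# Illustrative subset only, not a complete list of special purpose prefixes
SPECIAL_PURPOSE = [
    ipaddress.ip_network(p)
    for p in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # private use
        "127.0.0.0/8", "169.254.0.0/16",                  # loopback, link-local
        "192.0.2.0/24", "2001:db8::/32",                  # documentation
        "fc00::/7", "fe80::/10",                          # unique local, link-local
    )
]


def is_special_purpose(prefix):
    net = ipaddress.ip_network(prefix)
    # subnet_of() requires the same IP version, so check that first.
    return any(
        net.version == special.version and net.subnet_of(special)
        for special in SPECIAL_PURPOSE
    )


def filter_prefixes(prefixes):
    return [p for p in prefixes if not is_special_purpose(p)]


received = ["10.1.0.0/16", "8.8.8.0/24", "2001:db8:1::/48", "2a00::/12"]
print(filter_prefixes(received))
```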
Show prefix lists
show ip prefix-list
Create a prefix list
Session Protection
- Please also see:
Policies
- BGP is typically restricted by policies.
- Common policies are
- Bogon filtering
- Limit the routes learned to a maximum AS-path length
- Filter routes with origin "255" (devel)
Authentication
- Peer may be authenticated (using MD5)
IETF RFC2385 - Protection of BGP Sessions via the TCP MD5 Signature Option
IETF - The TCP Authentication Option (TCP-AO)
- Also possible using IPsec
Error tolerance
- Do not tear down BGP session, just withdraw the route with the potentially faulty attribute.
Some keywords