Repeat after me, BGP is not scary

By joeneville posted Jun 15, 2016 02:32 PM

Just a few years ago, BGP was seen as a routing protocol for service providers, and most network engineers in the enterprise had little exposure to it. BGP was just something on the WAN routers. Fast forward to 2016 and BGP in the data centre is being discussed like it is a given, just another option alongside OSPF. That is a major change in how the networking industry perceives BGP, and one that some networkers may be a little apprehensive about. Well, I say feel the love and embrace change, but do take the time to learn the basics of BGP; it will pay dividends in the future.

Now I’m one of those people who actually has a favourite routing protocol. RIP? Too slow. OSPF? Like an M.C. Escher diagram: it looks simple until you look closer at the detail and have a ‘dude, what???’ moment. BGP? It’s a pro of a protocol, with so many attributes and features. It can be daunting at first, but once you start to tame that beast it becomes like a trusty tool: dependable, knowable and satisfyingly tweakable.

So it came as a bit of a shock to me when, a few months ago, in conversation with @netmanchris, we were discussing OSPF versus BGP for spine-and-leaf design, and he was building his initial configs around OSPF. He dropped the line on me, ‘well, a lot of data centre guys see BGP as scary’. Obviously I jumped to BGP’s defence, but that brief exchange really made me think about just how much of a change it is to go from a data centre built on spanning-tree and maybe a bit of OSPF, to Layer 3 throughout with BGP on your top-of-rack switches. For those who are feeling a little uneasy about this, I’m afraid things are about to get turned up to 11: there is a very real prospect of BGP reaching even further into the DC network, down to the virtual machine level, and let’s not forget using BGP to control containers.

I cut my networking teeth working NetOps for a service provider where BGP was the norm, so all this is music to my ears. But I think Chris had a very valid point, and the industry as a whole is not exactly cognizant of the fact that a Layer 3 DC fabric is a big deal. Any NetOps process built primarily to deal with spanning-tree and stretched L2 is going to need a major rewrite, and that’s not to mention the new skills that the Operations workforce is going to require. Forget Marvel’s technicoloured also-rans, anyone who can fix a spanning-tree meltdown is a true superhero in my eyes. Keeping networks up and running is a stressful job; when the screen goes all shades of red and yellow, it is experience and true understanding of the infrastructure that gets things back to green within SLA. The impact of changing the fundamental protocols of the network should not be underestimated.

So what’s the good news?

There are a number of benefits to BGP that I feel make it a good fit in the DC:

Knowable - BGP, to me at least, is much more predictable than other networking protocols. Sure, there are whole books filled with the rules, the configuration caveats, and the ‘gotchas’. But at least the information is laid out there for you to learn, you just need to put the time in.
Trustworthy - BGP was originally built for networking between different autonomous systems in the real world, i.e. different carriers, so there is an inherent lack of trust between you and your neighbours. Think about OSPF: you just need to put it on an interface and the protocol will happily build an adjacency, flooding the network with LSAs and creating one huge domain of link-state updates. BGP needs to be told who to peer with, so you are in control from the outset.
Controllable - again flowing from the lack of trust, BGP filters prefixes like no other protocol. OSPF just doesn’t come close to the granularity of control that you have over what you send to, and receive from, a peer. An OSPF Link-State Database is like reading an old telephone directory, ‘here’s everyone in the whole city’; a BGP table is like your WhatsApp contact list, only those you want to talk to are in there.
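
To make the ‘Trustworthy’ and ‘Controllable’ points a little more concrete, here is a minimal Python sketch. It is a toy model, not a BGP implementation: the BgpSpeaker class, the peer names and the prefixes are all made up for illustration. It only shows the two ideas above: nothing is exchanged with a neighbour that has not been explicitly configured, and each peer only receives the prefixes its filter permits.

```python
from ipaddress import ip_network


class BgpSpeaker:
    """Toy model (hypothetical): routes go only to explicitly configured peers."""

    def __init__(self, name):
        self.name = name
        # peer name -> list of networks we are willing to advertise to that peer
        self.peers = {}

    def add_peer(self, peer_name, export_allow):
        # Unlike OSPF on an interface, nothing happens until a peer is named here.
        self.peers[peer_name] = [ip_network(p) for p in export_allow]

    def advertise(self, peer_name, local_routes):
        if peer_name not in self.peers:
            return []  # unknown neighbour: no session, no routes
        allowed = self.peers[peer_name]
        # Send only routes that fall inside one of the permitted networks.
        return [
            route
            for route in local_routes
            if any(ip_network(route).subnet_of(a) for a in allowed)
        ]


leaf = BgpSpeaker("leaf1")
leaf.add_peer("spine1", ["10.1.0.0/16"])
routes = ["10.1.1.0/24", "10.1.2.0/24", "192.168.50.0/24"]

print(leaf.advertise("spine1", routes))    # ['10.1.1.0/24', '10.1.2.0/24']
print(leaf.advertise("stranger", routes))  # []
```

On a real device the same intent would be expressed with peer statements and prefix lists or route policies; the config guide for your platform has the actual syntax.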

If you are reading this and you’ve never configured BGP, or you’re a little hazy about the path decision process, here is a list of suggestions for how to go deep:

Get some hands-on: The vast majority of HPE Comware devices that run Layer 3 support a full suite of BGP.

If you do not have access to hardware (and who has that luxury nowadays?), fear not. HPE’s Virtual Services Router, the VSR1000, can be run as an OVA appliance in VirtualBox or VMware Workstation, and you can download and run it for free. Fire up a number of VSRs and build a BGP spine-and-leaf. It is actually pretty easy. Find out more here.

Another very interesting option is to use the open source Network Operating System, OpenSwitch, which runs, you’ve guessed it, BGP. More information here.

Learn the basics:

Config guides are your friend. If you’re looking at BGP on the VSR there is a full writeup in the VSR’s ‘IP Routing Configuration Guide’.

OSPF may seem like an easier routing protocol to get to grips with at first, but BGP offers unparalleled control and scale. It is a perfect fit for today’s L3 data centre fabrics, and it doesn’t need to be ‘scary’. Just take the time to understand the basics, and do not underestimate what a major change having this L3 protocol, or any other, in the data centre is.

Well that’s just my opinion. Agree? Disagree? Part of the OSPF fanclub? Want to start an East Coast/West Coast style beef and call out my boy, BGP? Feel free to hit me back with a comment below.

Comments

Denis Corbin

Nov 09, 2017 04:44 AM

Hi Joe,

I just came across this nice article you wrote some time ago. Let me put in my two cents on where I agree and disagree about BGP versus OSPF, and then explain my point of view on the presence of BGP in datacenters today.

First things first: if you find RIP slow, you should not find BGP much faster, at least compared to OSPF, unless you rely on tricks that tear down a BGP session as soon as an interface goes down. OSPF can converge in less than a second; BGP without such tricks, using default timers (a 180-second hold time on many platforms), needs up to 180 seconds to detect a session loss! Worse, BGP convergence is similar to RIP's (a distance vector of ASes versus a distance vector of routers): it is a wave that propagates from the point of perturbation and leads each node to calculate its new best path before propagating the wave further, while OSPF floods the minimal information without delay, letting all routers recalculate their new best paths at the same time, in parallel.
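
As a rough illustration of the timer point, here is a small Python sketch of how long a silent peer failure (no interface-down event, no TCP reset) can go unnoticed. The 60-second keepalive and 180-second hold time are assumed values, common defaults rather than anything mandated, and the function name is made up for this example. Timer jitter is ignored.

```python
def detection_window(hold_time=180, keepalive=60):
    """Return (best_case, worst_case) seconds between a silent peer failure
    and hold-timer expiry.

    The hold timer restarts whenever a KEEPALIVE or UPDATE arrives. If the
    peer dies right after one was received, the full hold time must elapse;
    if it dies just before its next keepalive was due, one keepalive
    interval has already been used up.
    """
    return (hold_time - keepalive, hold_time)


print(detection_window())       # (120, 180)
print(detection_window(9, 3))   # (6, 9) with aggressively tuned timers
```

Even with tuned timers this stays in the range of seconds, which is why link-down triggers are usually preferred for fast detection.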

Knowable?

From my point of view, having troubleshot both, BGP is less predictable than OSPF, because you may have a route pointing to a next hop that is not directly reachable, so you have to read the routing table a second time to see how to reach that first next hop, information that may come from another routing protocol (an IGP, for example). Best path selection is based on 10 different attributes, and you may spend time playing with AS prepending, MED, local-pref ... to get BGP to act as you want. Lastly, best path selection is not based on the number of routers crossed, on a sum of link distances, or on round-trip time, but on the number of ASes... an AS can range from a single router crossed in microseconds to a whole worldwide network crossed in milliseconds ... not to mention link bandwidth, which OSPF does take into account.
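
The "read the routing table a second time" step can be sketched in a few lines of Python. This is only an illustration of recursive next-hop resolution: the table layout, the prefixes and the function names are invented for the example, and real routers do this in the RIB/FIB rather than with dictionaries.

```python
from ipaddress import ip_address, ip_network


def longest_match(table, destination):
    """Return the most specific prefix in `table` that contains `destination`."""
    dest = ip_address(destination)
    candidates = [p for p in table if dest in ip_network(p)]
    if not candidates:
        return None
    return max(candidates, key=lambda p: ip_network(p).prefixlen)


def resolve(table, destination, max_depth=8):
    """Follow next hops until a directly connected interface is found."""
    target = destination
    for _ in range(max_depth):
        prefix = longest_match(table, target)
        if prefix is None:
            return None  # next hop unreachable: the route is unusable
        entry = table[prefix]
        if "interface" in entry:
            return entry["interface"]
        target = entry["next_hop"]  # look the next hop up again
    return None  # give up on suspiciously deep recursion


# Hypothetical routing table with three sources of information.
table = {
    "203.0.113.0/24": {"next_hop": "10.255.0.2"},  # learned via BGP
    "10.255.0.2/32": {"next_hop": "10.0.12.2"},    # learned via an IGP
    "10.0.12.0/30": {"interface": "eth1"},         # directly connected
}

print(resolve(table, "203.0.113.10"))  # eth1
print(resolve(table, "198.51.100.1"))  # None
```

Here the BGP route only becomes usable because the IGP supplies a path to its next hop, which is exactly the extra lookup described above.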

Trustworthy and controllable?

Yes, BGP was initially designed to be used between separately administered networks (two companies with their own responsibilities). Note, however, that you can stop OSPF from blindly building an adjacency by using MD5 authentication ... OK, MD5 is easily crackable today :-) but OSPF remains an IGP (Interior Gateway Protocol)... After that came MP-BGP, where BGP is used inside a single administrative network to support different AFI/SAFI (aka address families) for L3VPN/MPLS, VPLS, EVPN and so on: things network providers/telcos use. But you are right, MP-BGP is an extension of BGP; most of the time you will use BGP.

BGP inside the Datacenter

In the datacenter, the need for BGP seems to come from the choice Facebook made for their VXLAN underlay (they proudly published an article on that topic some years ago). IMHO, this can be explained by the fact that Facebook's datacenter is a huge one, something only the GAFA have: OSPF has some scalability drawbacks, hence the notion of the OSPF area (common practice is not to put more than 50 to 64 routers in a single area, depending on the source). But Facebook could have had as many areas as they wanted in order to overcome this restriction, right? Yes, but try as you might, a leaf&spine topology (and worse, a hierarchical leaf&spine in Facebook's case) is not suited to a multi-area design where area 0 is the backbone that connects all the others... so BGP is probably a more scalable solution, at the cost of convergence time. Link-down detection that triggers the BGP peer down can mitigate this problem, giving fast detection but still slow recovery, which is not such a big problem when you have a high level of redundant links and devices, as in Facebook's datacenter.

Does this mean that OSPF is not good for a VXLAN underlay/leaf&spine topology? I guess not, unless you have or expect to have more than around 120 racks per datacenter. If you pair them two by two with a switch virtualization technology like IRF, VSS or backplane stacking for active/active server bonding/EtherChannel (LACP), these 60 OSPF instances plus the spines should fit into a single OSPF area without the pain of configuring BGP peerings, the associated tricks for better link loss detection, managing AS numbers and so on. So OSPF should be simpler to implement, provide faster convergence and, from my point of view, be simpler to troubleshoot.
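
The arithmetic behind that estimate can be written out as a short Python sketch. The spine count of 4 is purely an assumption for illustration (the comment does not give one), and the 64-router ceiling is the upper end of the rule of thumb quoted earlier, not a hard protocol limit.

```python
import math


def ospf_routers_in_area(racks, racks_per_logical_leaf=2, spines=4):
    """Count OSPF routers when ToR switches are virtualised in pairs.

    Each pair of racks shares one logical leaf (e.g. an IRF or VSS pair),
    so it shows up as a single OSPF router. `spines` is an assumed value.
    """
    leaves = math.ceil(racks / racks_per_logical_leaf)
    return leaves + spines


AREA_GUIDELINE = 64  # upper end of the 50-64 rule of thumb

total = ospf_routers_in_area(120)
print(total, total <= AREA_GUIDELINE)  # 64 True
print(ospf_routers_in_area(200))       # 104, well past the guideline
```

With more spines, or unpaired leaves, the same 120 racks would exceed the guideline, which is where the scalability argument for BGP starts to bite.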

But... which CIO today would like to take the risk to be criticized for having implemented something the big ones have rejected? Wouldn't it just show his lack of ambition/perspectives/understanding? ;-)

cappalli

Jun 22, 2016 02:05 PM

BGP is also incredibly resource intensive.

joeneville

Jun 22, 2016 01:59 PM

Thanks!

Regarding BGP support on Aruba controllers, personally I think it is because of the product's target market, which focuses on Layer 2 & Campus networks. While BGP use cases are expanding, I believe that, in the campus, the demand for support is relatively limited today.

Christoffer

Jun 22, 2016 07:30 AM

Sounds great, any thoughts on why there's no BGP support on Aruba controllers?
