
How does ClearPass handle a 'Split-Brain' failure?

Here is the current network configuration:

 

2 DCs connected via a WAN.

 

Each DC has 10 CPPM 25k nodes. DC1 has the publisher, DC2 has the standby publisher. Each DC has 9 subscribers which serve RADIUS traffic.

 

In the event that the WAN link drops between DCs, the DC2 standby publisher will promote itself to active publisher and will try to contact all nodes to bring them under its control. However, since the WAN is down, it can only talk to subscribers in DC2, and DC1 subscribers will remain connected to the DC1 publisher.
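
The partition described above can be sketched as a toy reachability model. This is not ClearPass code: the node names, the `Node` type, and the helper functions are hypothetical, and the failover timer is ignored. It only illustrates why each DC ends up with its own active publisher while the WAN is down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    dc: str    # "DC1" or "DC2"
    role: str  # "publisher", "standby", or "subscriber"


def build_cluster() -> list[Node]:
    """One publisher in DC1, one standby in DC2, nine subscribers per DC."""
    nodes = [Node("dc1-pub", "DC1", "publisher"),
             Node("dc2-standby", "DC2", "standby")]
    for dc in ("DC1", "DC2"):
        nodes += [Node(f"{dc.lower()}-sub{i}", dc, "subscriber")
                  for i in range(1, 10)]
    return nodes


def active_publishers(nodes: list[Node], wan_up: bool) -> list[Node]:
    """Nodes acting as publisher. In this toy model the standby promotes
    itself whenever the WAN is down (assumption: timer already expired)."""
    active = [n for n in nodes if n.role == "publisher"]
    if not wan_up:
        active += [n for n in nodes if n.role == "standby"]
    return active


def publishers_seen_by(sub: Node, nodes: list[Node], wan_up: bool) -> list[str]:
    """Names of the active publishers a given subscriber can reach."""
    return [p.name for p in active_publishers(nodes, wan_up)
            if wan_up or p.dc == sub.dc]


cluster = build_cluster()
for sub in (n for n in cluster if n.name in ("dc1-sub1", "dc2-sub1")):
    print(sub.name, "->", publishers_seen_by(sub, cluster, wan_up=False))
```

With the WAN down, the DC1 subscriber only sees the DC1 publisher and the DC2 subscriber only sees the promoted standby, which is the split-brain condition.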

 

When the WAN link comes back online, the DC1 publisher will see that DC2 has taken over, go into a 'cleanup state', and stop all its services. This causes the subscribers in DC1 to lose their publisher, and from what I see in the lab, they do not call back to the DC2 publisher to get managed. So we end up with a bunch of orphaned nodes. Is there an issue in our config, or is this expected behaviour?

 

Just a small side question. During an authentication we add attributes to endpoints in the endpoint DB. If the primary publisher is down and clients authenticate to a subscriber, will these endpoint updates be published when the publisher comes back online, or is this data lost?

 

Thanks,


_ELiasz

-------------------
ACDX, ACCP, CISSP, CWNA

Re: How does ClearPass handle a 'Split-Brain' failure?

I'd recommend having a read of the clustering technote. Split brain should be avoided by having pub and standby pub on the same L2 broadcast domain.

 

There are warnings regarding this.

 

splitbrain.jpg

 

 


Cheers
James
----------------------------------------------------------------------
--------------------------@whereisjrw--------------------------
---------------------------------blog-------------------------------
ACCX #540 | ACMX #353 | ACDX #216 | AMFX #11
----------------------------------------------------------------------
----------------------------------------------------------------------


Re: How does ClearPass handle a 'Split-Brain' failure?

Technically this is Layer 2, but it is a Layer 2 extension between sites, so the Layer 2 subnet can still be split. I guess that's why we didn't see the warning while configuring the settings. I had looked in the tech note, but I guess I missed that, and when I did a find for 'split' nothing came up, as it's in the screenshot.

 

I get the concern; the customer had a requirement for multi-site full redundancy. I guess maybe I should propose that they put the standby publisher in the same DC, and if that DC goes down and they are in a bind, they can then manually promote one of the subscribers in DC2 to publisher.

 

How does the cluster handle a subscriber being promoted while both of the other 2 publishers are offline/not contactable? Will they see that a 3rd device was promoted, and both put themselves in a 'waiting' state?

 

Thanks for the info,

 

_ELiasz

-------------------
ACDX, ACCP, CISSP, CWNA

Re: How does ClearPass handle a 'Split-Brain' failure?

Hi Eliasz,

Did you ever get an answer to your question?

I am in a similar situation, with the potential for the layer-2 link between the DCs to fail. What happens then?

The original PUB keeps the VIP and keeps working, while the SUB begins auto-promotion, takes over the VIP, and becomes a PUB.

Not sure how this gets handled...
Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACCA
[If you found my post helpful, please give kudos!]

Re: How does ClearPass handle a 'Split-Brain' failure?

This link should answer your question:
http://www.arubanetworks.com/techdocs/ClearPass/Aruba_DeployGd_HTML/Content/Cluster%20Deployment/Standby_publisher.htm
Thank you

Victor Fabian
Lead Mobility Architect @WEI
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA

Re: How does ClearPass handle a 'Split-Brain' failure?

Thanks Victor.

I have it set up that way: the standby publisher will auto-promote itself after 10 min.
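
As a rough illustration of that timer, the check can be thought of as follows. This is a minimal sketch and not how ClearPass implements it: `should_promote`, its arguments, and the example timestamps are assumptions for illustration, with the 10-minute wait taken from the setup described above.

```python
from datetime import datetime, timedelta

FAILOVER_WAIT = timedelta(minutes=10)  # wait time used in this setup


def should_promote(last_contact: datetime, now: datetime,
                   wait: timedelta = FAILOVER_WAIT) -> bool:
    """True once the standby has gone `wait` or longer without
    reaching the publisher (hypothetical helper)."""
    return now - last_contact >= wait


# Arbitrary example timestamps.
t0 = datetime(2024, 1, 1, 12, 0, 0)
print(should_promote(t0, t0 + timedelta(minutes=5)))   # False
print(should_promote(t0, t0 + timedelta(minutes=10)))  # True
```

The check only looks at whether the publisher was reachable, so in this model a failed link and a failed publisher look identical to the standby.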

What I can't seem to understand is how ClearPass will behave when the PUB is still up and no longer sees the SUB (layer-2 link dead).
The SUB promotes itself to a PUB and the VIP flips over (layer-2 link dead).

I now have 2 PUBs.

What happens when the layer-2 link comes back??
Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACCA

Re: How does ClearPass handle a 'Split-Brain' failure?

Ah, I see. Unfortunately the preempt functionality doesn't exist today.

For this particular situation the solution is not pretty.

Once the L2 connection between the two nodes is restored, you will need to
manually convert the standby pub back to a subscriber.
Thank you

Victor Fabian
Lead Mobility Architect @WEI
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA

Re: How does ClearPass handle a 'Split-Brain' failure?

Ya ok I see the light now.

This makes sense, and I can live with converting the new pub back to a sub.

EDIT:

During the time the L2 link is broken, my assumption is that both CPPMs handle authentications.





Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACCA

Re: How does ClearPass handle a 'Split-Brain' failure?

Yes.

The only issue is that while both are up as pubs, no database sync will
happen during that time.
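
To illustrate why that matters, here is a toy sketch (not ClearPass code) of two publishers accepting endpoint writes independently during the outage. The MAC addresses, the attribute names, and the `diverged` helper are made up for illustration; how the records are reconciled once the standby is converted back is not covered here.

```python
def diverged(side_a: dict[str, dict], side_b: dict[str, dict]) -> dict[str, tuple]:
    """Map each endpoint key whose record differs (or exists on one side
    only) to the pair (record on A, record on B)."""
    keys = side_a.keys() | side_b.keys()
    return {k: (side_a.get(k), side_b.get(k))
            for k in sorted(keys)
            if side_a.get(k) != side_b.get(k)}


# Shared state before the link failed (hypothetical data).
baseline = {"aa:bb:cc:00:00:01": {"posture": "unknown"}}

# Each publisher starts from the baseline and takes writes on its own.
dc1 = {**baseline, "aa:bb:cc:00:00:02": {"posture": "healthy"}}
dc2 = {**baseline, "aa:bb:cc:00:00:01": {"posture": "quarantine"}}

for mac, (a, b) in diverged(dc1, dc2).items():
    print(mac, "DC1:", a, "DC2:", b)
```

Both endpoints show up as diverged: one was updated only on the DC2 side, the other was learned only on the DC1 side.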
Thank you

Victor Fabian
Lead Mobility Architect @WEI
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA

Re: How does ClearPass handle a 'Split-Brain' failure?

Ah yes that makes sense....

Preempt would be nice in this case...

Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACCA