CPPM Slow replication, and communication issues

  • 1.  CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 01:10 PM

    Hi,

     

    I was just wondering: in a situation where you have both the management interface and the data port interface configured, which interface is used to transfer the different kinds of data produced by the CPPM?

     

    I know that the data interface is used to handle all client communication (Onboarding, RADIUS, etc.).

    For cluster replication data, which interface is used to send data between the Publisher and the Subscriber?

     

    Thank you,

     

    Cheers



  • 2.  RE: CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 01:40 PM

    [Attached screenshot: 2015-12-02 13_39_18-Microsoft Edge.png]



  • 3.  RE: CPPM Slow replication, and communication issues

    EMPLOYEE
    Posted Dec 02, 2015 01:47 PM


  • 4.  RE: CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 02:00 PM

    Take a look at this TechNote I wrote; it covers most of what you should need.

     

    CPPM Service Routing TechNote - V3

     

    It's found where all my other TechNotes are located:

     

    https://support.arubanetworks.com/Documentation/tabid/77/DMXModule/512/Default.aspx?EntryId=7961

     

     

     



  • 5.  RE: CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 02:51 PM

    I found a KB that talks about how data is transferred, here in this post.

    It references the interfaces used when sending data, but what about the receiving interface on the destination CPPM?

     

    We are trying to understand this to help answer some new issues that have been arising with our global cluster implementation. 



  • 6.  RE: CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 03:15 PM
    What do you mean by this:
    "It references the interfaces used when sending data, but what about the receiving interface on the destination CPPM?"

    Are you load balancing your traffic?
    - From the controller?
    - Do you have a load balancer in front of your CPPM servers?



  • 7.  RE: CPPM Slow replication, and communication issues

    Posted Dec 02, 2015 03:49 PM

    Thanks so much for the responses, guys! I was in the middle of writing a reply for too long and didn't notice them.

     

    The KB makes things clearer!

     

    With this knowledge, it makes the issues we are experiencing even stranger. 

    We have two routes to one of our global locations. When we route traffic using route A, the publisher and subscriber see each other and replication occurs (albeit slowly). When we route traffic using route B, the CPPM in our remote location loses contact with the Publisher CPPM. The two servers can ping each other, but to the subscriber the publisher is completely offline. The publisher, though, does not report that the subscriber is down.

     

    We tested to ensure routing and firewall rules are okay. Everything looks okay, but still the issue exists.
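    Since ping only proves ICMP reachability, a TCP-level check can help separate "the path is up" from "the cluster services are reachable". Below is a minimal sketch in Python, meant to be run from a test host on the same network path. The address and the port list in the usage comment are assumptions for illustration (443 and 5432 are commonly cited for ClearPass cluster traffic, but confirm against the TechNote for your version).

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to the result of a TCP reachability test."""
    return {port: tcp_reachable(host, port, timeout) for port in ports}


# Hypothetical usage (address and ports are placeholders, not from the thread):
#   check_ports("192.0.2.10", [443, 5432])
```

    If ping succeeds but these checks fail on only one of the two routes, that points at something treating TCP differently from ICMP along that path, or at the return traffic taking a different route.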

     

    We are not using any load balancing anywhere. We have no VIP.

     

    Sorry for not explaining myself properly.

     

    In the case where a publisher sends replicated data to the subscriber, would that data leave the management interface on the publisher and enter the subscriber on the subscriber's management interface? Is this correct?



  • 8.  RE: CPPM Slow replication, and communication issues

    Posted Dec 26, 2015 09:03 AM

    @bourne wrote:

    In the case where a publisher sends replicated data to the subscriber, would that data leave the management interface on the publisher and enter the subscriber on the subscriber's management interface? Is this correct?


    not sure if you are still looking into this, but i don't believe we can predict how it goes without knowing your full setup.

     

    from the technote: In reference to clustering traffic, the management IP address of the publisher needs to be accessible to all subscribers. The subscribers may reach the publisher’s management IP either through the subscriber’s management interface or data interface based on network routing set up.

     

    so it depends on how you route the traffic. in general that is how it works with CPPM, it does what you tell it to do.
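    To illustrate "it depends on routing": on a general-purpose host you can ask the OS which local address it would use to reach a given destination, without sending any traffic. This is a generic sketch in Python, not something the ClearPass CLI provides; the function name and the default port are arbitrary choices for the example.

```python
import socket


def source_address_for(destination, port=9):
    """Return the local IPv4 address the OS would select to reach destination.

    Connecting a UDP socket sends no packets; it only makes the kernel
    pick a route and bind a matching source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((destination, port))
        return s.getsockname()[0]
```

    On a dual-homed host, calling this with the publisher's management IP would show whether the management or the data address is chosen as the source, which is the decision the technote describes as being driven by the network routing setup.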



  • 9.  RE: CPPM Slow replication, and communication issues

    Posted Dec 26, 2015 09:48 AM

    I am still working on this yes.

    I am still unable to explain the behavior.

     

    We have done a bunch of different tests on the routes and we have no idea why connectivity is lost when switching routes. 

     

    There must be something configured incorrectly somewhere, but given that this issue is only occurring between the publisher and subscriber, it is hard to pin down what could possibly be the cause.

     

    All this leads back to the issues we are experiencing with replication. I have asked our reseller to get Aruba support involved to help validate our setup. I want to make sure that I haven't done something stupid that is causing these issues.



  • 10.  RE: CPPM Slow replication, and communication issues

    Posted Mar 10, 2016 08:33 AM

    I have an update on this issue.

     

    There are two problems that eventually came out of all of this.

    1. Replication between our two sites was very slow
    2. When we switch between the available routes, the subscriber loses all access to the publisher

    Issue 1 resolution

    • It is still early, but I believe this issue has been resolved by increasing the "Replication Batch Interval"
    • This was recommended to us by an Aruba tech
    • Basically, because the latency between the two sites is high, the subscriber could keep up only during off-peak times. During peak times it would slowly fall behind and would almost never be able to catch up. Increasing the replication interval means replication occurs less frequently, with more changes added to each batch. The trade-off is that changes take longer to reach the subscriber, but there is more time available for each batch to be sent from the publisher and processed by the subscriber (someone please correct me if I am wrong on this!)
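    The reasoning in the last bullet can be sketched with a toy model in Python. It assumes each batch pays a fixed overhead (for example, round trips over the high-latency link) plus a small cost per change. That cost split, the function names, and the parameters are all assumptions for illustration and not a description of how ClearPass actually replicates.

```python
def batch_time(changes, overhead_s, per_change_s):
    """Seconds needed to send and apply one batch under the toy model."""
    return overhead_s + changes * per_change_s


def backlog_after(intervals, interval_s, rate_per_s, overhead_s, per_change_s):
    """Seconds of unfinished replication work left after N intervals."""
    backlog = 0.0
    for _ in range(intervals):
        # A new batch holds every change made during the last interval.
        backlog += batch_time(rate_per_s * interval_s, overhead_s, per_change_s)
        # The subscriber can work off at most one interval's worth of time.
        backlog = max(0.0, backlog - interval_s)
    return backlog


def min_interval(rate_per_s, overhead_s, per_change_s):
    """Smallest interval at which the subscriber keeps up, or None if never."""
    load = rate_per_s * per_change_s
    if load >= 1:
        return None  # per-change cost alone exceeds the available time
    return overhead_s / (1 - load)
```

    In this model a short interval lets the fixed per-batch overhead dominate, so the backlog grows without bound at peak change rates, while a longer interval amortizes that overhead across more changes. If the per-change cost alone saturates the link, no interval helps.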

    Issue 2 resolution

    • Full disclosure, we are still waiting on confirmation from Aruba on this
    • After a bunch of testing from the command line (performed by an Aruba tech), we determined that it appeared as though no traffic from the subscriber was actually making it to the publisher. I say "appeared" because it was difficult for us to validate this 100%. The Wireshark captures we took appeared to show that some traffic was in fact making it through, yet the subscriber was still reporting the publisher as down.
    • My initial thought was that one of our core switches was for some reason not forwarding the traffic. The perplexing thing, though, was that the two devices could ping each other.
    • We eventually decided to try resetting the routing table on both the publisher and subscriber using the following command: network ip reset
    • After issuing this command on both the publisher and subscriber, communication was restored, much to the amazement of everyone
    • One key route was removed from the routing table. The Aruba tech wasn't sure what it was, and this is what we are waiting on confirmation of. The routing table (before the reset) is below:
    • 0: from all lookup local
      220: from all lookup 220
      10020: from all to x.x.x.x/24 lookup mgmt
      10040: from x.x.x.x lookup mgmt
      10060: from x.x.x.x lookup data
      32766: from all lookup main
      32767: from all lookup default
    • The bolded line was the only line removed after the network ip reset, and none of us knew what the line was doing or where it came from.
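    The listing above has the shape of Linux policy-routing rules, where rules are checked in ascending priority and each matching rule's table is consulted until one yields a route. Assuming that interpretation holds here, the Python sketch below shows which tables would be consulted, and in what order, for a given source and destination. The addresses in EXAMPLE_RULES are made-up stand-ins for the redacted x.x.x.x values.

```python
import ipaddress
import re

# Matches lines such as "10020: from all to 10.1.0.0/24 lookup mgmt".
RULE_RE = re.compile(
    r"^\s*(\d+):\s+from\s+(\S+)(?:\s+to\s+(\S+))?\s+lookup\s+(\S+)\s*$"
)

# Hypothetical addresses replace the redacted values from the post.
EXAMPLE_RULES = """
0: from all lookup local
220: from all lookup 220
10020: from all to 10.1.0.0/24 lookup mgmt
10040: from 10.1.0.5 lookup mgmt
10060: from 10.2.0.5 lookup data
32766: from all lookup main
32767: from all lookup default
"""


def parse_rules(text):
    """Parse a rule listing into (priority, src, dst, table) tuples."""
    rules = []
    for line in text.splitlines():
        m = RULE_RE.match(line)
        if m:
            prio, src, dst, table = m.groups()
            rules.append((int(prio), src, dst or "all", table))
    return sorted(rules)


def _matches(selector, address):
    """True if address falls within the selector ("all", a host, or a prefix)."""
    if selector == "all":
        return True
    network = ipaddress.ip_network(selector, strict=False)
    return ipaddress.ip_address(address) in network


def tables_consulted(rules, src, dst):
    """List routing tables in the order they would be tried for src -> dst."""
    return [t for _, s, d, t in rules if _matches(s, src) and _matches(d, dst)]
```

    One thing this makes visible: a "from all" rule with a low priority number is consulted for every packet before the mgmt and data rules, so any route in its table would take precedence over them.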

    Sorry for the long winded reply, I just wanted to share our experience.

     

    Once I get additional details from Aruba, I will update this thread.



  • 11.  RE: CPPM Slow replication, and communication issues

    Posted Mar 14, 2016 10:12 AM

    Just wanted to give another quick update on this.

     

    We were able to actually reproduce this issue by changing the routes again between our Chinese and Canadian locations. 

    The strange 200 route was added back to the Chinese CPPM server, which subsequently lost connectivity to our Canadian Publisher.

    Communication was lost after the following events

    • traffic taking route A
    • traffic forced to take route B - ClearPass servers still communicating without issue
    • traffic forced back to route A - ClearPass in China can no longer communicate with Canadian Publisher

     

    After resetting the routing table, communication was restored.
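    For anyone trying to reproduce this, comparing the rule listing captured before and after each route change makes it easy to spot an entry that appears or disappears. Below is a small Python sketch that works on two saved text listings. The function names are invented for the example, and how you capture the listings is left to whatever your platform provides.

```python
def normalize(listing):
    """Return the set of non-empty lines with whitespace collapsed."""
    return {" ".join(line.split()) for line in listing.splitlines() if line.strip()}


def diff_rules(before, after):
    """Return (added, removed) lines between two rule listings."""
    b, a = normalize(before), normalize(after)
    return sorted(a - b), sorted(b - a)
```

    Running diff_rules on listings saved at each of the three steps above would show exactly when the extra rule comes back.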

     

    Curious if anyone else has experienced this type of behavior before?