Controllerless Networks


Clients losing connection

  • 1.  Clients losing connection

    Posted Mar 31, 2016 07:20 PM

    Hi everyone,

     

    I have about 20 IAPs deployed and manage them through Airwave. Since yesterday, I have noticed that at random times most clients lose connectivity to the IAP; after a few minutes they rejoin as if nothing had happened.

     

    I took a look in the VC logs and found a LOT of errors like the one below:

     

    <WARN> |AP AP03A001@10.17.38.197 cli|  Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 1, recover_sent 6.

     

     

    AP03A001# show log debug

    Mar 31 18:42:14 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-18:89:5b:0e:9e:b4 bssid-40:e3:d6:a4:05:22 ssid-wifi-i.
    Mar 31 18:42:16 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-90:8d:6c:88:84:b0 bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:17 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 1, recover_sent 6.
    Mar 31 18:42:17 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 1, recover_sent 6.
    Mar 31 18:42:18 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-28:5a:eb:b7:5d:73 bssid-40:e3:d6:a4:0b:02 ssid-wifi-i.
    Mar 31 18:42:19 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 3, recover_sent 6.
    Mar 31 18:42:20 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-18:89:5b:0e:9e:b4 bssid-40:e3:d6:a4:05:22 ssid-wifi-i.
    Mar 31 18:42:23 cli[2904]: <541023> <WARN> |AP AP03A001@10.17.38.197 cli| swarm_timer_handler,8645: del client e4:98:d1:71:0d:a4, client count 164.
    Mar 31 18:42:23 cli[2904]: <541023> <WARN> |AP AP03A001@10.17.38.197 cli| swarm_timer_handler,8645: del client 18:34:51:e9:92:e8, client count 163.
    Mar 31 18:42:23 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.158, slave 39377 vs master 9804, error_cnt 2, recover_sent 6.
    Mar 31 18:42:27 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 2, recover_sent 6.
    Mar 31 18:42:27 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-90:8d:6c:88:66:ab bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:27 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 2, recover_sent 6.
    Mar 31 18:42:28 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-90:8d:6c:88:66:ab bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:29 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 4, recover_sent 6.
    Mar 31 18:42:30 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-34:aa:8b:c4:23:25 bssid-40:e3:d6:a4:07:d2 ssid-wifi-i.
    Mar 31 18:42:32 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-20:78:f0:e4:b8:06 bssid-40:e3:d6:95:59:e2 ssid-wifi-i.
    Mar 31 18:42:33 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.158, slave 39377 vs master 9804, error_cnt 3, recover_sent 6.
    Mar 31 18:42:34 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-28:5a:eb:b7:5d:73 bssid-40:e3:d6:a4:0b:12 ssid-wifi-i.
    Mar 31 18:42:35 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-80:be:05:eb:fb:ef bssid-40:e3:d6:a4:08:c2 ssid-wifi-i.
    Mar 31 18:42:35 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-34:aa:8b:c4:23:25 bssid-40:e3:d6:a4:07:c2 ssid-wifi-i.
    Mar 31 18:42:36 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-e0:75:7d:f7:36:98 bssid-40:e3:d6:95:54:e2 ssid-wifi-i.
    Mar 31 18:42:37 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 3, recover_sent 6.
    Mar 31 18:42:38 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 3, recover_sent 6.
    Mar 31 18:42:38 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-78:fd:94:0a:ec:d9 bssid-40:e3:d6:a4:05:32 ssid-wifi-i.
    Mar 31 18:42:39 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 5, recover_sent 6.
    Mar 31 18:42:42 mini_httpd[10669]: handle_request: 1932: got nothing, child exit after 0 requests
    Mar 31 18:42:42 mini_httpd[10670]: handle_request: 1932: got nothing, child exit after 0 requests
    Mar 31 18:42:43 mini_httpd[10671]: send_error: 3670: child exit after 0 requests
    Mar 31 18:42:43 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.158, slave 39377 vs master 9804, error_cnt 4, recover_sent 6.
    Mar 31 18:42:44 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-90:8d:6c:72:0f:8e bssid-40:e3:d6:a4:08:92 ssid-wifi-i.
    Mar 31 18:42:44 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-80:65:6d:58:2f:9b bssid-40:e3:d6:95:3c:52 ssid-wifi-i.
    Mar 31 18:42:44 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-18:f6:43:56:86:a7 bssid-40:e3:d6:a4:08:52 ssid-wifi-i.
    Mar 31 18:42:45 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-bc:6e:64:ae:0b:8c bssid-40:e3:d6:a4:04:d2 ssid-wifi-i.
    Mar 31 18:42:45 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-18:89:5b:0e:9e:b4 bssid-40:e3:d6:a4:05:22 ssid-wifi-i.
    Mar 31 18:42:45 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-bc:6e:64:ae:0b:8c bssid-40:e3:d6:a4:04:d2 ssid-wifi-i.
    Mar 31 18:42:47 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 4, recover_sent 6.
    Mar 31 18:42:48 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 4, recover_sent 6.
    Mar 31 18:42:49 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 6, recover_sent 6.
    Mar 31 18:42:50 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-80:65:6d:58:2f:9b bssid-40:e3:d6:95:3c:52 ssid-wifi-i.
    Mar 31 18:42:50 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-18:89:5b:0e:9e:b4 bssid-40:e3:d6:a4:05:22 ssid-wifi-i.
    Mar 31 18:42:50 dropbear[10731]: Child connection from 10.17.40.68:20262
    Mar 31 18:42:52 stm[2934]: stm_send_sta_ageout_offline: Sending sta ageout offline msg to CLI0, mac='0c:e7:25:6f:ef:4c'
    Mar 31 18:42:52 stm[2934]: rap_bridge_user_handler: 12684: user entry deleted for '10.17.40.175' '0c:e7:25:6f:ef:4c'
    Mar 31 18:42:52 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_ageout_offline: receive station msg, mac-0c:e7:25:6f:ef:4c bssid-00:2d:66:00:00:00 ssid-.
    Mar 31 18:42:52 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-0c:e7:25:6f:ef:4c bssid-40:e3:d6:a4:04:c2 ssid-wifi-i.
    Mar 31 18:42:52 sapd[2914]: sapd_proc_stm_reset_key: Got STM Reset key bss=40:e3:d6:a4:11:82 mac=0c:e7:25:6f:ef:4c, idx=0
    Mar 31 18:42:52 cli[2904]: <541023> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_move_req,2247: del client 0c:e7:25:6f:ef:4c, client count 162.
    Mar 31 18:42:52 cli[2904]: <541013> <WARN> |AP AP03A001@10.17.38.197 cli| recv_user_sync_message,6356: add client 0c:e7:25:6f:ef:4c, client count 163.
    Mar 31 18:42:52 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-0c:e7:25:6f:ef:4c bssid-40:e3:d6:a4:04:c2 ssid-wifi-i.
    Mar 31 18:42:54 awc[2903]: papi_receive_callback: 4599: received CLI_AWC_POST_REQUEST
    Mar 31 18:42:54 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-70:3e:ac:d1:31:f9 bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:54 awc[2903]: awc_post: 3438: sent header 'POST /swarm HTTP/1.1^M Host: 10.17.0.47^M Content-Length: 43122^M X-Type: stat^M X-Guid: 2dac442c010160caee096318ddbcacb3b08be78d284a6166b8^M Cookie: 92c5794bc0e978be7a51d7afb320e9a6^M Content-Encoding: gzip^M X-OEM-Tag: Dell^M ^M '
    Mar 31 18:42:54 awc[2903]: awc_post: 3444: wrote header 'POST /swarm HTTP/1.1^M Host: 10.17.0.47^M Content-Length: 43122^M X-Type: stat^M X-Guid: 2dac442c010160caee096318ddbcacb3b08be78d284a6166b8^M Cookie: 92c5794bc0e978be7a51d7afb320e9a6^M Content-Encoding: gzip^M X-OEM-Tag: Dell^M ^M ' and body
    Mar 31 18:42:54 awc[2903]: Message over SSL from 10.17.0.47, SSL_read() returned 118, errstr=Success, Message is "HTTP/1.1 200 OK^M X-Manage-Mode: manage^M X-No-Stat: no^M Cookie: 92c5794bc0e978be7a51d7afb320e9a6^M Content-Length: 0^M ^M ", AWC response: (null)
    Mar 31 18:42:55 dropbear[10731]: User: admin login by ssh successful.
    Mar 31 18:42:55 syslog: trace_rotate_file: rotating /var/log/trace/cli1_2.log
    Mar 31 18:42:55 syslog: trace_on: tracing to "/var/log/trace/cli1_2.log" started
    Mar 31 18:42:57 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-70:14:a6:58:67:b8 bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:57 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 5, recover_sent 6.
    Mar 31 18:42:58 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 5, recover_sent 6.
    Mar 31 18:42:59 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-70:14:a6:58:67:b8 bssid-40:e3:d6:a4:17:72 ssid-wifi-i.
    Mar 31 18:42:59 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 7, recover_sent 6.
    Mar 31 18:43:00 cli[2904]: <541004> <WARN> |AP AP03A001@10.17.38.197 cli| recv_sta_update: receive station msg, mac-30:a8:db:fe:a7:8b bssid-40:e3:d6:a4:05:42 ssid-wifi-i.
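
    For anyone who wants to see which IAPs are reporting the mismatch and how far each error counter has climbed, here is a minimal Python sketch that tallies the checksum warnings per AP. It assumes the message reads exactly as in the single warning quoted earlier in this post ("Check sum error for AP-..., slave ... vs master ..., error_cnt ..., recover_sent ..."), with any terminal line-wrap spaces removed. The function name `summarize_checksum_errors` and the `SAMPLE` text are hypothetical, made up for illustration; the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Pattern for the body of a checksum warning, based only on the message
# format quoted in this post (an assumption, not an official format spec).
CHECKSUM_RE = re.compile(
    r"Check sum error for AP-(?P<ap>\d+\.\d+\.\d+\.\d+), "
    r"slave (?P<slave>\d+) vs master (?P<master>\d+), "
    r"error_cnt (?P<cnt>\d+), recover_sent (?P<sent>\d+)"
)


def summarize_checksum_errors(log_text):
    """Group checksum warnings by AP IP address.

    Returns a dict mapping each AP IP to the number of warnings seen,
    the highest error_cnt reported, and the set of (slave, master)
    checksum pairs that appeared for that AP.
    """
    summary = defaultdict(
        lambda: {"warnings": 0, "max_error_cnt": 0, "checksums": set()}
    )
    for match in CHECKSUM_RE.finditer(log_text):
        entry = summary[match["ap"]]
        entry["warnings"] += 1
        entry["max_error_cnt"] = max(entry["max_error_cnt"], int(match["cnt"]))
        entry["checksums"].add((int(match["slave"]), int(match["master"])))
    return dict(summary)


# Four warnings taken from the log in this post.
SAMPLE = """\
Mar 31 18:42:17 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 1, recover_sent 6.
Mar 31 18:42:17 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.8, slave 39377 vs master 9804, error_cnt 1, recover_sent 6.
Mar 31 18:42:19 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.2, slave 39377 vs master 9804, error_cnt 3, recover_sent 6.
Mar 31 18:42:27 cli[2904]: <341132> <WARN> |AP AP03A001@10.17.38.197 cli| Check sum error for AP-10.17.38.7, slave 39377 vs master 9804, error_cnt 2, recover_sent 6.
"""

for ap, info in sorted(summarize_checksum_errors(SAMPLE).items()):
    print(ap, info["warnings"], info["max_error_cnt"], sorted(info["checksums"]))
```

    If every AP shows the same slave/master pair, as in the excerpt above, that would be consistent with the slaves sharing one configuration that differs from the master's, rather than each AP drifting on its own.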

     

     

    The Airwave logs show all the IAPs going to a 'down' state after the following message:

     

    W-Instant AP with IP 10.17.38.4PSK based authentication: not contain a valid X-Shared-Secret in request header.

     

     

    Example:

    Thu Mar 31 16:06:43 2016 System Device Dell PowerConnect W-AP215 40:e3:d6:c2:xx:xx Status changed to 'Controller is Down' 16 Top > N
    Thu Mar 31 16:06:43 2016 System Device Dell PowerConnect W-AP215 40:e3:d6:c2:xx:xx Down 16 Top > N
    Thu Mar 31 16:06:42 2016 System Device Dell PowerConnect W-AP215 40:e3:d6:c1:xx:xx Status changed to 'Controller is Down' 17 Top > N
    Thu Mar 31 16:06:42 2016 System Device Dell PowerConnect W-AP215 40:e3:d6:c1:xx:xx Down 17 Top > N
    Thu Mar 31 16:06:42 2016 System Device Dell W-Instant Virtual Controller Instant Virtual Controller Status changed to 'Virtual Controller authentication failed' 1 Top > N

     

    Here are the client statistics for today, showing several disconnections:

     

    image002.png

     

    I found a configuration mismatch between the VC and the IAPs and rebooted everything. They came back showing the "Good" status, but the problem kept happening.

     

    Has anyone here experienced a similar problem?

     



  • 2.  RE: Clients losing connection

    Posted Mar 31, 2016 07:33 PM

    What version of Instant code are you running?