04-14-2017 02:19 PM
I've now upgraded to 220.127.116.11 and it didn't make any difference. It was even slightly slower this time :)
64>iperf3.exe -c 192.168.88.99 -t 200
Connecting to host 192.168.88.99, port 5201
[  4] local 192.168.88.53 port 53029 connected to 192.168.88.99 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.01   sec  4.25 MBytes  35.3 Mbits/sec
[  4]   1.01-2.01   sec  4.00 MBytes  33.4 Mbits/sec
[  4]   2.01-3.00   sec  4.00 MBytes  34.0 Mbits/sec
[  4]   3.00-4.01   sec  4.00 MBytes  33.5 Mbits/sec
This test also with only this single client connected. New show-tech attached.
04-14-2017 03:05 PM
I hate to ask for this, but can you share the working show-tech from 18.104.22.168 as well? I have to take these to another group of people and would like a complete set for 22.214.171.124. I think I can use the older set of pcaps though.
04-14-2017 03:45 PM
Hmmm.. After I upgraded the IAP, I just reconnected the same client and then had the same problem.
Then I rebooted the client, and since then I haven't been able to trigger the problem again for more than 1h.
Now I've even connected my Mac and iPhone again, and still don't see the problem.
With a little luck the upgrade might actually have done the trick, but the client also needed a reboot to clear its head :)
But too soon to judge... I'll keep testing and keep you updated. Just let me know if you need any more info.
04-14-2017 04:25 PM
I give up for tonight - I can't reproduce the problem any longer :)
I've connected 2 more laptops and my iPhone, moved them to bad-signal positions, pushed data from multiple clients, disconnected/connected - but no matter what, my test client keeps performing well!
We'll see tomorrow if the happiness lasts, but it has never been this stable before - that's for sure.
04-15-2017 07:17 AM
Unfortunately it seems like I just had a lucky time last night.
Today I've been testing with 126.96.36.199, 188.8.131.52 and 184.108.40.206 on another IAP-315 (same config), and all gave the same problem. Then I switched back to the other IAP running 220.127.116.11 that worked well last night, and rebooted my client, but even on my first attempt it locked at 40M :(
At least it's very binary: when it works well I can see with tcpdump on my iperf server that the client is sending large frames (aggregated, I assume), but when it's not working I see standard-sized frames.
So I'm gonna see if I can figure out how to manually disable A-MPDU and A-MSDU, to check whether the throughput then matches the non-working scenario, or if something else is also playing tricks on us.
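Since I also boot Ubuntu on this laptop later on, one way to try that on the Linux side is the iwlwifi driver's 11n_disable module parameter. The exact bitmap values below are an assumption from my kernel's modinfo output, so treat this as a config sketch and check `modinfo iwlwifi` on your own kernel first:

```shell
# Hypothetical /etc/modprobe.d/iwlwifi-noagg.conf
# Bitmap (assumed, verify with `modinfo iwlwifi`):
#   1 = disable 802.11n entirely
#   2 = disable TX aggregation
#   4 = disable RX aggregation
options iwlwifi 11n_disable=6

# Then reload the driver (this drops the WiFi connection):
#   modprobe -r iwlwifi && modprobe iwlwifi
```

That should force the client into a no-aggregation state on purpose, so I can compare the resulting throughput against the broken scenario.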
I can also add that with firmware up to and including 18.104.22.168 I saw intermittent drop-offs from my SSID, where the client switched over to another SSID for no apparent reason. So far I've not seen that with 22.214.171.124+.
The only way I've been able to trigger the problem with 126.96.36.199+ is by disconnecting/connecting to the network, so I'm wondering if it could be a client-side problem: when the client connects to another network it learns that that network doesn't support frame aggregation, and then for some reason it sticks with that even when joining my IAP-315 network.
It kind of makes sense, as a client reboot always resolves the problem.
Attached are show-techs from today's testing.
04-15-2017 03:31 PM
Dooh, this is driving me crazy! Things feel much more stable with the IAP running 188.8.131.52 code. At least I don't have any unexplained drop-offs, and I find it very hard to trigger my initial problem - it has only happened twice today.
Currently I'm stuck in a "new" state, IAP running on 184.108.40.206 code, where my client can push about 80-95 Mbps.
Capture of what's happening in the air is very similar to the working scenario. I can see BA (Block Acknowledgement) and A-MPDU being used in both cases.
Then just a clarification regarding the tcpdumps on my iperf server, where I earlier saw very large frames and thought those were the result of frame aggregation. That's not the case - it was just TCP Segmentation Offloading on the NIC fooling me. After turning that off I always see normal 1514-byte frames there in all cases.
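For anyone else getting fooled by the same thing: the offloads can be inspected and switched off with ethtool on the capture-side NIC (the interface name eth0 below is a placeholder for your actual interface):

```shell
# show the current offload settings that affect what tcpdump sees
ethtool -k eth0 | grep -E 'tcp-segmentation|generic-segmentation|generic-receive'

# turn off TSO/GSO/GRO so tcpdump shows real on-the-wire frame sizes
# instead of the NIC's reassembled super-frames
ethtool -K eth0 tso off gso off gro off
```

With those off, frame sizes in the capture should match what actually went over the wire.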
Adding a show-tech for this new state as well.
With this new state, a reboot of the client doesn't help. It's stuck! I'll probably have to reboot the IAP to clear it...
04-16-2017 08:25 AM
I've now read up on A-MPDU and learned that it uses Block Ack, which needs to be negotiated via ADDBA Requests in each direction. I've done new captures that cover everything from the client associating with the IAP until my iperf measurement starts.
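To pick the ADDBA handshake out of those captures I filter on Block Ack action frames. The field names here are Wireshark/tshark's, and the category/action code values are my reading of the 802.11 spec, so treat them as assumptions:

```shell
# Show only Block Ack action frames in a monitor-mode capture.
# Assumed values: category_code 3 = Block Ack;
# action codes 0/1/2 = ADDBA Request / ADDBA Response / DELBA.
tshark -r capture.pcap -Y 'wlan.fixed.category_code == 3'
```

In a working trace this should show an ADDBA Request/Response pair in each direction shortly after association.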
I now have three different states I'm analyzing:
"Good" - where everything works fine and I can Tx and Rx 200Mbps.
"40M" - where I can Tx about 40Mbps and Rx 200Mbps.
"80M" - where I can Tx about 80-100Mps. Forgot to test Rx.
Running on 220.127.116.11 code I'm mostly in "good" state, but have face a few situations where I've ended in 80M or 40M state.
When I'm in 40M, a reboot of the client resolves the issue.
When I'm in 80M, client reboot does not help, but rebooting the AP resolves the issue.
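Since the states differ per direction, both directions can be measured from the client without moving the iperf server, using iperf3's reverse mode (the server address is from my setup):

```shell
# uplink (client -> server), the direction that gets stuck at 40M
iperf3 -c 192.168.88.99 -t 30

# downlink (server -> client), via iperf3 reverse mode
iperf3 -c 192.168.88.99 -t 30 -R
```

That makes it quick to classify which state a client is in before digging into captures.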
Examining the new pcaps shows that in the "good" or "80M" state, there is a mutual Block Ack Request/Response negotiation (both IAP and client request BA):
And I can see that the actual iperf data is sent in chunks of 52 QoS Data frames that are Block Ack'ed (both "good" and "80M").
However, in the 40M state, which I easily end up in when running <=18.104.22.168 code, only the IAP is requesting BA; the client doesn't:
That explains why I can download at the full 200 Mbps but only upload at 40 Mbps when stuck in the "40M" state.
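As a sanity check on the magnitude, here's a toy airtime model - every constant below is a rough assumption, not a measurement. A lone MPDU pays the full per-transmission overhead (DIFS + backoff + preamble + SIFS + ack) on every frame, while a 52-frame A-MPDU pays it once and is answered by a single Block Ack:

```shell
# Toy airtime model: assumed 400 Mbps PHY rate, 1514-byte frames,
# ~200 us fixed cost per transmission, 52 MPDUs per A-MPDU.
awk 'BEGIN {
  phy_mbps    = 400       # negotiated PHY bitrate (= bits per microsecond)
  frame_bits  = 1514 * 8  # one full-size frame per MPDU
  overhead_us = 200       # assumed fixed per-transmission overhead
  n           = 52        # MPDUs per A-MPDU seen in the captures

  t_single = overhead_us + frame_bits / phy_mbps          # us per lone frame
  t_burst  = overhead_us + n * frame_bits / phy_mbps      # us per 52-frame burst

  printf "no aggregation:  %.0f Mbps\n", frame_bits / t_single
  printf "52-frame A-MPDU: %.0f Mbps\n", (n * frame_bits) / t_burst
}'
```

The absolute numbers are fiction, but the shape matches what I'm seeing: per-frame overhead alone pins the unaggregated case in the same ballpark as my 40M state, while aggregation pushes throughput toward the PHY rate.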
Googling ADDBA driver problems gave me this:
Where someone reports a similar problem: things work well for a period of time, but after that the NIC driver fails to establish a new BA session.
Given that, it feels like the 40M problem is mostly a client NIC driver problem, but that doesn't explain why it works so much better when running 22.214.171.124 code.
I also lack an explanation for what is causing the "80M" state, where the client *IS* using BA and transmitting at a 400M bitrate, but is only able to push 80 Mbps.
04-17-2017 04:08 AM
The client I've performed most tests with is an HP EliteBook 840 G3 with an Intel "Dual Band Wireless-AC 8260" WiFi NIC, running Windows 10 and driver version 126.96.36.199 from 2015-12-30.
I've tried upgrading to the latest drivers from HP (10.40.0.3P) and directly from Intel (10.50.1.5), but both just seemed to make things even worse. Using them I've never been able to transfer more than 80M, and when capturing the BA negotiation there are just a lot of re-transmissions, Add/Delete etc:
Please remember that I started seeing the problem at a customer site, which is using Lenovo ThinkPads, so it's not just my single client having the issue. I've also replicated the problem by booting Ubuntu on my HP laptop.