Thanks for the reply, I appreciate the tips.
I've been continuing tests since upgrading the firmware, and the packet loss has decreased compared to before. However, I still see packet loss on routed traffic and none on switched traffic.
I have been going through the configs closely and have cleared out a lot of cruft. No config changes made any noticeable difference.
I would like to bring the LACP membership down to one link, but I am worried about affecting production traffic. Since the issue is only seen during business hours, I haven't felt comfortable enough to do that.
I've tried a few other tests, but I've settled on UDP streams as the most reliable way to see the issue. For example, when I run a TCP test with iperf, it almost immediately reaches a throughput close to 1 Gbps. Since the packet loss is periodic, I don't want to push that much bandwidth continuously until it happens. Packet loss is also a normal part of TCP congestion control, so I would expect to see loss frequently anyway as the congestion window grows.
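For anyone who wants to reproduce a low-rate UDP test like this, the idea can be sketched in Python as below. This is only an illustration under assumptions, not the tool behind the results in this thread: the function names (send_probes, receive_probes, summarize_loss), the header layout, the packet size and the send interval are all hypothetical. Each datagram carries a sequence number, so the receiver can see which packets went missing and whether the loss came in bursts, without saturating the link.

```python
import socket
import struct
import time

# Hypothetical probe header: 32-bit sequence number + 64-bit send time (ns).
HEADER = struct.Struct("!IQ")


def send_probes(host, port, count, interval_s=0.01, payload_size=200):
    """Send `count` numbered UDP datagrams at a fixed, low rate."""
    padding = b"\0" * max(0, payload_size - HEADER.size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for seq in range(count):
            sock.sendto(HEADER.pack(seq, time.time_ns()) + padding, (host, port))
            time.sleep(interval_s)


def receive_probes(port, idle_timeout_s=5.0):
    """Collect sequence numbers until nothing arrives for idle_timeout_s."""
    seen = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        sock.settimeout(idle_timeout_s)
        while True:
            try:
                data, _addr = sock.recvfrom(65535)
            except socket.timeout:
                break  # stream has gone quiet; stop listening
            if len(data) >= HEADER.size:
                seq, _sent_ns = HEADER.unpack_from(data)
                seen.append(seq)
    return seen


def summarize_loss(received, sent_count):
    """Return (lost, loss_percent, bursts); bursts are (first_missing_seq, length)."""
    got = set(received)
    bursts = []
    start = None
    for seq in range(sent_count):
        if seq not in got:
            if start is None:
                start = seq  # a new run of missing packets begins here
        elif start is not None:
            bursts.append((start, seq - start))
            start = None
    if start is not None:
        bursts.append((start, sent_count - start))  # loss ran to the end
    lost = sum(length for _first, length in bursts)
    loss_percent = 100.0 * lost / sent_count if sent_count else 0.0
    return lost, loss_percent, bursts
```

Run receive_probes on a host behind one core and send_probes from a host behind the other, then pass the collected sequence numbers to summarize_loss. At the assumed defaults (200-byte payload every 10 ms) the stream is about 160 kbit/s before UDP/IP overhead, so it can run through business hours. The receiver gives up if nothing arrives within idle_timeout_s, so it needs to be started shortly before the sender.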
I simplified my picture, but I do also have a test running where the packet gets routed in 'B' and then sent through 'A' without routing. This test shows almost 0% packet loss. The core in 'A' and in 'B' both do L3 routing. Most VLANs only exist in one building and the core in the building has an SVI for the gateway. A few VLANs stretch between buildings for various reasons.
I do not have a dedicated point-to-point VLAN for routing between the cores. This is one config change I considered, but it would cause more packets to follow the problem path and I was worried about making the issue more noticeable to users.
A sanitized config for the core switch in 'A' is available at https://pastebin.com/1FSaUhbd
The 'B' config is very similar. Traffic arrives in 'A' over Trk11 from 'B'. Most of the routing table is built from RIP advertisements from the cores and a couple of edge routers. VLAN 2 is used for the routing and RIP traffic.