The classify-media ACL is what tells the IAP to inspect the matched traffic for control packets before a voice call.
Here is the user guide snippet explaining this:
"
Voice and video devices use a signaling protocol to establish, control, and terminate voice and video calls. These
control or signaling sessions are usually permitted using predefined ACLs. If the control signaling packets are
encrypted, the IAP cannot determine the dynamic ports that are used for voice or video traffic.
In these cases, the IAP has to use an ACL with the classify-media option enabled to identify the voice or video flow based on a deep packet inspection and analysis of the actual traffic. Instant identifies and prioritizes voice and video traffic
from applications such as Skype for Business, Apple Facetime, and Jabber.
Skype for Business uses Session Initiation Protocol (SIP) over TLS or HTTPS to establish, control, and terminate
voice and video calls. Apple Facetime uses Extensible Messaging and Presence Protocol (XMPP) over TLS or
HTTPS for these functions.
The following CLI example shows the media classification for VoIP calls:
(Instant AP)(config)# wlan access-rule example_s4b_test
(Instant AP)(example_s4b_test)# rule alias <domain_name_for_S4B_server> match tcp 443 443 permit log classify-media
(Instant AP)(example_s4b_test)# rule any any match tcp 5060 5060 permit log classify-media
(Instant AP)(example_s4b_test)# rule any any match tcp 5061 5061 permit log classify-media
(Instant AP)(example_s4b_test)# rule any any match tcp 5223 5223 permit log classify-media
(Instant AP)(example_s4b_test)# rule any any match any any any permit
(Instant AP)(example_s4b_test)# end
(Instant AP)# commit apply
"
The actual voice call runs over UDP ports and is automatically prioritized with a default value of 48. If you want to use a custom value instead, you also need an ACL rule that specifies the ToS value.
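As a rough sketch only, a custom-ToS rule could be added to the same access rule as in the guide's example. This assumes the `tos` option of the `rule` command is available on your Instant release. The UDP port range and the ToS value are placeholders, not values from the guide: fill them in for your voice application, and place the rule above the final catch-all permit.

```
(Instant AP)(config)# wlan access-rule example_s4b_test
(Instant AP)(example_s4b_test)# rule any any match udp <start-port> <end-port> permit tos <tos-value>
(Instant AP)(example_s4b_test)# end
(Instant AP)# commit apply
```

Check the CLI reference for your firmware version before applying this, since the available rule options can differ between releases.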