What is AES67?
AES67 is a standard developed by the Audio Engineering Society (AES) to ensure interoperability between different Audio over IP (AoIP) systems. It provides a common framework that lets devices from different manufacturers exchange audio over IP networks. This overview covers four aspects of how devices communicate in an AES67 system:
- Device control
- Connection management
- Transport
- Clock synchronization
1. Device Control
In an AES67 system, device control refers to the ability to manage and monitor audio devices over a network. It covers the protocols and methods that allow devices to be remotely configured, adjusted, and supervised, so that audio flows can be set up and maintained efficiently.
1.1 Control and Monitoring (SNMP)
AES67 itself does not define a control protocol, so remote configuration and monitoring rely on companion protocols. Remote control removes the need to physically adjust devices, especially when they are distributed over a large area. It is typically achieved through AES70 (Open Control Architecture), a separate AES standard for controlling audio devices, or through network management protocols like SNMP (Simple Network Management Protocol).
AES70 provides extensive control over devices, with access to detailed parameters such as sampling rate, gain settings, and operational controls such as channel management. With these features, engineers can ensure that all devices are operating optimally and can adjust settings at any time, even if the devices are distributed across multiple network nodes.
1.2 Device Discovery (SAP)
Device discovery is another important aspect of device control in an AES67 system. AES67 does not mandate a single discovery method. Commonly used options such as Bonjour (mDNS/DNS-SD) and SAP (Session Announcement Protocol) let devices find each other and their streams without manual intervention. With SAP, for example, a sender periodically multicasts a description of each stream it offers. New devices connected to the network can then be integrated quickly, allowing for rapid deployment and reconfiguration of audio streams.
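As a rough illustration of SAP-based discovery, the sketch below builds a single SAP announcement packet following the header layout in RFC 2974 (version 1, IPv4 origin, no authentication, no compression). The function name `build_sap_packet` and the example addresses are assumptions made for this illustration. Actually sending the packet to a multicast group on UDP port 9875 is left out.

```python
import socket
import struct
import zlib

SAP_PORT = 9875  # well-known SAP port (RFC 2974)


def build_sap_packet(origin_ip: str, sdp: str) -> bytes:
    """Build an unauthenticated, uncompressed IPv4 SAP announcement.

    Hypothetical helper for illustration; not part of any AES67 library.
    """
    payload = sdp.encode("utf-8")
    # First byte: version 1 in the top three bits, all other flag bits clear
    # (IPv4 origin, announcement, not encrypted, not compressed).
    flags = 1 << 5
    auth_len = 0  # no authentication data follows the header
    # 16-bit message ID hash: must change whenever the description changes.
    # A CRC of the payload is an assumption here; any stable hash would do.
    msg_id_hash = (zlib.crc32(payload) & 0xFFFF) or 1
    header = struct.pack(
        "!BBH4s", flags, auth_len, msg_id_hash, socket.inet_aton(origin_ip)
    )
    # Optional payload type, NUL-terminated, then the SDP text itself.
    return header + b"application/sdp\x00" + payload


packet = build_sap_packet("192.168.1.10", "v=0\r\ns=Example stream\r\n")
print(len(packet), packet[:8].hex())
```

A receiver would listen on the same multicast group, strip the 8-byte header and payload type, and hand the remaining SDP text to its session description parser.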
2. Connection Management
Connection management refers to how AES67 handles the setup, maintenance, and termination of audio streams on the network. It ensures that all devices can connect and transmit audio efficiently while maintaining high reliability and low latency.
2.1 Session Description (SDP)
To ensure that different devices on the network can interpret audio streams, AES67 relies on the Session Description Protocol (SDP). An SDP description carries the details a receiver needs, such as the encoding (for example, 24-bit linear PCM), sampling rate, channel count, packet time, and the IP address and port of the stream. This ensures that the sending and receiving devices agree on the format of the transmitted audio data.
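To make the session description concrete, here is a minimal sketch that extracts the stream parameters from an SDP text. The SDP values (addresses, session name, clock identity, offsets) are made up for illustration, and `parse_sdp` is a hypothetical helper that handles only the handful of lines shown, not the full SDP grammar.

```python
EXAMPLE_SDP = """
v=0
o=- 1311738121 1311738121 IN IP4 192.168.1.1
s=Stage left I/O
c=IN IP4 239.0.0.1/32
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 L24/48000/8
a=ptime:1
a=ts-refclk:ptp=IEEE1588-2008:39-A7-94-FF-FE-07-CB-D0:0
a=mediaclk:direct=963214424
"""


def parse_sdp(sdp: str) -> dict:
    """Pull the basic stream parameters out of a single-stream SDP text."""
    info = {}
    for raw in sdp.strip().splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip anything that is not a "<type>=<value>" line
        key, value = line[0], line[2:]
        if key == "c":
            # c=IN IP4 <address>[/<ttl>]
            info["address"] = value.split()[2].split("/")[0]
        elif key == "m":
            # m=<media> <port> <proto> <payload type> ...
            media, port, _proto, *formats = value.split()
            info["media"] = media
            info["port"] = int(port)
            info["payload_type"] = int(formats[0])
        elif key == "a" and value.startswith("rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<rate>[/<channels>]
            parts = value.split(" ", 1)[1].split("/")
            info["encoding"] = parts[0]
            info["sample_rate"] = int(parts[1])
            info["channels"] = int(parts[2]) if len(parts) > 2 else 1
        elif key == "a" and value.startswith("ptime:"):
            info["ptime_ms"] = float(value.split(":", 1)[1])
    return info


print(parse_sdp(EXAMPLE_SDP))
```

In this example the receiver learns that it should expect eight channels of 24-bit audio at 48 kHz, in 1 ms packets, on 239.0.0.1 port 5004.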
2.2 Stream Setup and Management (SIP)
AES67 uses SIP (Session Initiation Protocol) to set up and manage unicast streams, while multicast streams are commonly advertised with SAP. These protocols help devices initiate and maintain connections, allowing streams to be set up and adjusted as needed. In addition, audio streams can be added or removed dynamically without affecting overall network stability.
3. Transport
Transport refers to how AES67 manages the actual transmission of audio data over the network. AES67 prioritizes low latency and high-quality audio transmission to ensure professional-grade performance in applications such as broadcast and live sound.
3.1 Real-time Transport Protocol (RTP)
AES67 relies on the Real-time Transport Protocol (RTP), carried over UDP, to transport audio streams. RTP is designed for real-time media where low latency matters more than retransmission, so it does not guarantee delivery. Instead, each packet carries a sequence number, which lets the receiver detect lost or reordered packets, and a timestamp, which is key to keeping audio streams aligned between different devices.
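The sequence number and timestamp mentioned above sit in the fixed 12-byte RTP header defined in RFC 3550. The following sketch decodes that header and counts packets missing between two sequence numbers. The helper names are assumptions, and header extensions are deliberately ignored to keep the example short.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header (RFC 3550); header extensions are ignored."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    header_len = 12 + 4 * csrc_count  # each CSRC entry adds 4 bytes
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_len:],
    }


def missing_packets(previous_seq: int, current_seq: int) -> int:
    """Number of packets lost between two received sequence numbers.

    Sequence numbers are 16 bits wide, so the difference wraps at 65536.
    """
    return (current_seq - previous_seq - 1) & 0xFFFF


# Hand-built packet: version 2, payload type 96, sequence 65535, 6 audio bytes.
example = struct.pack("!BBHII", 0x80, 96, 65535, 48000, 0x12345678) + b"\x00" * 6
header = parse_rtp_header(example)
print(header["sequence"], header["timestamp"], len(header["payload"]))
```

A receiver that sees sequence 65535 followed by 1 can conclude that exactly one packet (sequence 0) went missing, even though the counter wrapped.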
3.2 Quality of Service (QoS)
To ensure that audio packets are given priority over other types of network traffic, AES67 employs Quality of Service (QoS) mechanisms. These mechanisms are typically implemented through DiffServ (Differentiated Services), which tags audio packets for priority treatment at routers and switches in the network. This reduces the chance of packet loss or delay, preventing audio quality from being disrupted.
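As a sketch of how a sender might tag its packets, the code below converts a DSCP value into the byte that goes into the IP header and applies it to a UDP socket. The two DSCP constants are the class assignments commonly cited for AES67 (EF for clock traffic, AF41 for audio), which is an assumption here and should be checked against the standard and the network's own QoS policy. `IP_TOS` behaves as shown on Linux and macOS; other platforms may ignore it.

```python
import socket

# Assumed defaults, commonly cited for AES67 deployments; verify before use.
DSCP_CLOCK = 46  # EF (Expedited Forwarding), used for PTP traffic
DSCP_MEDIA = 34  # AF41 (Assured Forwarding), used for RTP audio


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the former TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2  # the low two bits are reserved for ECN


def open_marked_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(DSCP_CLOCK)), hex(dscp_to_tos(DSCP_MEDIA)))
```

Marking packets only has an effect if the switches and routers along the path are configured to honor these values.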
3.3 Multicast and Routing (IGMP)
Multicast is another feature supported by AES67 that allows a single audio stream to be sent to multiple devices simultaneously. This is useful in large-scale sound reinforcement or broadcast environments where multiple devices need to receive the same audio signal. AES67 leverages standard network technologies such as IGMP (Internet Group Management Protocol), which lets receivers tell the network which multicast groups they want. Switches can then forward each stream only to the ports that requested it, optimizing bandwidth usage and reducing network load.
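From an application's point of view, joining a multicast stream is a socket option. The operating system sends the IGMP membership report on the application's behalf. The sketch below shows one way to do this with the standard socket API. The group address and port are placeholders, and `join_group` is a hypothetical helper.

```python
import socket
import struct


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(group: str, port: int, interface: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and subscribe it to a multicast group.

    The join makes the OS emit an IGMP membership report for the group.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group, interface),
    )
    return sock


# Placeholder group and port; a real receiver would take both from the SDP.
request = build_membership_request("239.69.1.1")
print(request.hex())
```

Calling `join_group("239.69.1.1", 5004)` on a host with a multicast-capable interface would return a socket ready for `recvfrom`. Closing the socket leaves the group again.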
4. Clock Synchronization
Clock synchronization is a critical component of any Audio over IP system, ensuring that all devices are in sync and audio streams remain aligned across the network. In AES67, time synchronization is achieved through the Precision Time Protocol (PTP) and media clock synchronization.
4.1 Precision Time Protocol (PTP)
AES67 uses the IEEE 1588-2008 Precision Time Protocol (PTP) for network-wide clock synchronization. This ensures that all devices in the network share the same time reference, which is critical for maintaining low jitter and low latency. In PTP, one device acts as the grandmaster clock. Every other device exchanges timestamped messages with it to measure both its own clock offset and the network path delay, then corrects its internal clock accordingly. This allows audio samples to be sent and received with precise timing.
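The core arithmetic of the PTP delay request-response exchange can be sketched in a few lines. Four timestamps are involved: the master sends Sync at t1, the slave receives it at t2, the slave sends Delay_Req at t3, and the master receives it at t4. The sketch assumes the path delay is the same in both directions, which is the assumption PTP itself makes. The function name and the sample values are illustrative only.

```python
def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> tuple[int, int]:
    """Return (slave clock offset, one-way path delay) in the input units.

    t1: master sends Sync            (master clock)
    t2: slave receives Sync          (slave clock)
    t3: slave sends Delay_Req        (slave clock)
    t4: master receives Delay_Req    (master clock)
    Assumes a symmetric network path.
    """
    master_to_slave = t2 - t1  # path delay plus offset
    slave_to_master = t4 - t3  # path delay minus offset
    offset = (master_to_slave - slave_to_master) // 2
    delay = (master_to_slave + slave_to_master) // 2
    return offset, delay


# Illustrative values in nanoseconds: the slave clock runs 100 ns ahead
# of the master and the one-way path delay is 500 ns.
offset, delay = ptp_offset_and_delay(t1=1000, t2=1600, t3=2000, t4=2400)
print(offset, delay)  # 100 500
```

The slave then subtracts the computed offset from its clock. Real implementations also filter these measurements over time and steer the clock frequency rather than stepping it.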
4.2 Media Clock Synchronization (48 kHz / 96 kHz)
In addition to network clock synchronization, AES67 also ensures that the audio data itself remains aligned through media clock synchronization. The media clock, which advances once per audio sample (for example, 48,000 times per second at 48 kHz), is derived from the shared PTP time, and RTP timestamps are counted in media clock ticks. As a result, the samples in each audio stream stay synchronized across all devices, without drift between senders and receivers.
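The relationship between network time and the media clock comes down to simple arithmetic, sketched below. The sketch assumes the media clock counts samples from the PTP epoch and that a per-stream offset (such as the one signaled in the SDP `a=mediaclk:direct=` attribute) is added before wrapping to the 32-bit RTP timestamp. The function names are hypothetical.

```python
RTP_TIMESTAMP_MODULUS = 2**32  # RTP timestamps are 32 bits wide


def rtp_timestamp(ptp_time_ns: int, sample_rate: int = 48000, offset: int = 0) -> int:
    """Convert a PTP time in nanoseconds into an RTP timestamp.

    Assumes the media clock ticks once per sample starting at the PTP epoch.
    """
    samples_since_epoch = ptp_time_ns * sample_rate // 1_000_000_000
    return (samples_since_epoch + offset) % RTP_TIMESTAMP_MODULUS


def samples_per_packet(ptime_ms: float, sample_rate: int = 48000) -> int:
    """Number of samples per channel carried in one packet."""
    return round(sample_rate * ptime_ms / 1000)


print(rtp_timestamp(1_000_000_000))      # one second after the epoch -> 48000
print(samples_per_packet(1, 48000))      # 1 ms packets at 48 kHz -> 48
print(samples_per_packet(1, 96000))      # 1 ms packets at 96 kHz -> 96
```

Because every device derives the same sample count from the same PTP time, two receivers can play out the sample with a given RTP timestamp at the same instant.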
Conclusion
AES67 provides a robust and reliable framework for Audio over IP systems, focusing on interoperability, quality, and flexibility. By standardizing connection management, transport, and clock synchronization, and by working alongside established device control protocols, AES67 enables different Audio over IP systems to work together, regardless of the manufacturer or the proprietary protocols each system uses natively. This standard plays an important role in professional audio environments, ensuring that high-quality, synchronized audio can be transmitted across complex network infrastructures without compromising performance.