Fundamentals of USB Audio

Categories: Computer

Published: Oct 23, 2019

USB Basics
USB Audio
What's a second between friends?
Multiple clock sources
Compliance and native support
Summary

USB, the Universal Serial Bus, has been around for decades and is a heavily used standard in the world of personal computers. Memory sticks, external drives, mice, and web cameras are all interfaced over USB. In this article we look into USB Audio: a standard for digital audio used in PCs, smartphones, and tablets to interface with audio peripherals such as speakers, microphones, or mixing desks. We set out to show how USB Audio works, what to watch out for, and how to use USB Audio for high fidelity multi-channel input and output.

USB Basics

USB is a protocol where the PC, the USB-host, initiates a transfer, and the device (for example a USB speaker) responds. Each transfer is addressed to a specific device, and to a specific endpoint on the device. IN-transfers send data to the PC. When the host initiates an IN-transfer the device has to respond with data for the host. OUT-transfers send data to the device. When the host performs an OUT-transfer it sends a packet of data that the device must capture. In the world of USB Audio, IN- and OUT-transfers transport audio samples: an OUT-transfer sends audio data from a PC to a speaker, whereas an IN-transfer sends audio data from a microphone to the PC.

There are four sorts of IN and OUT-transfers in USB: Bulk, Isochronous, Interrupt, and Control transfers.

A bulk transfer is used to reliably transfer data between host and device. All USB transfers carry a CRC (checksum) that indicates whether an error has occurred. On a bulk transfer, the receiver of the data has to verify the CRC. If the CRC is correct the transfer is acknowledged, and the data is assumed to have been transferred error-free. If the CRC is not correct, the transfer is not acknowledged and will be retried. If the device is not ready to accept data it can send a negative-acknowledgment, NAK, which will cause the host to retry the transfer. Bulk transfers are not considered time critical, and are scheduled around the time critical transfers discussed below.

Isochronous transfers are used to transfer data in real-time between host and device. When an isochronous endpoint is set up by the host, the host allocates a specific amount of bandwidth to the isochronous endpoint, and it regularly performs an IN- or OUT-transfer on that endpoint. For example, the host may OUT 1 KByte of data every 125µs to the device. Since a fixed and limited amount of bandwidth has been allocated, there is no time to resend data if anything goes wrong. The data has a CRC as normal, but if the receiving side detects an error there is no resend mechanism.

Interrupt transfers are used by the host to regularly poll the device to find out whether something worthwhile has happened. For example, a host may poll an audio device to check whether the MUTE button has been pressed. The name Interrupt transfer is slightly confusing, since these transfers do not interrupt anything. However, regular polling of data gives the same sort of functionality that a host interrupt would provide.

Control transfers are very much like bulk transfers. Control transfers are acknowledged, can be NAKed, and are delivered in a non-real-time fashion. Control transfers are used for operations that are outside normal data flow, such as querying the device capabilities, or endpoint status. An explanation on how device capabilities are described is outside the scope of this article, and we just state that there are predefined classes such as 'USB Audio Class' or 'USB Mass Storage Class' that enable cross platform interoperability.

All transfers are made in USB frames. High Speed USB frames (strictly speaking, microframes) span 125µs, Full Speed USB frames span 1ms, and each is marked by the host sending a Start-Of-Frame (SOF) message. Isochronous and Interrupt transfers are transmitted at most once a frame.

USB Audio

USB Audio uses isochronous, interrupt and control transfers. All audio data is transferred over isochronous transfers; interrupt transfers are used to relay information regarding the availability of audio clocks; control transfers are used to set volume, request sample rates, etc.

The data requirements of a USB-Audio system depend on the number of channels, the number of bits used to represent each sample, and the sample rate. Typical channel counts are 2 (stereo), 6 (5.1) or much higher for studio and DJ use. The typical sample size is 24 bits, although 16 bits is available for legacy audio, and 32 bits for high quality audio. Typical sample rates are 44.1, 48, 96, and 192kHz. The last of these is used for high quality audio.

Suppose that we design a stereo audio speaker with a 96kHz sample rate and 24-bit samples. In order to simplify data marshalling on host and device, 24-bit values are typically padded with a zero byte, so the total data throughput is 96,000 x 2 channels x 4 bytes = 768,000 bytes per second. The isochronous endpoints run at a rate of one transfer per 125µs; or 8,000 transfers per second. Dividing the required byte rate over the frame rate gives us the number of bytes for each isochronous transfer: 768,000/8,000 = 96 bytes per transfer.
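The arithmetic above can be sketched in a few lines of Python. The function name and its parameters are chosen here for illustration only; the default of 8,000 transfers per second assumes High Speed USB with one transfer every 125µs.

```python
def bytes_per_transfer(sample_rate_hz, channels, bytes_per_sample,
                       transfers_per_second=8000):
    """Average number of bytes each isochronous transfer must carry."""
    # Total audio payload per second, with 24-bit samples padded to 4 bytes.
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    # Spread that payload evenly over the transfers in one second.
    return bytes_per_second / transfers_per_second


# Stereo speaker, 96kHz, 24-bit samples padded to 4 bytes.
print(bytes_per_transfer(96_000, 2, 4))  # 96.0
```

Changing the arguments gives the figure for other configurations, for example six channels for a 5.1 system.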

When using CD rates, such as 44,100Hz, the data rate works out as 44.1 bytes per transfer (44,100 x 2 channels x 4 bytes / 8,000), which is 5.5125 stereo samples. In USB Audio each transfer always carries a whole number of samples, so transfers carry either 48 or 40 bytes (6 or 5 stereo samples), mixed in such a ratio that the average works out as 44.1 bytes per transfer.
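One way to choose whole-sample transfer sizes whose long-run average matches the sample rate is to carry a running remainder from one transfer to the next. The sketch below is an illustration under that assumption, not the scheme any particular host driver uses; the generator name is hypothetical.

```python
from itertools import islice


def samples_per_transfer(sample_rate_hz, transfers_per_second=8000):
    """Yield a whole number of samples (per channel) for each transfer."""
    remainder = 0
    while True:
        # Add one transfer's worth of "sample credit", then emit the whole
        # samples and keep the fractional part for the next transfer.
        remainder += sample_rate_hz
        count, remainder = divmod(remainder, transfers_per_second)
        yield count


# One second of transfers at the CD rate.
counts = list(islice(samples_per_transfer(44_100), 8000))
print(sorted(set(counts)))  # [5, 6]
print(sum(counts))          # 44100
```

Every transfer carries 5 or 6 stereo samples (40 or 48 bytes with padded 24-bit stereo samples), and over one second the counts add up to exactly 44,100.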

A single isochronous transfer can carry 1024 bytes, and can carry at most 256 samples (at 24/32 bits). This means that a single isochronous endpoint can transfer 42 channels at 48kHz, or 10 channels at 192kHz (assuming that High Speed USB is used - Full Speed USB cannot carry more than a single stereo IN and OUT pair at 48kHz).
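The channel counts quoted above follow from dividing the 256-sample limit by the number of samples each channel needs per transfer. The sketch below reproduces that division; the names are illustrative, and it deliberately leaves no headroom for a transfer that occasionally carries an extra sample per channel, which a real design may need to reserve.

```python
import math

# 1024 bytes per transfer / 4 bytes per (padded) sample.
MAX_SAMPLES_PER_TRANSFER = 256


def max_channels(sample_rate_hz, transfers_per_second=8000):
    """Upper bound on channels that fit in one isochronous endpoint."""
    # Samples per channel in one transfer, rounded up to a whole sample.
    per_channel = math.ceil(sample_rate_hz / transfers_per_second)
    return MAX_SAMPLES_PER_TRANSFER // per_channel


print(max_channels(48_000))   # 42
print(max_channels(192_000))  # 10
```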

When transmitting digital audio, latency is introduced. In the case of High Speed USB this latency is 250µs. A packet of data is transferred once in every 125µs window, but given that it may be sent at any time in this window a 250µs buffer is required. On top of this 250µs delay, extra delay may be incurred in the O/S driver, and in the CODEC. Note that Full Speed USB has a much higher intrinsic latency of 2ms, as data is only sent once in every 1ms window.

What's a second between friends?

The big issue in digital audio is to agree on a common notion of time. Above we have defined USB frames to be transferred 8,000 times per second, and set the speakers to play a sample 96,000 times per second. This will only work if the speaker and the host agree on the length of a second. USB-Audio offers three modes that ensure that the host and the speaker agree on timings:

In synchronous mode, the length of a second is defined by the host. That is, the host sends data at a given rate, and the device has to match that rate exactly.
In asynchronous mode it is the other way around: the device sets the definition of a second, and the host has to match the device.
In adaptive mode the data flow determines the clock.

Adaptive and synchronous mode are not ideal because PCs are notoriously bad at keeping a stable clock, and there are often other audio sources involved, such as an external digital deck. Asynchronous mode enables external clock sources to be used as the master, or a low-jitter clock in the device. Typically, either relies on a crystal based PLL.

Hence there are at least two separate clocks in the system, the USB clock with a host driven frequency of 8,000 transfers per second, and a sample clock with an externally driven sample rate of, for example, 96,000 Hz.

These clocks will have subtly different frequencies, and the difference will vary slightly over time. Hence the average number of audio samples per frame will be slightly more or less than the expected rate. For example, in the case of our 96,000Hz sample rate, the average number of samples per frame may be 12.001. In order to ensure that the host sends the right amount of data, and not too much or too little, the host requests the current sample rate over a dedicated feedback endpoint. Every few milliseconds the average sample rate over the last period is reported back as a 16.16-bit fixed-point number. If the last period averaged out as 12.001 samples per frame, then the value 0x000C0041 will be reported (65536 x 12.001, truncated to an integer).
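The 16.16 fixed-point encoding can be checked with a short Python sketch. The helper names are made up for this illustration, and truncation (rather than rounding) is assumed because it reproduces the value quoted in the text.

```python
def to_16_16(samples_per_frame):
    """Encode a samples-per-frame average as 16.16 fixed point (truncating)."""
    # The upper 16 bits hold the integer part, the lower 16 bits the fraction.
    return int(samples_per_frame * 65536)


def from_16_16(value):
    """Decode a 16.16 fixed-point value back to samples per frame."""
    return value / 65536


value = to_16_16(12.001)
print(f"0x{value:08X}")  # 0x000C0041
print(value >> 16)       # 12  (whole samples per frame)
print(value & 0xFFFF)    # 65  (fraction, in units of 1/65536)
```

Decoding 0x000C0041 gives roughly 12.00099 rather than exactly 12.001, because the fraction 65/65536 is the closest value below 0.001 that 16 fractional bits can express.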

Given this average rate, the host can work out when to send an extra sample in a transfer; in this example 8 transfers each second will carry one extra sample (0.001 samples per frame x 8,000 frames). In addition to this, the host can use this value to synchronise itself with the audio device. This enables host applications such as a DVD player to keep the video in sync with the audio. If it didn't, the audio and video would slowly drift apart, and after two hours the audio would be more than half a second out (8 samples per second over 7,200 seconds is 57,600 samples, or 0.6 seconds at 96kHz).
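A host could turn the reported 16.16 value into per-transfer sample counts by accumulating the fractional part and sending an extra sample whenever it overflows. The following is a minimal sketch of that idea with a hypothetical function name; it is not taken from any real USB stack.

```python
def samples_for_next_transfer(accumulator, feedback_16_16):
    """Return (samples to send now, updated fractional accumulator)."""
    accumulator += feedback_16_16
    # Whole samples go out in this transfer; the fraction is carried over.
    return accumulator >> 16, accumulator & 0xFFFF


# Simulate one second (8,000 transfers) with the feedback value 0x000C0041.
acc = 0
counts = []
for _ in range(8000):
    n, acc = samples_for_next_transfer(acc, 0x000C0041)
    counts.append(n)

print(sum(counts))       # 96007
print(counts.count(13))  # 7
```

Because 0x000C0041 is the truncated encoding of 12.001, its fraction is slightly below 0.001, so this simulation sends 7 extra samples in its first second instead of the nominal 8; the leftover fraction stays in the accumulator and is not lost.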

In order to keep a short feedback loop, the trick is to not buffer audio packets and feedback packets unnecessarily. Any additional buffering creates latency in the reporting, and this latency makes it more difficult to keep a smooth flow of traffic. This means that the low-level USB stack and the USB-Audio stack should be tightly integrated, without buffering in between. Although this is hard to achieve on an application processor, this is quite easy to achieve if the software is implemented on an embedded processor that has a predictable execution time.

Multiple clock sources

The above scheme considers just two clock sources: either the USB device provides the clock, or the host provides the clock. In more complex devices such as mixing desks there may be other devices that provide the sample rate, for example through a digital interface such as ADAT or S/PDIF, or through a BNC connector that carries the word clock. For systems like this, the USB-Audio standard allows designers to put a clock selector in the device.

The clock selector states which clock is to be used as the sample rate. The clock selector has multiple input clocks (e.g., the incoming clock on an S/PDIF connection, a local crystal, and the incoming clock on an ADAT connection), and with a control transfer the user selects which clock to use as input, for example the incoming clock of the S/PDIF connection.

Compliance and native support

Once a device is USB-Audio Class compliant, it will integrate neatly into the operating system. Figure 3 shows a screenshot of the controls of a USB-Audio device plugged into Mac OS X. It shows that the clock selection, sample rate selection, channel volume control, and mute control are all controllable just like for any other audio device.

Figure 3: An interoperable device appears in a standard O/S dialog (Mac OS X in this example), and the O/S can set volume, sampling rate, etc.

Compliance to the standard makes the device interoperable. O/S vendors can supply a single USB-Audio driver that drives a multitude of devices, with a multitude of capabilities.

Indeed, the same USB-Audio implementation can be parameterised to implement a different number of channels, and the same driver can be used to interface to the device.

Summary

USB-Audio Class 2.0 takes advantage of High Speed USB 2.0, enabling low latency transfer of audio between PC and a connected audio device. The high throughput of High Speed USB 2.0 can be utilised to deliver many audio channels, and with high audio quality. The USB-Audio Class standard caters for a wide range of devices, from complex mixing desks with many channels, multiple clock sources and complex controls, to surround sound systems, PC speakers and microphones.

This essay was reviewed by

Alex Wood

Cite this Essay

Fundamentals of USB Audio. (2018, October 23). GradesFixer. Retrieved July 11, 2025, from https://gradesfixer.com/free-essay-examples/fundamentals-of-usb-audio/
