Livestreaming 101: Audio

Whether you're trying to solve a particular audio problem, or just need a refresher, this article is for you.

Written by Gabriel Young
Updated over a week ago

Audio can be a complex world, in some ways even more so than video. In this article, we will aim to demystify audio, equipping you with a deeper understanding, recommendations for best practices, and a variety of troubleshooting steps so that you can navigate any new situation that might come your way.

Overview

Audio is often not the first thing that comes to mind when one thinks about livestreaming, yet humans can tolerate poor-quality video far better than we can tolerate poor-quality audio. If the video is bad, we can focus through it and still glean from the content; if the audio is bad, most people will be unable to effectively hear, process, and retain what is being presented. Therefore, when improving your live stream, it is often best to focus your time, energy, and resources on your audio rather than your video.

At a high level, audio for a live stream involves three critical things:

  • something (or things) to produce the sound

  • something (or things) to capture the sound and translate it into an audio signal

  • something (or things) to convert the signal into a format suitable for streaming

Each of these things is present in every single live stream that contains audio, though it's important to recognize that sometimes one physical object might accomplish two of these three steps. For example, a guitar with an instrument pickup both produces and captures sound, and a USB microphone both captures and converts sound.

Capturing Audio

Sound is vibration or slight oscillations in pressure. Our ears are equipped with some very delicate apparatus to detect these pressure changes in the air and send them to the brain as tiny electrical impulses. Microphones work in much the same way, in that they use one of several different types of very delicate pressure sensors to detect air pressure changes, and they send out electrical impulses that represent the sounds they detect. Instrument pickups also work by detecting vibrations, though they aren't detecting vibrations in the air; rather, they're detecting vibrations in the body of the instrument, or (in the case of magnetic pickups for guitars) they're detecting vibrations in the magnetic fields of the pickup and the strings. In any case, all these vibrations are converted to faint analog electrical signals. The route that the audio signal takes from its source through your system and toward your audience is referred to as the signal path.

Converting Audio

All the electrical impulses from the above sound-capturing apparatus are analog signals, but they must all be converted to digital signals in order to be streamed. Converting from analog to digital is always accomplished by an analog-to-digital converter or ADC. Exactly where this ADC is located, however, varies. Sometimes, the ADC is a discrete piece of hardware that takes in an analog signal (typically through XLR connectors or 1/4" TRS phone connectors) and outputs a digital signal (through USB); the Scarlett 2i2 from Focusrite is one of the best-known examples of such devices. Other times, the ADC is actually built into the capture device, such as in the case of a USB microphone. Still other times, the ADC is built into the soundboard or the streaming encoder/computer.

If you aren't sure where the conversion is happening in your system, look for the point where XLR and/or 1/4" cables stop and USB, HDMI, SDI, or Ethernet cables start. In most cases, this change of cable types will happen right where the conversion from analog audio to digital audio happens.

This conversion is accomplished through a process called sampling, whereby the ADC receives an analog signal and takes a certain number of samples or snapshots of that signal every second, which can then be encoded digitally. The result is that a continuous flow of analog signal is broken into many discrete instants of digital signal. This is very similar to how video is composed of many discrete instants called frames, and those frames are displayed in such rapid succession that our brains blend them all together into the appearance of smooth motion. In fact, the biggest difference between audio samples and video frames is that audio is sampled at a much higher rate than video; while a typical video might have 30 frames per second, the audio that accompanies that video would likely have 48,000 samples per second!

The rate at which samples are taken by the ADC is called (fittingly) the sample rate, and it will typically be listed in hertz (Hz) or kilohertz (kHz, or sometimes just k). As a unit of measure, hertz simply means per second, so a sample rate of 48 kHz means 48,000 samples per second. Higher sample rates produce a more detailed representation of the audio being captured but also can be more difficult for other parts of the system to process. The most common sample rates are 44.1 kHz and 48 kHz.
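
If it helps to see sampling in action, here is a minimal sketch in Python (using NumPy) that generates one second of a test tone at 48 kHz. The 440 Hz frequency and one-second duration are arbitrary values chosen for illustration, not anything your equipment requires.

```python
import numpy as np

SAMPLE_RATE = 48_000  # samples per second (48 kHz), a common rate for video work
DURATION = 1.0        # seconds of audio to generate
FREQUENCY = 440.0     # Hz; an arbitrary test tone

# Each element of `t` is the instant at which one discrete sample is taken.
t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE

# The continuous pressure wave, evaluated only at those discrete instants.
samples = np.sin(2 * np.pi * FREQUENCY * t)

print(f"{len(samples)} audio samples vs. 30 video frames in the same second")
```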

Processing Audio

Processing is a broad term that covers many different types of modifications that might be made to an audio signal, typically with the goal of enhancing or clarifying certain qualities. Equalization (EQ), application of effects, compression, and gating are all examples of processing commonly performed on audio signals. Audio processing is easy to learn but hard to master, and far too deep a topic for us to cover fully here; fortunately, a lot of great material has been created for this topic and is freely available online! If you're looking to deepen your skills in this area, we recommend you start with this seminar from Churchfront or one of the excellent free courses from MxU.

One particular aspect of processing that we will cover here is normalization. Normalization is the process of determining the average amplitude of an audio recording and then modifying it through gain until the average amplitude hits a specified target. Normalization can be applied by your digital audio processor in-house, but it is also applied by the service that is receiving and hosting your stream. This is a way for that platform to ensure that all the content the platform hosts is relatively close to the same volume, which leads to a much better experience for the audience on that platform. For example, if Spotify didn't apply any normalization to the music on their platform, you would likely have to adjust the volume every time a new song came on just to have a consistent listening experience! Like those other services, Subsplash applies normalization to content hosted on our platform; specifically, we apply loudness normalization in keeping with EBU R 128, which means that all audio hosted on our platform is normalized to -23 LUFS. This helps the end product be clear and predictably loud for your audience.
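
To make the gain math concrete, here is a rough sketch in Python (using NumPy). Note that it measures plain RMS level, which is only a stand-in: true EBU R 128 loudness is measured in LUFS using the K-weighting and gating defined in ITU-R BS.1770, and this sketch is not how the Subsplash pipeline is implemented.

```python
import numpy as np

def normalize_rms(audio: np.ndarray, target_db: float = -23.0) -> np.ndarray:
    """Scale `audio` so its RMS (average) level lands at `target_db`.

    NOTE: plain RMS is only a stand-in here. True EBU R 128 loudness is
    measured in LUFS with K-weighting and gating (ITU-R BS.1770).
    """
    rms = np.sqrt(np.mean(audio ** 2))
    if rms == 0:
        return audio                      # silence: nothing to normalize
    current_db = 20 * np.log10(rms)       # current average level in dB
    gain_db = target_db - current_db      # gain needed to reach the target
    return audio * 10 ** (gain_db / 20)   # apply that gain as a linear factor

# Example: a quiet test tone gets boosted up toward the -23 dB target.
quiet = 0.01 * np.sin(2 * np.pi * 440 * np.arange(48_000) / 48_000)
louder = normalize_rms(quiet)
```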

Best Practices for Audio Systems

General Equipment Maintenance

It's important to regularly check the condition of your audio equipment. Audio is very susceptible to interference from malfunctioning equipment, and proactive maintenance and care are key to avoiding issues. Here are some specific things to look out for:

  • Microphones should be wiped clean regularly. Mics are delicate, and can easily be damaged by heavy use or rough handling. Become familiar with the maintenance and repair processes for your microphones and have replacement parts on hand.

  • Instrument pickups can become dirty and may need cleaning occasionally.

  • Cables can become frayed or damaged, and their connectors can become bent or tarnished, especially if they are moved frequently (such as in the case of guitar cables). Become familiar with the methods of replacing cable connectors and have backup cable connectors and cables on hand.

  • Any devices forming part of the signal path should not be hot to the touch. If they are, it could indicate existing degradation or serve as an early warning of degradation to come. You can often avoid or delay this by repositioning the device, installing a cooling fan, or simply remembering to power the device down when not in use.

  • Any device on the signal path should be plugged into clean power (i.e., power that has been conditioned by a quality power conditioner), as audio equipment is particularly vulnerable to power irregularities.

  • In the same vein, try to keep your audio equipment as isolated from your building's other power consumers as possible. In particular, your lighting equipment should be on a different breaker than your audio equipment, as those two systems sharing a breaker will almost certainly affect the quality of your audio.

  • If you use wireless equipment (such as wireless mics, instrument packs, or in-ear monitors), carefully perform radio frequency spectrum management to ensure that each of your devices is operating on its own frequency and that each device is free of interference from outside sources.

  • Most devices that handle digital audio feature onboard computers and will likely need firmware updates over time. Become familiar with the update process for each device and create a strategy for staying aware of and applying updates.

Audio System Design

  • In general, simpler is better. The more converters, extenders, and cables through which your signal path passes, the more likely you are to introduce audio issues and the more difficult it will be to determine where issues originate.

  • While it is not always possible, creating a separate audio mix for your in-house audio and your stream audio can be greatly beneficial. The ideal mix for your venue's audio will be uniquely shaped by the acoustic characteristics of that venue; the ideal mix for the live stream will inevitably differ because the venue's acoustics aren't as significant a factor. Separating the two mixes allows your in-house output to be mixed much differently than your stream mix, which in turn allows you to ensure that each mix sounds great without compromising one or the other.

Troubleshooting Audio Issues

General Methodology

Because audio systems rely on multiple lengthy signal paths, narrowing down the precise location of an issue can be difficult without the right methodology. We recommend the following approach:

  • Start at the origin of the signal path (the audio capture device) and swap it out with an equivalent device, then test the system. If the audio issue persists, then you know that the original audio capture device was not at fault. Restore the system to its original configuration.

  • Go one step down the signal path to the cable that connects the audio capture device to the rest of the system and swap it out with an equivalent cable, then test the system again. If the audio issue persists, then you know that neither the audio capture device nor its cable was at fault. Restore the system again.

  • Proceed down the signal path one step at a time, swapping out components and performing tests to the best of your ability. Eventually, you'll identify which component was at fault, and can then try to determine the specific nature of the fault.

Common Issues

Unexpectedly Quiet Audio

Check Your Gain Structure

Take some time with the worship team to ensure that your gain structure is optimized. At a basic level, this means setting all of the faders on your mixer to 0 and using the gain settings to add or subtract amplitude until each channel is at roughly the same level. A good way to approach this is to raise the gain on each channel until the loudest peaks in the audio are just barely maxing out the meter, and then lower the gain by about 15 dB. Repeat this for every channel to have a consistent gain structure.
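
If it helps to see the decibel arithmetic behind that 15 dB of headroom, here is a small sketch in Python (using NumPy). It models a digital signal's peak level rather than your console's actual gain stage, but the math is the same.

```python
import numpy as np

def db_to_linear(db: float) -> float:
    """Convert a gain change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)

def peak_dbfs(audio: np.ndarray) -> float:
    """Peak level in dB relative to full scale (0 dBFS = the meter's max)."""
    return 20 * np.log10(np.max(np.abs(audio)))

# A channel whose loudest peaks just barely hit full scale (0 dBFS)...
channel = np.sin(2 * np.pi * 440 * np.arange(48_000) / 48_000)
print(f"peak before: {peak_dbfs(channel):+.1f} dBFS")  # ~ 0.0 dBFS

# ...then backed off by 15 dB to leave headroom, as described above.
channel = channel * db_to_linear(-15.0)
print(f"peak after:  {peak_dbfs(channel):+.1f} dBFS")  # ~ -15.0 dBFS
```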

Check for Phase Cancellation

Another possible factor here is a phenomenon called phase cancellation, which causes sounds from certain devices to partially or fully cancel out the sounds from other devices. As we know, sound is an oscillation in air pressure (moving continuously from high pressure, to low pressure, to high pressure, and so on) that propagates outward from its origin as a wave; therefore, if you have two sound producers in a room, you will have two sets of pressure waves in that room which interact with each other.

Now imagine a simple scenario where those two sound producers are producing an identical sound, and you’ve set up a microphone in the room to capture that sound. There would be certain places in that room where the mic would be receiving high-pressure waves from both sound producers at the same time; we would say that these two sound sources are in phase with each other relative to the microphone, because their patterns of high-pressure and low-pressure waves are lining up perfectly. The result is that the mic would pick up a crisp, clear, and loud sound.

But there would also be certain places in that room where the mic would be receiving sound from one source at the peak of its pressure wave at the exact same moment that it would be receiving sound from the other source at the bottom of its pressure wave; we would say that these two sound sources are out of phase with each other relative to the microphone, because their pressure waves are perfectly opposing each other. In this situation, the low pressure from one source would perfectly cancel out the high pressure from the other source, leading to the microphone not detecting any sound at all.

In a real-world scenario, things are a lot more complex: there are usually far more than two sound producers in a venue, none of which will be producing exactly the same sounds. Similarly, there are often many microphones, none of which are likely to sit at perfectly in-phase or out-of-phase points. Most of the time, phase cancellation will manifest as certain frequencies seeming more muffled than others. The first step toward fixing this should be trying to reposition your affected mics. If that doesn’t eliminate the issues and your soundboard has polarity inversion switches (often incorrectly called phase inversion switches but always labeled ∅), flipping that switch for the affected microphone(s) may improve things, but for more complicated situations you may need a more precise phase alignment tool.
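
Here is a small numerical version of the thought experiment above, written in Python (using NumPy): the same tone arrives twice, once perfectly in phase and once delayed by half a cycle. The 1 kHz tone is an arbitrary choice for illustration.

```python
import numpy as np

SAMPLE_RATE = 48_000
FREQ = 1_000.0  # Hz; an arbitrary test tone
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE

source = np.sin(2 * np.pi * FREQ * t)

# In phase: the second arrival lines up perfectly, so the sum is twice as loud.
in_phase = source + np.sin(2 * np.pi * FREQ * t)

# Out of phase: delayed by half a cycle, so pressure peaks meet pressure troughs.
half_period = 0.5 / FREQ
out_of_phase = source + np.sin(2 * np.pi * FREQ * (t - half_period))

print(f"in-phase peak:     {np.max(np.abs(in_phase)):.3f}")     # ~ 2.000
print(f"out-of-phase peak: {np.max(np.abs(out_of_phase)):.3f}") # ~ 0.000
```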

Check for Polarity Cancellation

Similar to phase cancellation is polarity cancellation, which causes the electrical signals that represent captured sound to be canceled due to being in direct opposition to each other. While phase cancellation happens in infinitely variable degrees, polarity cancellation is absolute (i.e., it is either happening or not). An audio signal represents variations in sound pressure through variations in voltage, and the polarity of that signal describes whether it represents high pressure with high voltage and low pressure with low voltage, or high pressure with low voltage and low pressure with high voltage. If two mics are picking up the same sound but have differing polarities, or if two speakers are producing the same sound but have differing polarities, the sound will be canceled out by the polarity differences. If your soundboard has polarity inversion switches (often incorrectly called phase inversion switches but always labeled ∅), flipping that switch for the affected microphone(s) will solve the issue. If your soundboard does not have such switches, you’ll need to purchase a polarity inverter for each affected microphone’s signal path.
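
Because inverting polarity is just flipping the sign of the signal, polarity cancellation is easy to demonstrate numerically. A minimal sketch in Python (using NumPy):

```python
import numpy as np

t = np.arange(48_000) / 48_000
mic_a = np.sin(2 * np.pi * 440 * t)  # a mic wired with normal polarity
mic_b = -mic_a                       # the same sound through reversed polarity

# Summed in the mixer, the two signals cancel completely:
print(np.max(np.abs(mic_a + mic_b)))   # 0.0

# Flipping the polarity (∅) switch inverts one signal back into agreement:
print(np.max(np.abs(mic_a + -mic_b)))  # 2.0
```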

Audio Artifacts

Audio artifacts (also called sonic artifacts) are any undesired sounds in your audio resulting from problems that happened while capturing, recording, or processing the sound. The most common audio artifacts that Subsplash customers experience are clipping and feedback.

Clipping

Clipping is the result of an audio signal that is too powerful for the limitations of the systems through which it is passing. This can occur in both analog and digital contexts, and it almost always causes undesirable distortion in the audio. In nearly all cases, audio mixers include built-in meters (or sometimes simple indicator lights) that inform the operator when a given audio input is clipping. If you observe any clipping on any of your audio inputs, try to reduce the amount of signal being received by the mixer by adjusting the device capturing the sound or lowering the gain on the channel.
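
In digital terms, clipping is what happens when samples are forced to stay inside the system's ceiling. A minimal sketch, assuming a digital system whose full-scale range is ±1.0:

```python
import numpy as np

t = np.arange(48_000) / 48_000
hot_signal = 1.5 * np.sin(2 * np.pi * 440 * t)  # 1.5x too strong for the system

# A digital system can't represent anything beyond its ceiling (here +/-1.0),
# so the tops and bottoms of the wave are flattened off -- that's clipping.
clipped = np.clip(hot_signal, -1.0, 1.0)

print(f"intended peak: {np.max(np.abs(hot_signal)):.2f}")  # 1.50
print(f"clipped peak:  {np.max(np.abs(clipped)):.2f}")     # 1.00
```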

Feedback

Feedback describes a situation where audio at a certain frequency is captured by a microphone and output by a nearby speaker, then the output from the speaker is picked up by the microphone and is output by the speaker again. This never-ending loop of input and output continues to build on itself in intensity until the resulting sound is unbearable. Mute the mic, turn it down, or move it away from the speaker to eliminate feedback.
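
A toy model makes the runaway behavior clear: each trip around the speaker-to-mic loop multiplies the level by the loop's gain, so any gain above 1.0 grows without bound. Real feedback is frequency-dependent and more complicated than this, but the principle is the same. A sketch in Python:

```python
def simulate_feedback(loop_gain: float, passes: int = 8) -> list[float]:
    """Track a sound's level as it loops from speaker back into the mic.

    `loop_gain` is how much the level is multiplied on each trip around
    the loop; values above 1.0 cause runaway feedback.
    """
    level = 1.0
    levels = []
    for _ in range(passes):
        level *= loop_gain
        levels.append(round(level, 3))
    return levels

print(simulate_feedback(1.2))  # grows every pass -> runaway squeal
print(simulate_feedback(0.8))  # dies away every pass -> no feedback
```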

Audio/Video Desync

Audio/video desync (or more simply audio desync) describes audio and video that aren’t aligned with each other properly, causing audio to be heard either before or after the corresponding video. This is most often observed while watching a video of someone speaking, as we can very easily detect when lip movements don’t seem to correspond to the words we’re hearing at that instant. Audio desync is typically caused by some sort of delay in either the audio or the video signal path. Many of these delays are inherent to the signal paths and can’t be changed without also redesigning the signal path entirely, but some delays can be fixed and most can be mitigated with the following steps:

  1. Ensure that your signal path is not unnecessarily long or convoluted. Longer signal paths equate to more delay.

  2. Perform general maintenance procedures on the devices in the signal path.

  3. If your signal path includes a computer, the computer could be overloaded. Monitor your computer’s performance to see if adjustments need to be made.

    1. If your audio or video enters your computer through one or more USB devices, ensure each device is connected to a high-speed USB port. Connecting to USB hubs is not recommended, as they tend to introduce latency in the connection.

  4. Any further desync should be mitigated by trial and error using your encoder’s built-in function for adjusting the offset of the audio relative to the video (the sketch after this list illustrates what such an offset does to the audio).
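
To illustrate what such an offset actually does, here is a sketch in Python (using NumPy) that slides audio samples relative to video by padding or trimming. The function name is hypothetical and your encoder's internal implementation will differ; the 120 ms offset is just an example value.

```python
import numpy as np

SAMPLE_RATE = 48_000

def apply_audio_offset(audio: np.ndarray, offset_ms: float) -> np.ndarray:
    """Shift audio relative to video by offset_ms (positive = delay audio).

    This function name is ours, not any encoder's API; encoders do this
    internally, but the effect is the same: audio samples slide earlier
    or later against the video frames.
    """
    shift = int(SAMPLE_RATE * offset_ms / 1000)
    if shift > 0:  # audio arrives too early: delay it with leading silence
        return np.concatenate([np.zeros(shift), audio])[: len(audio)]
    if shift < 0:  # audio arrives too late: advance it by trimming the start
        return np.concatenate([audio[-shift:], np.zeros(-shift)])
    return audio

# Example: delay the audio by 120 ms to line up with late-arriving video.
audio = np.sin(2 * np.pi * 440 * np.arange(SAMPLE_RATE) / SAMPLE_RATE)
aligned = apply_audio_offset(audio, offset_ms=120.0)
```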

Finally, if you’re not seeing desync while your stream is live but you are seeing desync in the recording after the stream is done, the cause is usually a transcoding error introduced by problematic encoding during the live stream, most often a variable frame rate. This can typically be traced back to encoder overload, and certain steps can be taken to mitigate the issue at the encoder level, but those steps are outside the scope of this article beyond a general advisement to make sure that your encoding settings match our supported ones, found here. If you are still running into this issue after checking your encoding settings, please reach out to the support team!

Echo

Echo is almost always caused by sound getting captured twice at slightly different times. Many cameras and computers have onboard microphones which might be inadvertently introducing another audio source into your stream that your soundboard operator will never see. If you are hearing an echo in your stream, make sure that any such rogue microphone is muted, either at the source or at the encoder.

Glossary

Signal Path: the entire path of an audio signal, from source (microphone, instrument, vocalist, etc.) to receiver.

Signal Chain: (see signal path)

Gain: the amount of amplification applied to an audio signal.

Gain Structure: the overall pattern of gain applied individually to each audio signal in a mixer.

Phase: the progress of a sound wave through its cycle at a particular moment.

In Phase: the condition of two or more distinct sound waves being at the same phase simultaneously.

Out of Phase: the condition of two or more distinct sound waves being at opposite phases simultaneously.

Phase Cancellation: the result of two or more sound waves canceling each other due to being out of phase.

Polarity: the orientation of an audio signal’s voltage relative to the sound pressure it represents.

Polarity Cancellation: the result of two or more audio signals canceling each other out due to having opposing polarities.

Audio Artifact: an undesired sound captured while recording or streaming or introduced by processing and editing.

Sonic Artifact: (see audio artifact)

Clipping: the result of an audio signal’s amplitude exceeding the limitations of one of the signal path’s components.

Feedback: the result of an endless loop between an audio input and an audio output.

Audio/Video Desync: the condition of an audio signal being either ahead or behind a corresponding video signal.

Audio Desync: (see audio/video desync)

Sampling: the process of representing a continuous analog signal as a series of discrete digital samples.

Sample Rate: the rate at which samples are taken of an analog signal.

Analog-to-Digital Converter: a device that converts one or more analog signals into one or more digital signals by sampling.

Digital Audio Interface: a device that houses one or more analog-to-digital converters and makes their digital output available to a computer (typically via USB), often alongside secondary functions.

Audio Normalization: the application of gain to bring an audio signal’s average amplitude to a specified target level.
