Introduction
Adaptive bitrate streaming (ABR or ABS) is a technology designed to stream files efficiently over HTTP networks. Multiple versions of the same content, encoded at different sizes, are offered to a user’s video player, and the client chooses the most suitable version to play back on the device.
It was designed to improve streaming by delivering the right content under any circumstances, considering the specific device and the specific network condition, decreasing the need for rebuffering. ABR streaming allows video players to choose the best available video segment (or chunk) to play based on the available bitrate and device capability.
The ABR Ladder is a metaphor. It refers to the array of segments of different quality and resolution which are available from the streaming server.
If the available bandwidth rises, the logic in a video player can choose to play a larger file with better quality, thus climbing up the ladder. If the bandwidth drops, the player can switch back to a lower quality rendition, climbing back down the ladder.
If you analyze your ABR Ladder and design it to stream files efficiently over HTTP networks, with careful consideration of device capability and network condition, it can decrease rebuffering and improve the user experience.
Background
ABR streaming improved on previous streaming models by adjusting the stream to match the current available bitrate and changing transport conditions. This is particularly helpful for streaming on mobile networks.
Network data rates are measured in bits per second, known as the bitrate. Adaptive bitrate streaming dynamically tracks CPU, memory capacity, and network conditions, and then delivers video quality to match.
The source video is encoded at varying bitrates on the server side, and those video files are then divided into small segments. Segment length can vary, but segments typically run between one and ten seconds. Some providers then break those segments down into even smaller parts to stream them over parallel sessions.
Most modern video players support ABR. The logic in the player on a user’s device is able to choose among all the segments offered in the video’s manifest file, which are encoded at increasing bitrates. The player chooses the segments that best match the bandwidth available to the user’s device at that moment.
The player starts by requesting the lowest bitrate segments that are offered.
If the player determines that the download speed exceeds the bitrate of the initial segment, it will request the next higher bitrate segment, until the current segment bitrate and the available bandwidth are a good match.
The player will continue to request segments at that bitrate until the bandwidth changes. If everything is working correctly, the user will have a smooth viewing experience under a variety of network conditions.
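The selection loop described above can be sketched roughly as follows. This is a minimal illustration, not the logic of any particular player: the `Rendition` type, the `LADDER` values, the `next_rung` function, and the 0.8 safety margin are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    declared_bps: int  # bitrate declared in the manifest, in bits per second


# Hypothetical ladder, sorted from lowest to highest declared bitrate.
LADDER = [
    Rendition("240p", 400_000),
    Rendition("360p", 800_000),
    Rendition("720p", 2_500_000),
    Rendition("1080p", 5_000_000),
]


def next_rung(current: int, measured_bps: float, safety: float = 0.8) -> int:
    """Return the ladder index to request next.

    The measured download speed is scaled by an assumed safety margin.
    The player drops to the highest rung that still fits when bandwidth
    falls, and otherwise climbs at most one rung per decision.
    """
    budget = measured_bps * safety
    # Climb down while the current rung no longer fits the budget.
    while current > 0 and LADDER[current].declared_bps > budget:
        current -= 1
    # Climb up one rung if the next one fits.
    if current + 1 < len(LADDER) and LADDER[current + 1].declared_bps <= budget:
        current += 1
    return current


# Start on the lowest rung, then react to a series of throughput samples.
rung = 0
for sample_bps in [6e6, 6e6, 6e6, 1.5e6, 0.3e6]:
    rung = next_rung(rung, sample_bps)
    print(f"{sample_bps / 1e6:.1f} Mbps -> {LADDER[rung].name}")
```

Real players typically combine throughput estimates with buffer level and other signals, so this sketch only shows the climbing behavior the ladder metaphor refers to.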
In 2010, Apple proposed a Fixed Bitrate Encoding ladder, seen below, where a video file would be encoded into ten different quality variants, ranging from an audio-only stream at 64 kbps to a 1080p audio/video stream at 8564 kbps. Note that the Apple dimension numbers reported for cell phones are for portrait view, and dimension numbers reported for WiFi are for landscape view.
In 2015, Netflix added a refinement to ABR Ladder analysis with Per-Title Encoding.
In this approach, Netflix encoded each video in multiple resolutions and data rates to identify the rungs that deliver the best quality at each relevant data rate.
This meant a content provider could customize both the number of rungs in a ladder, and the related resolution based on the requirements for a specific video, as seen below. This strategy offered potential encoding and storage savings.
In 2018 Context-Aware Encoding (CAE) started gaining traction. It expanded encoding considerations to include devices.
With Context-Aware Encoding, a content provider could deliver content to a variety of device types, from smartphones to home theaters, by offering separate encoding parameters for each device type. The goal is to reduce the playback bandwidth while maintaining the same Quality of Experience. The reduction can be substantial.
The Context-Aware Ladder below offers the same quality as the Fixed Bitrate Ladder with half as many variants, each at a lower bitrate or a higher resolution, which could improve playback performance and cost efficiency.
If encoding is intended to save overall bandwidth, one primary factor is the location in which content will be consumed. The traditional approach to encoding requires creating typical settings for each video in a library.
In the context of videos that will be streamed to multiple devices, such as mobile devices on cellular data networks and wired set-top boxes, it is a good idea to have multiple encoding schemes, one for each type of expected device.
Context-Aware Encoding allows content providers to fine-tune parameters per-show, or even on a per-scene encoding basis.
The Issue
There are many issues to consider when you are designing an ABR Ladder.
One thing to remember is that there is no one answer, no one way to design the perfect ABR Ladder. One ladder design is not sufficient to accommodate all the different types of content, different lengths, Live and VOD, different devices and display sizes, and a variety of different network conditions.
This reality requires flexibility in choosing the right design for your situation. Different scenarios require different options. As a result, different experts offer a wide variety of answers when asked to design the perfect ladder.
Real-world considerations when making design decisions:
- Take into consideration the expected network types and conditions
- Be aware of network policies, and rate limits for different service plans
- Prepare for common device and screen requirements and limitations, such as the limitations of a smartphone display compared to a home theater screen
- Consider the ABR adaptation logic in client players
- Apply iterative design and testing: Design – Test – Measure – Refactor – Repeat
- Even when distributing relatively homogenous videos, if you’re using an encoding ladder that is not customized for your video content, then it’s almost certainly suboptimal.
- Network bandwidth, the declared bitrate in the playlist/manifest, and the encoding bitrate all have implications for ABR ladder design
- Remember that the client player will select a track based on the declared bitrate value in the playlist/manifest.
- The encoding bitrate is only a fraction of the total load. Consider the total overhead when determining the encoding bitrate for a track
- Significant difference can exist between encoding bitrate and the declared bitrate in the playlist/manifest
- Over-Declaration can be used to tune client aggressiveness in selecting higher bitrate tracks
- Avoid significant Over-Declaration, which can cause the player to select a lower quality track than the network can accommodate
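The relationship between encoding bitrate, overhead, and declared bitrate can be illustrated with a little arithmetic. The sketch below is only an assumption-laden example: the function name `declared_bandwidth`, the 5% container overhead, and the 10% over-declaration factor are placeholders, not measured or recommended values.

```python
def declared_bandwidth(
    video_bps: int,
    audio_bps: int,
    container_overhead: float = 0.05,  # assumed packaging overhead (5%)
    over_declaration: float = 1.10,    # assumed tuning factor (10% above actual)
) -> int:
    """Estimate the bitrate to declare in the playlist/manifest for one track.

    The encoded audio and video bitrates are only part of the load, so the
    total is first padded with packaging overhead and then scaled by an
    over-declaration factor that tunes how aggressively players switch up.
    """
    total_with_overhead = (video_bps + audio_bps) * (1 + container_overhead)
    return round(total_with_overhead * over_declaration)


# A 2 Mbps video encode with 128 kbps audio ends up declared near 2.46 Mbps.
print(declared_bandwidth(2_000_000, 128_000))
```

Raising `over_declaration` makes a throughput-based player more conservative about selecting the track. Raising it too far reproduces the problem noted above, where the player settles on a lower quality track than the network could carry.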
One big consideration is cost. When choosing a potential design with cost in mind, you should avoid inefficiency and waste in areas such as energy, content management, storage costs, service requirements, network costs, and development effort.
Another issue in choosing the right ladder design is human perception.
The curve on the graph below illustrates the balance between network bitrate and perceived quality. At a certain point the curve bends due to diminishing returns. Beyond that point, the average human eye cannot discern any improvement in quality.
Best Practice Recommendation
The Best Practice recommendation is to design the best ABR Ladder strategy for delivering the best quality video possible over the widest range of devices and network conditions. This is obviously an important decision from the perspective of mobility customers.
Experimentation with various bitrates, resolutions, and file sizes is recommended. If the bitrate chosen is too high, the viewer could experience a stall and buffering. If the bitrate chosen is too low, it could result in annoying encoding “artifacts”.
The goal is to strike a balance that gives the viewer an optimal experience. Traditionally, a single bitrate ladder was applied to all content to try to achieve an acceptable playback experience and bandwidth usage on average, without taking the content of specific videos into account.
Some Ladder Design Considerations
- The Strategy for Bottom Track
- The Strategy for Top Track
- The need for Good Quality Tracks with reasonable bitrates
- Consecutive Track Rung Design Considerations
- Encoding and Packaging Optimization for individual tracks
- Chunk duration
- Demuxed vs Muxed audio/video
- Fitting under network limits
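As one illustration of the consecutive rung consideration, the sketch below flags neighboring tracks whose bitrates are very close together or very far apart. The function name `spacing_report` and the 1.5x to 2.0x window are assumptions chosen for the example. They are not thresholds stated in this Best Practice and should be tuned through testing.

```python
def spacing_report(ladder_bps, min_ratio=1.5, max_ratio=2.0):
    """Compare each rung with the one above it.

    Returns a list of (lower_bps, upper_bps, ratio, verdict) tuples, where
    verdict is "too close", "too far", or "ok" relative to the assumed
    min_ratio/max_ratio window.
    """
    rungs = sorted(ladder_bps)
    report = []
    for lower, upper in zip(rungs, rungs[1:]):
        ratio = upper / lower
        if ratio < min_ratio:
            verdict = "too close"  # a switch here barely changes quality
        elif ratio > max_ratio:
            verdict = "too far"    # a switch here is a large, visible jump
        else:
            verdict = "ok"
        report.append((lower, upper, round(ratio, 2), verdict))
    return report


# Hypothetical ladder with one cramped pair and one large gap.
for row in spacing_report([400_000, 800_000, 900_000, 3_000_000]):
    print(row)
```

Rungs that sit too close together add encoding and storage cost without a visible benefit, while large gaps can leave the player without a suitable track for intermediate network conditions.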
Let’s dig in a little deeper on the top and bottom tracks. Starting with the bottom of the ladder, if the resolution for the bottom rung is set too low, the quality will suffer. If the resolution is set too high, it raises the possibility of the buffer running out, causing stalls.
Bottom Track Considerations and Best Practices
With the bottom track, you should focus on supporting playback continuity under poor network conditions. As always, you should avoid stalls, which are bad for Quality of Experience (QoE).
What is the encoding quality?
- The bottom track does not have to be set at a high quality: The player is not expected to stay on this track for a long time.
What is the video frame rate?
- Should it be full motion, or not? Example frame rates include 7.5 fps, 15 fps, and 30 fps.
What is the video resolution?
- For the same perceptual quality, a higher resolution requires a higher encoding bitrate.
What is the encoding bitrate for the bottom track?
- Low enough that the player can play it with high probability under bad network conditions.
Is it muxed, audio and video combined, or demuxed, with audio and video sent separately?
- Carefully consider what the encoding quality/bitrate should be for an audio-only track.
- Consider what the encoding quality/bitrate should be for the audio and video components of the track.
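One way to reason about "low enough to play with high probability" is to look at throughput measurements from poor network conditions and size the bottom track against a low percentile of them. The sketch below assumes you have such samples. The function name `bottom_track_budget`, the 10th percentile, and the 0.8 safety factor are illustrative choices, not recommendations from this document.

```python
import math


def bottom_track_budget(throughput_samples_bps, percentile=10, safety=0.8):
    """Suggest an upper bound for the bottom track's total bitrate.

    Uses the nearest-rank percentile of the observed throughput samples,
    scaled by an assumed safety margin, so that most sessions observed
    under bad conditions could still sustain the track.
    """
    if not throughput_samples_bps:
        raise ValueError("at least one throughput sample is required")
    ordered = sorted(throughput_samples_bps)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1] * safety


# Hypothetical throughput samples (bits per second) from a poor network.
samples = [300_000, 500_000, 250_000, 1_200_000, 2_000_000,
           800_000, 400_000, 3_000_000, 600_000, 350_000]
print(bottom_track_budget(samples))
```

The resulting budget has to cover audio, video, and packaging overhead together, which is why the muxed versus demuxed question above matters for the bottom track.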
Now let’s consider the top track of the ladder.
Top Track Considerations and Best Practices
What should the Quality, Bitrate, Resolution, Frame Rate be for the top track?
The main goal with the top track is to support high perceptual-quality playback.
- Supporting acceptable quality includes consideration of bitrate, high resolution, full motion (30 fps or 60 fps).
- Sustainable playback: The player should be able to stream the track in a sustained way under realistic good network conditions.
It is important to only use high bitrates/resolutions that lead to meaningful Quality of Experience improvements in the top track.
Diminishing Returns: Video QoE does not increase linearly with increasing resolution/bitrate, due to the limits of human perception.
Resource Wastage: Sending unnecessarily high quality/bitrate wastes precious network resources.
- In some networks, such as cellular, it is important to consider using the user’s data budget efficiently.
Selecting too high a bitrate can even harm user experience.
- A higher bitrate track is more challenging to stream than a lower bitrate track.
- A bitrate that is too high increases the potential for stalls.
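The diminishing-returns argument can be turned into a simple check on the top of the ladder: if a rung adds very little perceptual quality over the rung below it, it is a candidate for removal. The sketch below assumes each rung has already been given a perceptual quality score on a 0 to 100 scale by some measurement of your choice. The function name `trim_top_rungs`, the scores, and the `min_gain` threshold are all hypothetical.

```python
def trim_top_rungs(rungs, min_gain=2.0):
    """Drop top rungs that add less than min_gain quality points.

    rungs is a list of (bitrate_bps, quality_score) pairs sorted by
    bitrate. Rungs are removed from the top while the quality gain over
    the rung immediately below stays under the assumed threshold.
    """
    kept = list(rungs)
    while len(kept) > 1 and kept[-1][1] - kept[-2][1] < min_gain:
        kept.pop()
    return kept


# Hypothetical ladder: quality flattens out above 5 Mbps.
ladder = [
    (1_000_000, 70.0),
    (2_500_000, 85.0),
    (5_000_000, 93.0),
    (8_000_000, 94.5),
    (12_000_000, 95.0),
]
print(trim_top_rungs(ladder))
```

In this made-up example the two highest rungs would be dropped, since they cost several extra megabits per second for a gain that viewers are unlikely to notice.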
What should the frame rate be? For example, should you choose 30 fps or 60 fps?
- A higher frame rate requires a higher bitrate.
- The framerate depends on content. High motion sports can benefit from a higher frame rate. Talking head newscasts may not.
Streaming Separate Audio and Video
Learn more about Demuxed vs Muxed audio/video behavior in the Streaming Separate Audio and Video Best Practice.
Video Quality of Experience Thresholds
Fitting under network limits: Video Quality of Experience thresholds for cellular networks
Determine Video Quality of Experience thresholds for cellular networks
Analyzing the Adaptive Bitrate Ladder
The Video Optimizer Best Practice test for Analyzing the Adaptive Bitrate Ladder lets developers and content providers know the number of different quality tracks used, and the percentage of time that each played during the test.
Comparing these test results with your goals when you designed your ABR Ladder, you can see if your playlists and manifest files are performing as expected.
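To make the metric concrete, the percentage of time per quality track can be computed from a log of played segments, as in the sketch below. This is not Video Optimizer's implementation. The function name `time_share_by_track` and the input format, a list of (track name, seconds played) pairs, are assumptions for illustration.

```python
from collections import defaultdict


def time_share_by_track(played_segments):
    """Return the percentage of playback time spent on each quality track.

    played_segments is an iterable of (track_name, seconds_played) pairs,
    one entry per segment that was actually played.
    """
    totals = defaultdict(float)
    for track, seconds in played_segments:
        totals[track] += seconds
    total_seconds = sum(totals.values())
    if total_seconds == 0:
        return {}
    return {track: 100 * secs / total_seconds for track, secs in totals.items()}


# Hypothetical playback log with 6-second segments.
log = [("360p", 6), ("720p", 6), ("720p", 6), ("1080p", 6)]
print(time_share_by_track(log))
```

If a track that you designed as the main rung for your target network shows a very small share, or the bottom track shows a large one, that is a signal to revisit the ladder or the declared bitrates.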
Below is an example of a test result.
The image below is a screenshot example showing the percentage of time that each quality level played during a test stream.
Below is a screenshot of how the full graph appears in Video Optimizer after a trace has been opened. The colors match the percentages in the image above. The Y axis displays the percentage of time that a given quality level played.
By analyzing your ABR Ladder and designing it to stream files more efficiently over HTTP networks, with careful consideration of device capability and network condition, you can decrease rebuffering and improve the user experience.
You can learn more about video streaming and other development issues in our other Mobile Development Best Practices recommendations.