WebRTC series: 5 Crucial ways connectivity and bandwidth affect WebRTC UX
In the previous post in this WebRTC series, we looked at some useful UX patterns for voice and video calls to ensure things go smoothly for your users. Now, let’s talk about connectivity, bandwidth, and call quality. These aspects of WebRTC calling, although sometimes out of your control, will be what users consider most in judging the reliability of your service.
1. The Cost of Interactions
A major component of user experience design is performance. Users will access your app from a variety of devices and browsers, each with different capabilities. They might be on a high-capacity Wi-Fi network or a low-bandwidth 3G network. They might have other apps running on their machine, or a dozen tabs open in the browser they’re also trying to use for a video call. All of these things affect the app’s performance, and you have zero control over them. What you can do is identify and understand the costs of the things you do control, and design around those.
Three major variables for call quality are audio, video, and the number of connections. Audio is the least costly, using relatively low bandwidth. Video uses more, though how much depends on resolution. The core WebRTC code will automatically determine the available bandwidth and scale down the video resolution accordingly. Each of these becomes more costly as more people join a call; even if a video router is helping in the middle, the processing cost of incoming video can be significant. The general rule is: the more participants involved, the more bandwidth required.
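To make the cost model concrete, here is a minimal back-of-the-envelope sketch. The bitrate numbers are illustrative assumptions, not values WebRTC mandates, and the function models a simple mesh call where every endpoint sends to every other endpoint:

```javascript
// Rough per-call bandwidth estimate (sketch; the bitrates below are
// illustrative assumptions, not WebRTC-mandated values).
const KBPS = { audio: 50, video: { low: 300, medium: 800, high: 2500 } };

// In a mesh call, each endpoint sends to and receives from every
// remote peer, so cost grows with the number of participants.
function estimateMeshKbps(participants, videoQuality = "medium") {
  const peers = Math.max(0, participants - 1);
  const perPeer = KBPS.audio + KBPS.video[videoQuality];
  return { up: peers * perPeer, down: peers * perPeer };
}
```

With these placeholder numbers, a four-person mesh call at medium quality needs roughly 2550 kbps in each direction per participant; routing media through an SFU (the “video router” mentioned above) reduces the upstream to a single copy, though the downstream and decoding cost still grow with participant count.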
2. Designing Various States of Connectivity
As a user joins a call, there are a few specific states they’ll experience that you should consider. The first is the state of connecting. What do you show them while the app is establishing a connection? Hopefully this happens quickly, of course, but that’s not a scenario to depend on. Inform the user of what’s happening with a message or loading indicator so they know to wait.
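One low-effort way to cover every state is to map `RTCPeerConnection.connectionState` values to user-facing messages. The wording below is an assumption to adapt to your product; only the state names come from the WebRTC API:

```javascript
// Map RTCPeerConnection.connectionState values to user-facing status
// messages (the message wording is an assumption; adapt it to your app).
const STATUS_MESSAGES = {
  new: "Preparing your call…",
  connecting: "Connecting…",
  connected: "",                       // no message needed once live
  disconnected: "Connection interrupted — trying to reconnect…",
  failed: "We couldn't connect. Check your network and try again.",
  closed: "Call ended.",
};

function statusMessage(state) {
  return STATUS_MESSAGES[state] ?? "Unknown connection state";
}

// In the browser you would wire this up roughly like:
//   pc.onconnectionstatechange = () => {
//     statusEl.textContent = statusMessage(pc.connectionState);
//   };
```

Handling every state up front, including `failed` and `disconnected`, means the user is never left staring at a frozen screen with no explanation.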
Later in this post, we’ll discuss the technical ways to make this step faster, but the loading state is also a good opportunity to think about the perception of loading speed. Are there things we can do and visual cues to lean on that make this connection process feel faster?
A lot of apps purposefully load content asynchronously to make the loading process feel quicker. Facebook is a good example. As your news feed content is fetched, placeholder images are loaded first. This allows the first content on screen to load as quickly as possible, ensuring the user won’t see a blank screen for long, if at all. How quickly that first content loads has an effect on how users perceive the speed of your application. Think about what content could be delivered while the connection is being made.
The connected state is one we’ve covered in previous posts and the state you probably designed first. The call continues as expected, and the UI supports the various call features.
What if the call doesn’t connect? Even if your backend team has done everything possible to make sure calls go through, the media request could fail, or the call may not connect because of a firewall or proxy server. What will the user see if this happens? How can you guide them to fix (or at least understand) the problem and encourage them to try again? In these cases, clear instructions and documentation are a must, and real-time or asynchronous user support is especially helpful. Also, consider building monitoring infrastructure into your application. This will provide insight into when and why calls fail and help you fix and prevent issues in the future.
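That monitoring doesn’t have to be elaborate to be useful. A minimal sketch, where the reason labels and the reporting hook are assumptions for your own backend:

```javascript
// Minimal in-app call-failure monitor (sketch). It records why calls
// fail so you can spot patterns; the reason labels are assumptions.
function createCallMonitor(reportFn = console.log) {
  const failures = [];
  return {
    recordFailure(reason, details = {}) {
      const event = { reason, details, at: Date.now() };
      failures.push(event);
      reportFn(event); // in production, send this to your analytics backend
    },
    countBy(reason) {
      return failures.filter((f) => f.reason === reason).length;
    },
  };
}
```

Even a handful of labeled reasons (`"media-denied"`, `"ice-failed"`, `"signaling-timeout"`) is enough to tell you whether users are mostly hitting permission prompts, firewalls, or your own infrastructure.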
For a deeper and more technical look at call failures, view the talk Failing Gracefully with WebRTC by Philipp Hancke.
3. Show the User Connectivity and Quality Status
In previous posts in this series, we emphasized the importance of visual feedback for the user. Show them what’s happening with consistency and clarity. This status information is especially important in the context of connectivity and call quality.
An emerging pattern for voice and video calls is to borrow iconography from cell phones. Side-by-side vertical bars became an established convention for showing quality of cell coverage. A comment like, “I have full bars,” has come to mean “I have reliable connectivity and service.” In addition to this recognizable visual status, some apps also include a breakdown of technical stats. Numbers like bitrate, packet loss, and resolution can sometimes help users assess call quality and issues on their end, although they need to be presented carefully (e.g., as a secondary UI element) so they don’t cause confusion.
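Translating raw stats into those familiar bars can be a simple threshold mapping. A sketch, where the thresholds are illustrative assumptions you would tune against real `getStats()` data from your own calls:

```javascript
// Translate a packet-loss percentage into "signal bars" (0–4).
// The thresholds are illustrative assumptions, not a standard.
function qualityBars(packetLossPct) {
  if (packetLossPct < 1) return 4;  // excellent
  if (packetLossPct < 3) return 3;  // good
  if (packetLossPct < 8) return 2;  // fair
  if (packetLossPct < 15) return 1; // poor
  return 0;                         // effectively unusable
}

// Packet loss itself can be derived from the packetsLost and
// packetsReceived counters in successive getStats() samples.
function packetLossPct(lost, received) {
  const total = lost + received;
  return total === 0 ? 0 : (100 * lost) / total;
}
```

Showing the bars prominently and keeping the raw numbers behind a tap or hover gives both audiences what they need.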
As mentioned earlier, sometimes a connection fails, and sometimes it happens in the middle of a call. How do you show other call participants what’s happening? Different apps handle this in different ways, but two common approaches are removing the disconnected user’s avatar/video from the interface and obscuring the avatar/video while providing a “connection interrupted” message.
4. Information Gives the User More Control
Giving the user this kind of status information empowers them to analyze their browser, device, and network usage, and to improve their call quality on their end. They can make adjustments to their environment, close out some bandwidth-intensive apps, and request that others on their network pause any downloading or streaming activities.
For video calls with a lot of packet loss affecting quality, a quick and effective solution is for users to hide their video streams. This will free up bandwidth and improve audio quality, especially on calls with many people. This is a helpful tip, but may not be well-known by users. Let users know this is an option within their control via messaging, documentation, or a helpful UI tooltip.
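Implementing that “hide my video” escape hatch is straightforward: disabling a video track keeps the connection negotiated but stops sending frames. A minimal sketch:

```javascript
// Disable (or re-enable) outgoing video while keeping audio flowing.
// A disabled track stays negotiated but stops sending frames, which
// frees up bandwidth for audio on a congested connection.
function setVideoEnabled(stream, enabled) {
  for (const track of stream.getVideoTracks()) {
    track.enabled = enabled;
  }
  return stream.getVideoTracks().every((t) => t.enabled === enabled);
}

// In the browser, `stream` would be the local MediaStream from
// getUserMedia, e.g.: setVideoEnabled(localStream, false);
```

Because the track is disabled rather than stopped, re-enabling video later doesn’t require renegotiating the call.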
5. Optimize the Connection Process
As noted earlier, the process of connecting to a call and its perceived speed are very important to the user’s experience with your app. Experienced WebRTC developers use several tricks to optimize this call session startup so it is much quicker and ideally feels instant.
The first of these is something called Trickle ICE. ICE, which stands for Interactive Connectivity Establishment, involves an endpoint gathering IP addresses, prioritizing them, and sending them to a peer. The peer then gathers its own addresses, prioritizes them, and sends them back. This takes time. Traditionally, media data couldn’t flow until the entire ICE gathering process had completed.
With Trickle ICE, though, endpoints can send and receive addresses as they are discovered, eliminating the need to wait before connectivity checks can begin. This typically reduces the time it takes to establish a session from several seconds to less than 500ms. (For more in-depth information about ICE and Trickle ICE, definitely read ICE always tastes better when it trickles! by Emil Ivov.)
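In browser code, trickling falls out of the `icecandidate` event: each candidate is forwarded to the peer the moment it is discovered. A sketch, where `signaler` is a hypothetical stand-in for whatever signaling channel your app uses:

```javascript
// Wire up Trickle ICE: forward each ICE candidate to the peer as soon
// as it is discovered, instead of waiting for gathering to finish.
// `signaler` is a hypothetical stand-in for your signaling channel.
function enableTrickleIce(pc, signaler) {
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      signaler.send({ type: "candidate", candidate: event.candidate });
    } else {
      // A null candidate signals that gathering is complete.
      signaler.send({ type: "end-of-candidates" });
    }
  };
}
```

The receiving side simply calls `pc.addIceCandidate(msg.candidate)` for each message, so connectivity checks can begin while later candidates are still being gathered.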
A second method builds upon Trickle ICE by sending only an IP address that is “most likely to succeed” in the initial offer that starts the session negotiation. Typically this address lives at a media relay (or “TURN server”) that is known to not be hidden behind a firewall, so that both parties to a session can always reach it. The endpoints can start the call through the relay, and only later switch to peer-to-peer mode or another IP address that they discover will work through continued “trickling” after the call starts.
Another method is to proactively start negotiating the call setup process behind the scenes when the user interacts with a UI element associated with making a call. Examples might include mousing over a person’s name in a call history log or hovering over a click-to-call button on a web page. By starting the negotiation process in the background, an app can shave precious milliseconds off the apparent time it takes for the call to go through.
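The one subtlety is making sure hovering twice doesn’t start negotiation twice. A minimal sketch, where `startNegotiation` is a hypothetical function your app provides:

```javascript
// Start negotiation speculatively the first time the user hovers a
// call control, so the call feels faster when they actually click.
// `startNegotiation` is a hypothetical function your app provides.
function makePrewarmer(startNegotiation) {
  let started = false;
  return function prewarm() {
    if (!started) {
      started = true; // guard: only ever kick off negotiation once
      startNegotiation();
    }
  };
}

// Browser wiring would look roughly like:
//   callButton.addEventListener("mouseenter", makePrewarmer(startCall));
```

If the user hovers but never clicks, you’ll want to tear the speculative connection down after a timeout so it doesn’t hold resources.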
These are a few common issues and patterns around connectivity and bandwidth with WebRTC-enabled calls. Designing for these scenarios within your application will improve your user experience. In the next and final post in this series, we’ll discuss accessibility, enhancements, and gathering user feedback.