Skip to content

Parking Lot Rules, B-Frames and Ultra Low-Latency Encoding

Jan Ozer

Jan Ozer

is Senior Director of Video Marketing at NETINT.

Jan is also a contributing editor to Streaming Media Magazine , writing about codecs and encoding tools. He has written multiple authoritative books on video encoding, including ‘Video Encoding by the Numbers: Eliminate the Guesswork from your Streaming Video’ and ‘ Learn to Produce Video with FFmpeg: In Thirty Minutes or Less’ and has produced multiple training courses relating to streaming media production.

One of my sweetest memories of bringing up our two daughters was weekly trips to the grocery store. Each got a $5.00 bribe for accompanying their father, which they happily invested in various tchotchkes that seldom lasted the week. When we exited the car, “parking lot rules” always applied, which meant that each daughter held one of Daddy’s hands for the walk to the store. Two girls, two hands, no running around the busy parking lot.

Parking lot rules came to mind as we debugged a decoding latency issue when testing a new server product. Initial tests revealed a decoding latency of up to 200 milliseconds in some high-volume configurations. Given that the encoding latency was under 20 milliseconds, the decoding numbers were uncomfortably high.

Eliminate B-Frames from the Origination Stream

After raising the issue, our testing team implemented a fix, which dropped latency to under 20 milliseconds, and decreased encoding latency as well. The change is the parking-lot-rules corollary for live streamers, which is “for ultra-low latency, eliminate B-frames from your live streaming workflow.” With H.264, this means using the baseline profile, which eliminates B-frames. With H.265, you’ll have to use a GOP structure that does the same.

A quick glance at Figure 1 reveals why B-frames blow-up decoding latency (shoutout to OTTverse, where we grabbed the image). B-frames, of course, incorporate redundancies from frames before and after the frame being encoded. They are packed and decoded out of order. Any frame decoded out of order adds latency – the further they are out of order, the greater the latency.

Jan Ozer - B-frames low latency - diagram-1-OTTV Verse
Figure 1. B-frames are packed out of order and can increase decode latency.

Will eliminating B-frames (or the Baseline H.264 profile) reduce the quality of the incoming stream? Only minimally, if at all. These streams are typically produced at a relatively high bit rate, so B-frames or higher-quality profiles deliver minimal additional quality. It’s even less likely that any decrease in quality would be noticeable in the output stream (see here).

Let’s pause for a moment and reflect on the bigger picture. Figure 2 shows the typical live streaming workflow. We’ve been talking about B-frames in the on-premise encode impacting the decoding latency in the transcoding server. What about B-frames in the transcoding server when encoding streams for delivery to viewers?

Jan Ozer - B-frames low latency - diagram-2

Predictably, the result is the same. B-frames introduce the same latency during encoding for delivery, for the same reason–packing frames out of order introduces delays. This is why, when implementing low-latency mode with the NETINT Quadra Video Processing Unit and T408 transcoder, you must use a GOP preset that encodes with consecutive frames.

When you get things right – incoming streams without B-frames and outgoing streams without B-frames, the results are transformative. Let’s have a look.

Tue Low Latency Transcoding

Table 1 below shows actual testing results. This use case involves scaling 1080p AVC input down to 720p for delivery, which is common for interactive gaming, auction sites, and conferencing, and the server can produce 320 streams while encoding AVC, HEVC, and AV1. I don’t have the original data for the input file with B-frames, but as I recall, decoder latency averaged 150 – 200 ms, a noticeable break in a live conversation. Even worse, unlike encoder latency, it didn’t drop significantly in low-delay mode.

As you see in the table, after the fix, total latency is around 160 ms for all outputs in normal, (latency-tolerant) mode. Working with the input file without B-frames, and outputting streams without B-frames, combined encoder and decoder latency plummets to around 22 ms, well under a single frame (which for 30 fps video takes 33 ms to display). That’s low enough for even the most latency-sensitive applications.

Jan Ozer - B-frames low latency - Table-1
Table 1. Encode/decode latency in normal and low-delay mode
(with a properly formatted input file).

How much will the lack of B-frames impact quality in the output encoding ladder? Once again, B-frames have delivered surprisingly little value in the tests that I’ve performed. You can read a good article on the subject here, and access updated data here (see page 22), which show less than a 1% quality difference between streams with and without B-frames. The bottom line, of course, is that if your application needs ultra-low latency, you have to prioritize that over any potential quality loss, though it’s good to know that few, if any, viewers will notice it.  

Returning to the thoughts that prompted this article, when my daughters have their kids, an endearing wish is that they implement parking lot rules in all relevant shopping trips. Given their progress to date, this may not occur in my lifetime. If you’re a live-streaming engineer, you have no similar excuse to ignore the corollary. If latency is critical, make sure you eliminate B-frames from your live-streaming workflows.

PS. The server referenced is the Quadra Video 100 Server, which combines ten Quadra video processing units (VPUs) with a SuperMicro chassis driven by a 32-core CPU. Total cost should be around $20,000 in this configuration. Stay tuned for more details or message us.