Meta AV1 Delivery Presentation: Six Key Takeaways

One of the most gracious things that large companies like Meta and Netflix do is share their knowledge with others in the community. On November 3, Meta hosted Video @Scale Fall 2022, which featured multiple speakers from Meta and other companies. If you’re unfamiliar with the event, here’s the description: “Designed for engineers that develop or manage large-scale video systems serving millions of people.”

Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

One talk drew my attention: Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta. Watch above or use this link: https://bit.ly/Lei_AV1

For perspective, where Netflix has focused AV1 distribution on Smart TVs, Meta’s focus is mobile. Briefly, the company started delivering “AV1-encoded FB/IG Reels videos to selected iPhone and Android devices” in 2022. Lei’s talk included encoding, decoding, and some observations about the bandwidth savings, improved MOS scores, and increased viewing time that AV1 delivered.

Here are my top 6 takeaways from Lei’s excellent presentation.

1. Meta Finds that AV1 is 30% More Efficient than HEVC/VP9

As you’ll learn later in this article, Meta relies upon software playback on iOS and Android platforms. Since both platforms support HEVC decoding, iOS in hardware (since 2017) and Android mostly in hardware but also in software, it’s reasonable to ask why Meta didn’t just use HEVC.

The answer is that in Meta’s own tests, AV1 was 30% more efficient than both VP9 and HEVC. That is 8 percentage points (about 21% in relative terms) below the 38% efficiency advantage that I found in this study by Streaming Media. Lei didn’t discuss HEVC in his presentation, but you’d have to guess that Meta chose AV1 over HEVC because the superior quality that AV1 delivered outweighed the potential impact of software playback on mobile device battery life.

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

2. Meta Encodes with SVT-AV1 For Video On Demand (VOD)

The chart shown below tracks the encoding time and quality levels of the open-source codecs shown on the upper right, which include libaom-av1 (AV1), libvpx (VP9), x265 (HEVC), x264 (AVC), vvenc (VVC), and SVT-AV1 (AV1).

Here’s how Lei interpreted this data. “From this graph, we see that SVT-AV1 maintains a consistent performance across a wide range of complexity levels. No matter for an encoding efficiency or compute efficiency point of view, SVT-AV1 always achieves the most optimal results among open-source encoders.” Again, these results track my own findings, at least as they relate to SVT-AV1 compared with libaom.

Interestingly, the chart only tracks software encoders, not hardware encoders, which present a completely different quality/encoding time curve. You’ll see why this is important at the end of this post.

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

3. Meta Creates Their Encoding Ladder Using the Convex Hull

There are many forms of per-title encoding. Some, like YouTube’s, are based on machine learning, while others, like Netflix’s, are based on multiple encodes to find the convex hull. Since Meta’s encoding task is much closer to YouTube’s than Netflix’s (high-volume UGC), you might assume that Meta uses AI as well.

However, Meta actually uses the convex hull, a brute force technique that involves encoding at multiple resolutions and multiple bitrates to find the combination that comprises the convex hull for that video. In the example shown below, Meta encoded at seven resolutions and five CRF levels, a total of 35 encodes. To compute the convex hull, Meta plots the 35 data points and then draws a line connecting the points on the upper left boundary. The points on the convex hull are the optimal encoding configuration for that video.
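As a rough illustration of the hull selection described above, here is a minimal Python sketch. It is not Meta’s implementation: the `Encode` record, the `upper_left_hull` helper, the sample numbers, and the choice of a generic “higher is better” quality score are all assumptions made for this example. It keeps only the encodes on the upper-left boundary of the bitrate/quality plot.

```python
from typing import List, NamedTuple


class Encode(NamedTuple):
    height: int          # output resolution in lines (hypothetical values)
    crf: int             # CRF level used for this encode
    bitrate_kbps: float  # measured bitrate of the encode
    quality: float       # any "higher is better" score (assumed metric)


def _cross(o: Encode, a: Encode, b: Encode) -> float:
    """Z-component of (a - o) x (b - o) in the bitrate/quality plane."""
    return ((a.bitrate_kbps - o.bitrate_kbps) * (b.quality - o.quality)
            - (a.quality - o.quality) * (b.bitrate_kbps - o.bitrate_kbps))


def upper_left_hull(encodes: List[Encode]) -> List[Encode]:
    """Return the encodes on the upper-left convex boundary, by rising bitrate."""
    # At equal bitrate, visit the higher-quality encode first.
    pts = sorted(encodes, key=lambda e: (e.bitrate_kbps, -e.quality))
    hull: List[Encode] = []
    for p in pts:
        # Drop the previous point while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    # Past the best-quality point, more bitrate buys nothing, so trim the tail.
    best = max(range(len(hull)), key=lambda i: hull[i].quality)
    return hull[:best + 1]


# Made-up measurements, for illustration only (not data from the talk).
SAMPLE = [
    Encode(360, 40, 200, 50), Encode(360, 30, 400, 62),
    Encode(720, 40, 500, 70), Encode(720, 30, 1000, 85),
    Encode(1080, 40, 1200, 84), Encode(1080, 30, 2500, 95),
]

for e in upper_left_hull(SAMPLE):
    print(f"{e.height}p crf={e.crf}: {e.bitrate_kbps:.0f} kbps, quality {e.quality}")
```

With the sample data, the 360p/CRF 30 and 1080p/CRF 40 encodes fall below the boundary and are discarded; the four remaining points would form the ladder for that video.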

As Lei points out, “the complexity of this process is quite high.” To reduce the complexity, Meta uses techniques like computing the convex hull with high-speed presets, and then encoding the selected resolution and CRF points using higher-quality presets for final delivery. Lei noted that although this hybrid approach produces more encodes, because the optimal configurations are encoded twice, overall encoding time is reduced.
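To make the two-stage idea concrete, the sketch below builds FFmpeg command lines for a fast sweep and for the final encodes. Treat it as an assumption-heavy illustration rather than Meta’s pipeline: the article gives only the counts (seven resolutions, five CRF levels), so the specific heights, CRF values, preset numbers, output file names, and the selected ladder points are all hypothetical. The `libsvtav1` encoder with its `-crf` and `-preset` options is available in FFmpeg builds that include SVT-AV1.

```python
from itertools import product
from typing import Iterable, List, Tuple

# Hypothetical ladder values; only the 7 x 5 shape comes from the article.
HEIGHTS = [1080, 720, 540, 480, 360, 240, 144]
CRFS = [26, 32, 38, 44, 50]
FAST_PRESET = 12   # assumed high-speed preset for the sweep
FINAL_PRESET = 4   # assumed higher-quality preset for delivery


def svtav1_cmd(src: str, height: int, crf: int, preset: int) -> List[str]:
    """Build one FFmpeg command for an SVT-AV1 CRF encode (video only)."""
    out = f"out_{height}p_crf{crf}_p{preset}.mp4"  # hypothetical naming scheme
    return ["ffmpeg", "-y", "-i", src,
            "-vf", f"scale=-2:{height}",
            "-c:v", "libsvtav1", "-crf", str(crf), "-preset", str(preset),
            "-an", out]


def sweep_commands(src: str) -> List[List[str]]:
    """Stage 1: every resolution/CRF pair at the fast preset."""
    return [svtav1_cmd(src, h, c, FAST_PRESET) for h, c in product(HEIGHTS, CRFS)]


def final_commands(src: str, selected: Iterable[Tuple[int, int]]) -> List[List[str]]:
    """Stage 2: only the (height, crf) points chosen from the hull, at the slow preset."""
    return [svtav1_cmd(src, h, c, FINAL_PRESET) for h, c in selected]


sweep = sweep_commands("input.mp4")
# Placeholder selection; in practice these come from the hull of the sweep results.
ladder = [(1080, 26), (720, 32), (540, 38), (360, 44), (240, 50)]
final = final_commands("input.mp4", ladder)
print(len(sweep), "fast encodes +", len(final), "final encodes")
```

The total rises from 35 encodes to 40, but only five of them run at the slow preset, which is where the time saving Lei described would come from.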

Just to state the obvious, this approach only works for video on demand, not live. Even with the fastest hardware encoders, you can’t produce 35 iterations to identify the optimal five. This indicates that Meta uses a different schema for live transcoding, which Lei doesn’t address.

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

4. Meta Uses the Convex Hull Computed for AVC for VP9 and AV1

Like most large publishers, Meta encodes using multiple codecs like H.264, VP9, and AV1 to deliver to different devices. One surprising revelation was that Meta uses the convex hull computed for H.264 to guide the convex hull implementations for the VP9 and AV1 encodes.

Lei didn’t explain how this works – as you can see in the figure below, the resolutions and bitrates for the three codecs are obviously different, and that’s what you would expect. So, there must be some kind of interpolation of the convex hull information from one codec to another. But you see that VP9 delivers a 48% bitrate savings over the top H.264 ladder rung, while AV1 delivers 65%.
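The two savings figures can also be compared with each other. The short calculation below, which assumes both percentages are measured against the same top H.264 rung at comparable quality, derives what AV1 saves relative to VP9. The `relative_savings` helper is a name introduced for this example.

```python
def relative_savings(saving_a: float, saving_b: float) -> float:
    """Bitrate saving of codec B over codec A, given each one's saving vs. a shared baseline."""
    return 1 - (1 - saving_b) / (1 - saving_a)


# VP9 saves 48% and AV1 saves 65% against the top H.264 rung (figures from the slide).
av1_vs_vp9 = relative_savings(0.48, 0.65)
print(f"AV1 vs. VP9 on the top rung: {av1_vs_vp9:.1%} lower bitrate")
```

Under that assumption, AV1 needs roughly a third less bitrate than VP9 on the top rung, which is in line with the 30% efficiency figure from the first takeaway.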

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

5. Apple and Android Phones Present Completely Different Challenges

Again, no surprise. There are many fewer Apple devices, and all are premium high-performance models. In contrast, there’s a much greater range of Android devices, from low-cost/low-performance options to models that rival Apple in cost and performance.

Lei shared that Facebook tests Android devices to determine eligibility for AV1 videos. As you can see in the slide below, Meta delivers very different quality to iOS and Android devices.

It was clear from Lei’s talk that delivering AV1 to Apple phones was relatively simple compared to sending AV1 video to Android phones. This is actually the reverse of what you might expect, as iOS doesn’t support AV1 natively while Android does. Though you can deliver video via an app to iOS devices, as Meta does, Safari doesn’t support it. And even though Android does support AV1 playback natively, you’ll have to implement some type of testing protocol, as Meta does, to ensure smooth playback until AV1 hardware support becomes pervasive, which probably won’t be until 2024 or beyond.
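The article doesn’t describe how Meta’s eligibility test works, so the following is purely a hypothetical sketch of what a device gate could look like: time the software decoder on a test clip, then allow AV1 only if nearly every frame decodes within its display budget. The `DecodeBenchmark` record, the `is_av1_eligible` function, and the 1% threshold are all assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DecodeBenchmark:
    """Hypothetical result of timing a software AV1 decode of a test clip."""
    frame_decode_ms: list[float]  # per-frame decode times in milliseconds
    fps: float                    # frame rate of the test clip


def is_av1_eligible(bench: DecodeBenchmark, max_late_ratio: float = 0.01) -> bool:
    """Eligible if at most max_late_ratio of frames miss their display deadline."""
    if not bench.frame_decode_ms:
        return False  # no measurements, so stay on the fallback codec
    budget_ms = 1000.0 / bench.fps
    late = sum(1 for t in bench.frame_decode_ms if t > budget_ms)
    return late / len(bench.frame_decode_ms) <= max_late_ratio


# Made-up measurements for two imaginary devices at 30 fps (33.3 ms per frame).
fast_device = DecodeBenchmark([12.0] * 300, fps=30.0)
slow_device = DecodeBenchmark([28.0] * 200 + [45.0] * 100, fps=30.0)
print(is_av1_eligible(fast_device), is_av1_eligible(slow_device))
```

A production gate would likely also weigh resolution, battery drain, and thermal behavior; this sketch covers only the frame-deadline check.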

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

6. AV1 has Delivered in Several Key Metrics

Integrating a new codec into your encoding and delivery pipeline isn’t trivial. So, the big question is, was AV1 worth it? The slide below displays three graphs. Sorry that the quality in the original slide is suboptimal, but here’s the net/net.

The graph on the top left shows the week-over-week playback MOS on all videos played on an iPhone. It shows about a 0.6 MOS point improvement. Since MOS (Mean Opinion Score) is usually computed on a scale from 1 to 5, 0.6 is a significant number. The second graph, on the upper right, shows the bitrate of all videos delivered, with about a 12% bitrate reduction.

The bottom chart presents the average iPhone watch time for the different codecs used in Facebook Reels and shows that AV1 watch time went up to about 70% within the first week after rollout. This doesn’t seem to mean that AV1 increased watch time; rather, it seems to show that a significant number of devices were able to play AV1, which is how AV1 delivered the MOS improvement and bitrate reductions shown in the top two charts.

SLIDE FROM Meta’s Ryan Lei speaking on Scaling AV1 End-To-End Delivery at Meta.

Lei’s talk was about 18 minutes long, and there’s a lot more useful data and observations than I’ve presented here. Again, here’s the link – https://bit.ly/Lei_AV1. If you’re considering deploying AV1 for VOD encoding in your organization, you’ll find the encoding-related portions of Lei’s talk illuminating.


What about live? Lei didn’t address it, but you can take some guidance from the fact that Meta recently announced their own Video Processing ASIC. After the announcement, David Ronca, Director, Video Encoding at Meta, commented that “ASICs are able to deliver video quality on par with SW encoders with significantly improved power efficiency. Because of the rapid commoditization of video processing, rising energy costs, and pollution concerns, Video Processing ASICS are inevitable.”

At NETINT, we’ve been shipping transcoders based upon custom encoding ASICs since 2019 and have real market validation of Ronca’s comments. While software encoding may be appropriate for VOD, ASIC-based transcoders are superior, if not essential, for live transcoding.

Returning to Lei’s talk, whether you’re distributing VOD or live AV1 streams, his descriptions of the challenges of AV1 delivery to mobile will be instructive.
