AV1 Capped CRF Encoding with Quadra VPU

We’ve previously reported results for capped CRF encoding for H.264 and HEVC using NETINT Quadra video processing units (VPU). This post will detail AV1 performance, including both 1080p and 4K data.

For those with limited time, here’s what you need to know: Capped CRF delivers higher quality video during hard-to-encode regions than CBR, similar quality during all other scenes, and improved quality of experience at the same cost or lower than CBR. NETINT VPUs are the first hardware video encoders to adopt Capped CRF across the three most popular codecs in use today, AV1, HEVC, and H.264.

You can read a quick description of capped CRF here and get a deep dive with H.264 and HEVC performance results here.

CAPPED CRF OVERVIEW

Briefly, capped CRF is a smart bitrate control technique that combines the benefits of CRF encoding with a bitrate cap. Unlike variable bitrate encoding (VBR) and constant bitrate encoding (CBR), which target specific bitrates, capped CRF targets a specific quality level, which is controlled by the CRF value. You also set a bitrate cap, which is applied if the encoder can’t meet the quality level below the bitrate cap.

On easy-to-encode videos, the CRF value sets the quality level, which it can usually achieve below the bitrate cap. In these cases, capped CRF typically delivers bitrate savings over CBR-encoded footage while delivering similar quality. For harder-to-encode footage, the bitrate cap usually controls, and capped CRF delivers close to the same quality and bitrate as CBR.

The value proposition is clear: lower bitrates and good quality during easy scenes, and similar to CBR in bitrate and quality for harder scenes. I’m not addressing VBR because NETINT’s focus is live streaming, where CBR usage dominates. If you’re analyzing capped CRF for VOD, you would compare against 2-pass VBR as well as potentially CBR.

One last detail. CRF values have an inverse relationship to quality and bitrate; the higher the CRF value, the lower the quality and bitrate. In general, video engineers select a CRF value that delivers their target quality level. For premium content, you might target an average VMAF score of 95. For user-generated content or training videos, you might target 93 or even lower. As you’ll see, the lower the quality score, the greater the bandwidth savings.

1080p RESULTS

We show 1080p results in Table 1, which is divided between easy-to-encode and hard-to-encode content. We encoded the CBR clips to 4.5 Mbps and applied the same cap for capped CRF encoding.

Table 1. 1080p results using Quadra VPU and capped CRF encoding.

You see that in CBR mode, Quadra VPUs don’t hit the target bitrate as accurately as they do in capped CRF mode. This won’t degrade viewer quality of experience since the VMAF scores exceed 95; undershooting the target simply saves bandwidth with no visible quality penalty.

In this comparison, bitrate savings are minimized, particularly at CRF 19 and 21, because the capped CRF clips in the hard-to-encode content have a higher bitrate than their CBR counterparts (4,419 and 4,092 kbps versus 3,889 kbps). Not surprisingly, CRF 19 and 21 deliver little bandwidth savings and slightly higher quality than CBR.

At CRF 23, things get interesting, with an overall bandwidth savings of 16.1% with a negligible quality delta from CBR. With a VMAF score of around 95, CRF 23 might be the target for engineers delivering premium content. Engineers targeting slightly lower quality can choose CRF 27 and achieve a bitrate savings of 43%, and an efficient 2.4 Mbps bit rate for hard-to-encode footage. At CRF 27, Quadra VPUs encoded the hard-to-encode Football clip at 3,999 kbps with an impressive VMAF score of 93.39.

Note that, as with H.264 and HEVC, AV1 capped CRF does reduce throughput. Specifically, a single Quadra VPU installed in a 32-core workstation output 23 simultaneous streams using CBR encoding. This dropped to eighteen for capped CRF, a reduction of 22%.

4K RESULTS

Many engineers encoding with AV1 are delivering UHD content, so we ran similar tests with the Quadra using 4K30 8-bit content, with a CBR target and bitrate cap of 16 Mbps. We used four clips, ranging from a 4K version of the high-motion Football clip to much less dynamic content like Netflix’s Meridian clip and Blender Foundation’s Sintel.

Table 2. 4K results for the Quadra VPU and capped CRF encoding.

In CBR mode, the Quadra VPU hit the bitrate target much more accurately at 4K than 1080p, so even at CRF 19, the VPU delivered a 13% bitrate savings with a VMAF score of 96.23. Again, CRF 23 delivered a VMAF score of very close to 95, with 45% savings over CBR. Impressively, at CRF 23, Quadra delivered an overall VMAF score of 94.87 for these 4K clips at 7.78 Mbps, and that’s with the Football clip weighing in at 14.3 Mbps.
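To put these bitrates in delivery terms, here is a minimal sketch that converts an average bitrate into data delivered per viewer-hour. It uses only the 16 Mbps cap and the 7.78 Mbps CRF 23 average quoted above; the function name is illustrative, and the result is in decimal gigabytes.

```python
def gb_per_viewer_hour(mbps: float) -> float:
    """Convert an average video bitrate in Mbps to decimal gigabytes per hour."""
    bits_per_hour = mbps * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


# A stream pinned at the 16 Mbps cap versus the CRF 23 overall average above.
at_cap = gb_per_viewer_hour(16.0)
at_crf23 = gb_per_viewer_hour(7.78)

print(f"{at_cap:.2f} GB per viewer-hour at the cap")
print(f"{at_crf23:.2f} GB per viewer-hour at the CRF 23 average")
```

Multiply either figure by your own viewer-hours and delivery price to estimate cost; those inputs vary too much by service to assume here.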

Of course, these savings directly relate to the cap and CBR target. It’s certainly fair to argue that 16 Mbps is excessive for 4K AV1-encoded content, though Apple recommends 16.8 Mbps for 8-bit 4K content with HEVC.

The point is, when you encode with CBR, you’re limiting quality to control bandwidth costs. With capped CRF, you can set the cap higher than your CBR target, knowing that all content contains easy-to-encode regions that will balance out the impact of the higher cap and deliver similar or lower bandwidth costs. With these comparative settings, capped CRF delivers higher quality video during hard-to-encode regions than CBR, similar quality during all other scenes, and improved quality of experience at the same cost or lower than CBR.

DENSER / LEANER / GREENER : Symposium on Building Your Own Streaming Cloud

Save Bandwidth with Capped CRF

Video engineers are constantly seeking ways to deliver high-quality video more efficiently and cost-effectively. Among the innovative techniques gaining traction is capped Constant Rate Factor (CRF) encoding, a form of Content-Adaptive Encoding (CAE), which NETINT recently introduced across our Video Processing Unit (VPU) product lines for H.264 and HEVC. In this blog, we explore why capped CRF is essential for engineers seeking to streamline video delivery and save on bandwidth costs.

Capped CRF - The Efficient Encoding Solution

Capped CRF is a smart bitrate control technique that combines the benefits of CRF encoding with a bit rate cap. Unlike Variable Bitrate Encoding (VBR) and Constant Bitrate Encoding (CBR), which target specific bitrates, capped CRF targets a specific quality level controlled by the CRF value, with a bitrate cap applied if the encoder can’t meet the quality level below the bitrate cap.

A typical capped CRF command string might look like this:

-crf 21 -maxrate 6M

This tells the encoder to encode to CRF 21 quality but not to exceed 6 Mbps. Let’s see how this might work with the football video shown in the figure, which compares capped CRF at these parameters with a CBR file encoded to 6 Mbps.

NETINT - Bitrate Comparison - Capped CRF

With the x264 codec, CRF 21 typically delivers a VMAF score of around 95. With easy-to-encode sideline shots, the CRF value would control the encoding, delivering 95 VMAF quality at 2 Mbps, a substantial savings over CBR at 6 Mbps.

During actual plays, the 6 Mbps bitrate cap would control, delivering the same quality as CBR at 6 Mbps. So, capped CRF saves bandwidth with easy-to-encode scenes while delivering equivalent to CBR quality with hard-to-encode scenes.
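The behavior described above can be sketched as a toy model: in each scene, the output bitrate is whichever is lower, the bitrate the CRF target would need or the cap. This is only an approximation, since a real encoder's VBV buffer smooths the transitions. The 2 Mbps sideline figure and the 6 Mbps cap come from the example; the scene durations and the 9 Mbps demand during plays are invented for illustration.

```python
def capped_crf_bitrate(crf_demand_mbps: float, cap_mbps: float) -> float:
    """Toy model: the CRF value controls until its demand exceeds the cap."""
    return min(crf_demand_mbps, cap_mbps)


CAP_MBPS = 6.0

# (label, duration in seconds, Mbps that CRF 21 alone would need).
# Durations and the 9 Mbps figure are hypothetical.
scenes = [
    ("sideline", 40, 2.0),
    ("play", 20, 9.0),
    ("sideline", 30, 2.0),
    ("play", 30, 9.0),
]

total_seconds = sum(duration for _, duration, _ in scenes)
capped_megabits = sum(
    duration * capped_crf_bitrate(demand, CAP_MBPS) for _, duration, demand in scenes
)
cbr_megabits = total_seconds * CAP_MBPS

average_mbps = capped_megabits / total_seconds
savings = 1 - capped_megabits / cbr_megabits

print(f"Capped CRF average: {average_mbps:.2f} Mbps")
print(f"Savings versus 6 Mbps CBR: {savings:.1%}")
```

The more of the running time that falls below the cap, the larger the savings; a clip that is all high-motion plays would save almost nothing.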

Ease of Integration

As implemented within the NETINT product line, capped CRF requires no additional technology licensing or complex integration – you simply upgrade your products and change your encoding command string. This means that you can seamlessly implement the feature across NETINT’s VPUs without extensive adjustments or additional investments.

NETINT’s capped CRF is compatible with H.264 and HEVC, with AV1 support coming (Quadra only), so you can use the feature across different codec options to suit your specific project requirements. Regardless of the codec used, capped CRF delivers consistent video quality with the potential for bandwidth savings, making it a valuable tool for optimizing video delivery.

A Game Changer

By deploying capped CRF, engineers can efficiently deliver high-quality video streams, enhance viewer experiences, and reduce operational expenses. As the demand for video streaming continues to grow, Capped CRF emerges as a game-changer for engineers striving to stay at the forefront of video delivery optimization.

You can read more about how capped CRF works here. You can read more about Quadra VPUs here, and T408 transcoders here.

Now ON-DEMAND: Symposium on Building Your Live Streaming Cloud

Beyond Traditional Transcoding: NETINT’s Pioneering Technology for Today’s Streaming Needs

Welcome to our here’s-what’s-new-since-last-IBC-so-you-should-schedule-a-meeting-with-us blog post. I know you’ve got many of these to wade through, so I’ll be brief.

First, a brief introduction. We’re NETINT, the ASIC-based transcoding company. We sell standalone products like our T408 video transcoder and Quadra VPUs (video processing units), as well as servers with ten of either device installed. All offer exceptional throughput at an industry-low cost per stream and power consumption per stream. Our products are denser, leaner, and greener than any competitive technology.

They’re also more innovative. The first-generation T408 was the first ASIC-based hardware transcoder available for at least a decade, and the second-generation Quadra was the first hardware transcoder with AV1 and AI processing. Our Quadra shipped before Google and Meta shipped their first-generation ASIC-based transcoders, and they still don’t support AV1.

That’s us; here’s what’s new.

Capped CRF Encoding

We’ve added capped CRF encoding to our Quadra products for H.264, HEVC, and AV1, with capped CRF coming for the T408 and T432 (H.264/HEVC). By way of background, with the wide adoption of content-adaptive encoding techniques (CAE), constant rate factor (CRF) encoding with a bit rate cap gained popularity as a lightweight form of CAE to reduce the bitrate of easy-to-encode sequences, saving delivery bandwidth and delivering CBR-like quality on hard-to-encode sequences. Capped CRF encoding is a mode that we expect many of our customers to use.

Figure 1 shows capped CRF operation on a theoretical football clip. The relevant switches in the command string would look something like this:

-crf 21 -maxrate 6M

This directs FFmpeg to deliver at least the quality of CRF 21, which for H.264 typically equals around a 95 VMAF score. However, the maxrate switch ensures that the bitrate never exceeds 6 Mbps.

As shown in the figure, in operation, the Quadra VPU transcodes the easy-to-encode sideline shots at CRF 21 quality, producing a bitrate of around 2 Mbps. Then, during actual high-motion game footage, the 6 Mbps cap would control, and the VPU would deliver the same quality as CBR. In this fashion, capped CRF saves bandwidth with easy-to-encode scenes while delivering CBR-equivalent quality with hard-to-encode scenes.

Figure 1. Capped CRF in operation. Relatively low-motion sideline shots are encoded to CRF 21 quality (~95 VMAF), while the 6 Mbps bitrate cap controls during high-motion game footage.

By deploying capped CRF, engineers can efficiently deliver high-quality video streams, enhance viewer experiences, and reduce operational expenses. As the demand for video streaming continues to grow, Capped CRF emerges as a game-changer for engineers striving to stay at the forefront of video delivery optimization.

You can read more about capped CRF operation and performance in Get Free CAE on NETINT VPUs with Capped CRF.

Peer-to-Peer Direct Memory Access (DMA) for Cloud Gaming

Peer-to-peer DMA is a feature that makes the NETINT Quadra VPU ideal for cloud gaming. By way of background, in a cloud-gaming workflow, the GPU is primarily used to render frames from the game engine output. Once rendered, these frames are encoded with codecs like H.264 and HEVC.

Many GPUs can render frames and transcode to these codecs, so it might seem most efficient to perform both operations on the same GPU. However, encoding demands a significant chunk of the GPU’s resources, which in turn reduces overall system throughput. It’s not the rendering engine that’s stretched to its limits but the encoder.

What happens when you introduce a dedicated video transcoder into the system using normal techniques? The host CPU manages the frame transfer between the GPU and the transcoder, which can create a bottleneck and slow system performance.

Figure 2. Peer-to-peer DMA enables up to 200 720p60 game streams from a single 2RU server.

In contrast, peer-to-peer DMA allows the GPU to send frames directly to the transcoder, eliminating CPU involvement in data transfers (Figure 2). With peer-to-peer DMA enabled, the Quadra supports latencies as low as 8ms, even under heavy loads. It also unburdens the CPU from managing inter-device data transfers, freeing it to handle other essential tasks like game logic and physics calculations. This optimization enhances the overall system performance, ensuring a seamless gaming experience.

Some NETINT customers are using Quadra and peer-to-peer DMA to produce 200 720p60 game streams from a single 2RU server, and that number will increase to 400 before year-end. If you’re currently assembling an infrastructure for cloud gaming, come see us at IBC.

Logan Video Server

NETINT started selling standalone PCIe and U.2 transcoding devices, which our customers installed into servers. In late 2022, customers started requesting a prepackaged solution comprised of a server with ten transcoders installed. The Logan Video Server is our first response.

Logan refers to NETINT’s first-generation G4 ASIC, which transcodes to H.264 and HEVC. The Logan Video Server, which launched in the first quarter of 2023, includes a SuperMicro server with a 32-core AMD CPU running Ubuntu 20.04 LTS and ten NETINT T408 U.2 transcoder cards (which cost $300 each) for $8,900. There’s also a 64-core option available for $11,500 and an 8-core option for $7,000.

The value proposition is simple. You get a break on price because of volume commitments and don’t have to install the individual cards, which is generally simple but still can take an hour or two. And the performance with ten installed cards is stunning, given the price tag.

You can read about the performance of the 32-core server model in my review here, which also discusses the software architecture and operation. We’ll share one table, which shows one-to-one transcoding of 4K, 1080p, and 720p inputs with FFmpeg and GStreamer.

At the $8,900 cost, the server delivers a cost per stream as low as $445 for 4K, $111.25 for 1080p, and just over $50 for 720p at normal and low latency. Since each T408 only draws 7 watts and CPU utilization is so low, power consumption is also exceptionally low.
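As a quick check on that arithmetic, the sketch below divides the server price by the number of simultaneous streams. The stream counts are assumptions back-calculated from the per-stream costs quoted above, not figures read from the table, and the transcoder power figure covers only the ten T408s, not the CPU or the rest of the server.

```python
SERVER_PRICE_USD = 8900   # 32-core Logan Video Server price quoted above
T408_COUNT = 10
T408_WATTS = 7            # per-device draw quoted above

# ASSUMED stream counts, inferred from the quoted per-stream costs.
assumed_streams = {"4K": 20, "1080p": 80}


def cost_per_stream(price_usd: float, streams: int) -> float:
    """Server price divided by the number of simultaneous streams."""
    return price_usd / streams


costs = {
    resolution: cost_per_stream(SERVER_PRICE_USD, count)
    for resolution, count in assumed_streams.items()
}
transcoder_watts = T408_COUNT * T408_WATTS

for resolution, cost in costs.items():
    print(f"{resolution}: ${cost:.2f} per stream")
print(f"Transcoder power for {T408_COUNT} T408s: {transcoder_watts} W")
```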

Table 1. One-to-one transcoding performance for 4K, 1080p, and 720p.

With impressive density, low power consumption, and multiple integration options, the NETINT Video Transcoding Server is the new standard to beat for live streaming applications. With a lower-priced model available for pure encoding operations and a more powerful model for CPU-intensive operations, the NETINT Logan server family meets a broad range of requirements.

Quadra Video Server

Once the Logan Video Server became available, customers started asking about a similarly configured server for NETINT’s Quadra line of video processing units (VPUs), which adds AV1 output, onboard scaling and overlay, and two AI processing engines. So, we created the Quadra Video Server.

This model uses the same Supermicro chassis as the Logan Video Server and the same Ubuntu operating system but comes with ten Quadra T1U U.2 form factor VPUs, which retail for $1,500 each. Each T1U offers roughly four times the throughput of the T408, performs on-board scaling and overlay, and can output AV1 in addition to H.264 and HEVC.

The CPU options are the same as the Logan server, with the 8-core unit costing $19,000, the 32-core unit costing $21,000, and the 64-core model costing $24,000. That’s 4X the throughput at just over 2x the price.

You can read my review of the 32-core Quadra Video Server here. I’ll again share one table, this time reporting encoding ladder performance at 1080p for H.264 (120 ladders), HEVC (140), and AV1 (120), and 4K for HEVC (40) and AV1 (30).

In comparison, running FFmpeg using only the CPU, the 32-core system only produced nineteen H.264 1080p ladders, five HEVC 1080p ladders, and six AV1 1080p ladders. Given this low-volume throughput at 1080p, we didn’t bother trying to duplicate the 4K results with CPU-only transcoding.

Table 2. Encoding ladder performance of the Quadra Video Server.

Beyond sheer transcoding performance, the review also details AI-based operations and performance for tasks like region of interest transcoding, which can preserve facial quality in security and other relatively low-quality videos, and background removal for conferencing applications.

Where the Logan Video Server is your best low-cost option for high volume H.264 and HEVC transcoding, the Quadra Video Server quadruples these outputs, adds AV1 and onboard scaling and overlay, and makes AI processing available.

Come See Us at the Show

We now return to our normally scheduled IBC pitch. We’ll be in Stand 5.A86 and you can book a meeting by clicking here.

Figure 3. Book a meeting.


Get Free CAE on NETINT VPUs with Capped CRF

Capped CRF

NETINT recently added capped CRF to the rate control mechanism across our Video Processing Unit (VPU) product lines. With the wide adoption of content-adaptive encoding techniques (CAE), constant rate factor (CRF) encoding with a bit rate cap gained popularity as a lightweight form of CAE to reduce the bitrate of easy-to-encode sequences, saving delivery bandwidth with constant video quality. It’s a mode that we expect many of our customers to use, and this document will explain what it is, how it works, and how to get the most use from the feature.

In addition to working with H.264, HEVC, and AV1 on the Quadra VPU line, capped CRF works with H.264 and HEVC on the T408 and T432 video transcoders. This document details how to encode with capped CRF using the H.264 and HEVC codecs on Quadra VPUs, though most application scenarios apply to all codecs across the NETINT VPU lines.

What is Capped CRF and How Does it Work?

Capped CRF is a bitrate control technique that combines constant rate factor (CRF) encoding with a bit rate cap. Multiple codecs and software encoders support it, including x264 and x265 within FFmpeg. In contrast to CBR and VBR encoding, which encode to a specified target bitrate (and ignore output quality), CRF encodes to a specified quality level and ignores the bitrate.

CRF values range from 0-51, with lower numbers delivering higher quality at higher bitrates (less savings) and higher CRF values delivering lower quality levels at lower bitrates (more bitrate savings). Many encoding engineers will utilize values spanning 21 to 23. Which is right for you? As you will read below, your desired quality and bitrate savings balance determines the best value for your use case.

For example, with the x264 codec, if you transcode to CRF 23, the encoder typically outputs a file with a VMAF quality of 93-95. If that file is a 4K60 soccer match, the bitrate might be 30 Mbps. If it’s a 1080p talking head, it might be 1.2 Mbps. Because CRF delivers a known quality level, it’s ideal for creating archival copies of videos. However, since there’s no bitrate control, in most instances, CRF alone is unusable for streaming delivery.

When you combine CRF with a bit rate cap, you get the best of both worlds: a bit rate reduction with consistent quality for easy-to-encode clips, and quality and bitrate similar to CBR for more complex clips.

Here’s how capped CRF could be used with the Quadra VPU:

ffmpeg -i input -c:v h264_ni_quadra_enc -xcoder-params "crf=23:vbvBufferSize=1000:bitrate=6000000" output

The relevant elements are:

  • crf=23 – sets the quality target at around 95 VMAF

  • vbvBufferSize=1000 – sets the VBV buffer to one second (1000 ms)

  • bitrate=6000000 – caps the bitrate at 6 Mbps.

These settings would produce a file that targets close to 95 VMAF quality but, in all cases, limits the bitrate to around 6 Mbps.

For a simple-to-encode talking head clip, Quadra produced a file with an average bitrate of 1,274 kbps and a VMAF score of 95.14. Figure 1 shows this output in a program called Bitrate Viewer. Since the entire file is under the 6 Mbps cap, the CRF value controls the bitrate throughout.

Encoding this clip with Quadra using CBR at 6 Mbps produced a file with a bit rate of 5.4 Mbps and a VMAF score of 97.50. Multiple studies have found that viewers can’t perceive quality improvements beyond a VMAF score of 95, so the extra 2.36 VMAF points don’t improve the viewer’s quality of experience (QoE). In this case, capped CRF reduces your bandwidth cost by 76% without impacting QoE.

Figure 1. Capped CRF encoding a simple-to-encode video in Bitrate Viewer.

You see this in Figure 2, showing the capped CRF frame with a VMAF score of 94.73 on the left and the CBR frame with a VMAF score of 97.2 on the right. The video on the right has a bit rate over 4 Mbps larger than the video on the left, but the viewer wouldn’t notice the difference.

Figure 2. Frames from the talking head clip. Capped CRF at 1.23 Mbps on the left, CBR at 5.4 Mbps on the right. No viewer would notice the difference.

Figure 3 shows capped CRF operation with a hard-to-encode American football clip. The average bitrate is 5900 kbps, and the VMAF score is 94.5. You see that the bitrate for most of the file is pushing against the 6 Mbps cap, which means that the cap is the controlling element. In the two regions where there are slight dips, the CRF setting controls the quality.

Figure 3. Capped CRF encoding a hard-to-encode video in Bitrate Viewer.

In contrast, the CBR encode of the football clip produced a bit rate of 6,013 kbps and a VMAF score of  94.73. Netflix has stated that most viewers won’t notice a VMAF differential under 6 points, so a viewer would not perceive the .25 VMAF delta between the CBR and capped CRF file. In this case, capped CRF reduced delivery bandwidth by about 2% without impacting QoE.
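The savings percentages quoted for the two clips follow directly from the reported bitrates. Here is a minimal sketch of that calculation, using only the numbers stated above; the talking head CBR figure is the rounded 5.4 Mbps, so the result is approximate.

```python
def savings_pct(capped_kbps: float, cbr_kbps: float) -> float:
    """Bitrate savings of the capped CRF encode relative to CBR, in percent."""
    return (1 - capped_kbps / cbr_kbps) * 100


# (capped CRF kbps, CBR kbps) as reported in the text above.
clips = {
    "talking head": (1274, 5400),
    "football": (5900, 6013),
}

results = {name: savings_pct(capped, cbr) for name, (capped, cbr) in clips.items()}

for name, pct in results.items():
    print(f"{name}: {pct:.1f}% bitrate savings versus CBR")
```

The gap between the two results is the point of the technique: the easy clip gives back most of the CBR budget, while the hard clip uses nearly all of it.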

Of course, as shown in Figure 3, the two-minute segment tested was almost all high motion. The typical sports broadcast contains many lower-motion sequences, including some commercials, cuts to the broadcasters, and timeouts and penalty calls. In most cases, you would expect many more dips like those shown in Figure 3 and more substantial savings.

So, the benefits of capped CRF are as follows:

  • You can use a single ladder for all your content, automatically saving bitrate on easy-to-encode clips and delivering the equivalent QoE on hard-to-encode clips.
  • Even if you modify your ladder by type of content, you should save bandwidth on easy-to-encode regions within all broadcasts without impacting QoE.
  • You get the benefit of CAE without the added integration complexity or extra technology licensing cost. Capped CRF is free across all NETINT VPU and video transcoder products.

Producing Capped CRF

Using the NETINT Quadra VPU series, the following commands for H.264 capped CRF will optimize video quality and deliver a file or stream with a fully compliant VBV buffer. As noted previously, this command string with the appropriate modifications to codec value will work across the entire NETINT product line. For example, to output HEVC, change -c:v h264_ni_quadra_enc to -c:v h265_ni_quadra_enc.

Here’s the command string.

ffmpeg -y -i input.mp4 -c:v h264_ni_quadra_enc -xcoder-params "gopPresetIdx=5:RcEnable=0:crf=23:intraPeriod=120:lookAheadDepth=10:cuLevelRCEnable=1:vbvBufferSize=1000:bitrate=6000000:tolCtbRcInter=0:tolCtbRcIntra=0:zeroCopyMode=0" output.mp4

Here’s a brief explanation of the encoding-related switches.

  • -c:v h264_ni_quadra_enc -xcoder-params – Selects Quadra’s H.264 encoder and introduces the encoder parameters described below.

  • gopPresetIdx=5 – this chooses the Group of Pictures (GOP) pattern, or the mixture of B-frame and P-frames within each GOP. You should be able to adjust this without impacting capped CRF performance.

  • RcEnable=0 – this disables rate control. You must use this setting to enable capped CRF.

  • crf=23 – this chooses the CRF value. You must include a CRF value within your command string to enable capped CRF.

  • intraPeriod=120 – This sets the GOP size to 120 frames (four seconds at 30 fps), which we used for all tests. You can adjust this setting to your normal target without impacting CRF operation.

  • lookAheadDepth=10 – This sets the lookahead to 10 frames. You can adjust this setting to your normal target without impacting CRF operation.

  • cuLevelRCEnable=1 – this enables coding unit-level rate control. Do not adjust this setting without verifying output quality and VBV compliance.

  • vbvBufferSize=1000 – This sets the VBV buffer size. You must set this to trigger capped CRF operation.

  • bitrate=6000000 – This sets the bitrate. You must set this to trigger capped CRF operation. You can adjust this setting to your target without impacting CRF operation.

  • tolCtbRcInter=0 – This defines the tolerance of CU-level rate control for P-frames and B-frames. Do not adjust this setting without verifying output quality and VBV compliance.

  • tolCtbRcIntra=0 – This sets the tolerance of CU level rate control for I-frames. Do not adjust this setting without verifying output quality and VBV compliance.

  • zeroCopyMode=0 – this enables or disables the libxcoder zero copy feature. Do not adjust this setting without verifying output quality and VBV compliance.

You can access additional information about these controls in the Quadra Integration and Programming Guide.
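If you generate encoding commands programmatically, the parameter string above can be assembled from a dictionary so that the CRF value, the cap, and the GOP length are easy to vary. The sketch below is one way to do that, not a NETINT-provided tool: the function name and its arguments are hypothetical, while the encoder name and every parameter are taken from the command string above. It only builds and prints the command; running it requires an FFmpeg build with Quadra support.

```python
import shlex


def build_capped_crf_cmd(src, dst, crf=23, cap_bps=6_000_000, fps=30,
                         gop_seconds=4, encoder="h264_ni_quadra_enc"):
    """Assemble the capped CRF FFmpeg argument list described above."""
    params = {
        "gopPresetIdx": 5,
        "RcEnable": 0,                     # rate control off, required for capped CRF
        "crf": crf,                        # quality target
        "intraPeriod": fps * gop_seconds,  # GOP length in frames
        "lookAheadDepth": 10,
        "cuLevelRCEnable": 1,
        "vbvBufferSize": 1000,             # one-second VBV buffer, in ms
        "bitrate": cap_bps,                # acts as the bitrate cap
        "tolCtbRcInter": 0,
        "tolCtbRcIntra": 0,
        "zeroCopyMode": 0,
    }
    xcoder_params = ":".join(f"{key}={value}" for key, value in params.items())
    return ["ffmpeg", "-y", "-i", src, "-c:v", encoder,
            "-xcoder-params", xcoder_params, dst]


cmd = build_capped_crf_cmd("input.mp4", "output.mp4")
print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be passed straight to `subprocess.run` without shell quoting concerns.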

Choosing the CRF Value and Bitrate Cap – H.264

Deploying capped CRF involves two significant decisions, choosing the CRF value and setting the bitrate cap. Choosing the CRF value is the most critical decision, so let’s begin there.

Table 1 shows the bitrate and VMAF quality of ten files encoded with the H.264 codec using the CRF values shown with a 6 Mbps cap and using CBR encoding with a 6 Mbps target. The table presents the easy-to-encode files on top, showing clip-specific results and the average value for the category. The Delta from CBR row shows the bitrate and VMAF differential from the CBR result. Then the table does the same for hard-to-encode clips, showing clip-specific results and the average value for the category. The bottom two rows present the overall average bitrate and VMAF values and the overall savings and quality differential from CBR.

Table 1. CBR and capped CRF bitrates and VMAF scores for H.264 encoded clips.

As mentioned, with CRF, lower values produce higher quality. In the table, CRF 19 produces the highest quality (and lowest bitrate savings), and CRF 27 delivers the lowest quality (and highest bitrate savings). What’s the right CRF value? The one that delivers the target VMAF score for your typical clips for your target audience.

For the test clips shown, CRF 19 produces an average quality of well over 95; as mentioned above, VMAF scores beyond 95 aren’t perceivable by the average viewer, so the extra bandwidth needed to deliver these files is wasted. Premium services should choose CRF values between 21-23 to achieve the top rung quality of around 95 VMAF scores. These deliver more significant bandwidth savings than CRF 19 while preserving the desired quality level. In contrast, commodity services should experiment with higher values like 25-27 to deliver slightly lower VMAF scores while achieving more significant bandwidth savings.

What bitrate cap should you select? CRF sets quality, while the bitrate cap sets the budget. In most cases, you should consider using your existing cap. As we’ve seen, with easy-to-encode clips, capped CRF should deliver about the same quality of experience with the potential for bitrate savings. For hard-to-encode clips, capped CRF should deliver the same QoE with the potential for some bitrate savings on easy-to-encode sections of your broadcast.

Note that the optimal CRF value will vary according to the complexity of your video files, as well as frame rate, resolution, and bitrate cap. If you plan to implement capped CRF with Quadra or any encoder, you should run similar tests on your standard test clips using your encoding parameters and draw your own conclusions.
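Once you have run those tests, the selection rule described in this section is simple to automate: choose the highest CRF value whose average VMAF still meets your target, since that value saves the most bandwidth. The sketch below illustrates the rule. The VMAF numbers in it are placeholders invented for the example, not values from Table 1, and the function name is hypothetical.

```python
def pick_crf(avg_vmaf_by_crf, target_vmaf):
    """Return the highest CRF whose average VMAF meets the target, or None."""
    passing = [crf for crf, vmaf in avg_vmaf_by_crf.items() if vmaf >= target_vmaf]
    return max(passing) if passing else None


# PLACEHOLDER results: substitute the averages from your own test clips.
measured = {19: 96.8, 21: 96.0, 23: 95.1, 25: 94.0, 27: 92.9}

premium_crf = pick_crf(measured, target_vmaf=95.0)
commodity_crf = pick_crf(measured, target_vmaf=93.0)

print(f"Premium target (95 VMAF): CRF {premium_crf}")
print(f"Commodity target (93 VMAF): CRF {commodity_crf}")
```

A result of None means no tested CRF value reached the target, which usually indicates that the bitrate cap, not the CRF value, is limiting quality.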

Now let’s examine capped CRF and HEVC.

Choosing the CRF Value and Bitrate Cap – HEVC

Table 2 shows the results of HEVC encodes using CBR at 4.5 Mbps and the specified CRF values with a cap of 4.5 Mbps. With these test clips and encoding parameters, Quadra’s CRF values produce nearly the same results as with H.264, with CRF values of 21-23 appropriate for premium services and 25-27 good settings for user-generated content (UGC).

Table 2. CBR and capped CRF bitrates and VMAF scores for HEVC encoded clips.

Again, the cap is yours to set; we arbitrarily reduced the H.264 bitrate cap of 6 Mbps by 25% to determine the 4.5 Mbps cap for HEVC.

Capped CRF Performance

Note that as currently tested, capped CRF comes with a modest performance hit, as shown in Table 3. Specifically, in CBR mode, Quadra output twenty 1080p30 H.264-encoded streams. This dropped to sixteen using capped CRF, a reduction of 20%.

For HEVC, throughput dropped from twenty-three to eighteen 1080p30 streams, a reduction of about 22%. We performed all tests using CRF 21, with a 6 Mbps cap for H.264 and 4.5 Mbps for HEVC. Note that these are early days in the CRF implementation, and it may be that this performance delta is reduced or even eliminated over time.
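The throughput reductions quoted above are the drop in stream count divided by the CBR stream count. A minimal sketch of that arithmetic, using only the stream counts reported in this section:

```python
def throughput_drop_pct(cbr_streams: int, capped_streams: int) -> float:
    """Percentage of CBR throughput lost when switching to capped CRF."""
    return (cbr_streams - capped_streams) / cbr_streams * 100


h264_drop = throughput_drop_pct(20, 16)
hevc_drop = throughput_drop_pct(23, 18)

print(f"H.264: {h264_drop:.1f}% fewer 1080p30 streams")
print(f"HEVC: {hevc_drop:.1f}% fewer 1080p30 streams")
```

When sizing a deployment, the same ratio tells you how many additional VPUs you need to hold total stream count constant after enabling capped CRF.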

Table 3. 1080p30 outputs produced using the techniques shown.

We installed the Quadra in a workstation powered by a 3.6 GHz AMD Ryzen 5 5600X 6-Core Processor running Ubuntu 18.04.6 LTS with 16 GB of RAM. As you can see in the table, we also tested output for the x264 codec in FFmpeg using the medium and veryfast presets, producing two and five 1080p30 outputs, respectively. For x265, we tested using the medium and ultrafast presets and the workstation produced one and three 1080p30 streams.

Even at the reduced throughput, Quadra’s CRF output dwarfs the CPU-only output. When you consider that the NETINT Quadra Video Server packs ten Quadra VPUs into a single 1RU form factor, you get a sense of how VPUs offer unparalleled density and the industry’s lowest cost per stream and power consumption per stream.

Bandwidth is one of the most significant costs for all live-streaming productions. In many applications, capped CRF with the NETINT Quadra delivers a real opportunity to reduce bandwidth cost with no perceived impact on viewer quality of experience.