NETINT recently added capped CRF to the rate control mechanism across our Video Processing Unit (VPU) product lines. With the wide adoption of content-adaptive encoding (CAE) techniques, constant rate factor (CRF) encoding with a bit rate cap has gained popularity as a lightweight form of CAE that reduces the bitrate of easy-to-encode sequences, saving delivery bandwidth while holding video quality constant. It’s a mode that we expect many of our customers to use, and this document explains what it is, how it works, and how to get the most from the feature.
In addition to working with H.264, HEVC, and AV1 on the Quadra VPU line, capped CRF works with H.264 and HEVC on the T408 and T432 video transcoders. This document details how to encode with capped CRF using the H.264 and HEVC codecs on Quadra VPUs, though most application scenarios apply to all codecs across the NETINT VPU lines.
What is Capped CRF and How Does it Work?
Capped CRF is a bitrate control technique that combines constant rate factor (CRF) encoding with a bit rate cap. Multiple codecs and software encoders support it, including x264 and x265 within FFmpeg. In contrast to CBR and VBR encoding, which encode to a specified target bitrate (and ignore output quality), CRF encodes to a specified quality level and ignores the bitrate.
CRF values range from 0 to 51, with lower values delivering higher quality at higher bitrates (less savings) and higher values delivering lower quality at lower bitrates (more savings). Many encoding engineers use values between 21 and 23. Which is right for you? As you will read below, the best value for your use case depends on the balance you want between quality and bitrate savings.
For example, with the x264 codec, if you transcode to CRF 23, the encoder typically outputs a file with a VMAF quality of 93-95. If that file is a 4K60 soccer match, the bitrate might be 30 Mbps. If it’s a 1080p talking head, it might be 1.2 Mbps. Because CRF delivers a known quality level, it’s ideal for creating archival copies of videos. However, since there’s no bitrate control, in most instances, CRF alone is unusable for streaming delivery.
When you combine CRF with a bit rate cap, you get the best of both worlds: a bit rate reduction with consistent quality for easy-to-encode clips, and CBR-like quality and bitrate for more complex clips. In simplified form, a capped CRF command looks like this:
ffmpeg -i input crf=23:vbvBufferSize=1000:bitrate=6000000 output
The relevant elements are:
- crf=23 – sets the quality target at around 95 VMAF
- vbvBufferSize=1000 – sets the VBV buffer to one second (1000 ms)
- bitrate=6000000 – caps the bitrate at 6 Mbps.
This command would produce a file that targets close to 95 VMAF quality while limiting the bitrate to around 6 Mbps in all cases.
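One way to picture how these settings interact is a toy model in which each segment of the video has a bitrate that CRF alone would request, and the cap simply limits it. The Python sketch below is a conceptual illustration only: the function names and the per-segment demand figures are hypothetical, and it does not describe how Quadra implements rate control internally. It also shows the arithmetic behind a one-second VBV buffer at the 6 Mbps cap.

```python
def capped_crf_bitrate(crf_demand_kbps, cap_kbps):
    """Toy model: return (output bitrate, controlling element) for one segment."""
    if crf_demand_kbps <= cap_kbps:
        return crf_demand_kbps, "crf"   # quality target is met below the cap
    return cap_kbps, "cap"              # the cap limits the bitrate


def vbv_buffer_bits(bitrate_bps, buffer_ms):
    """Bits that a buffer of buffer_ms milliseconds holds at bitrate_bps."""
    return bitrate_bps * buffer_ms // 1000


CAP_KBPS = 6000  # corresponds to bitrate=6000000 in the example command

# Hypothetical bitrates (kbps) that CRF 23 alone might request per segment.
for demand in [1200, 4800, 9500]:
    print(demand, capped_crf_bitrate(demand, CAP_KBPS))

# vbvBufferSize=1000 (ms) at 6 Mbps
print(vbv_buffer_bits(6_000_000, 1000))
```

In this simplified view, easy segments are governed by the CRF value and hard segments by the cap, which matches the behavior described for the two test clips.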
For a simple-to-encode talking head clip, Quadra produced a file with an average bitrate of 1,274 kbps and a VMAF score of 95.14. Figure 1 shows this output in a program called Bitrate Viewer. Since the entire file is under the 6 Mbps cap, the CRF value controls the bitrate throughout.
Encoding this clip with Quadra using CBR at 6 Mbps produced a file with a bit rate of 5.4 Mbps and a VMAF score of 97.50. Multiple studies have found that viewers cannot perceive quality improvements above a VMAF score of 95, so the extra 2.36 VMAF points don’t improve the viewer’s quality of experience (QoE). In this case, capped CRF reduces your bandwidth cost by 76% without impacting QoE.
Figure 1. Capped CRF encoding a simple-to-encode video in Bitrate Viewer.
You see this in Figure 2, showing the capped CRF frame with a VMAF score of 94.73 on the left and the CBR frame with a VMAF score of 97.2 on the right. The video on the right has a bit rate over 4 Mbps larger than the video on the left, but the viewer wouldn’t notice the difference.
Figure 2. Frames from the talking head clip. Capped CRF at 1.23 Mbps on the left, CBR at 5.4 Mbps on the right. No viewer would notice the difference.
Figure 3 shows capped CRF operation with a hard-to-encode American football clip. The average bitrate is 5900 kbps, and the VMAF score is 94.5. You see that the bitrate for most of the file is pushing against the 6 Mbps cap, which means that the cap is the controlling element. In the two regions where there are slight dips, the CRF setting controls the quality.
Figure 3. Capped CRF encoding a hard-to-encode video in Bitrate Viewer.
In contrast, the CBR encode of the football clip produced a bit rate of 6,013 kbps and a VMAF score of 94.73. Netflix has stated that most viewers won’t notice a VMAF differential under 6 points, so a viewer would not perceive the roughly 0.25-point VMAF delta between the CBR and capped CRF files. In this case, capped CRF reduced delivery bandwidth by about 2% without impacting QoE.
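The savings figures quoted for the two clips follow directly from the reported average bitrates. As a quick check, the Python sketch below recomputes them; the function name is ours, and the inputs are the bitrates stated in the text.

```python
def bitrate_savings_pct(capped_crf_kbps, cbr_kbps):
    """Percentage of bandwidth saved by capped CRF relative to CBR."""
    return (1 - capped_crf_kbps / cbr_kbps) * 100


# Talking head clip: 1,274 kbps capped CRF vs 5.4 Mbps CBR
talking_head = bitrate_savings_pct(1274, 5400)

# Football clip: 5,900 kbps capped CRF vs 6,013 kbps CBR
football = bitrate_savings_pct(5900, 6013)

print(f"Talking head savings: {talking_head:.0f}%")  # 76%
print(f"Football savings: {football:.0f}%")          # 2%
```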
Of course, as shown in Figure 3, the two-minute segment tested was almost all high motion. The typical sports broadcast contains many lower-motion sequences, including commercials, cuts to the broadcasters, and timeouts and penalty calls. In most cases, you would expect many more dips like those shown in Figure 3 and more substantial savings.
So, the benefits of capped CRF are as follows:
- You can use a single ladder for all your content, automatically saving bitrate on easy-to-encode clips and delivering the equivalent QoE on hard-to-encode clips.
- Even if you modify your ladder by type of content, you should save bandwidth on easy-to-encode regions within all broadcasts without impacting QoE.
- You get the benefit of CAE without the added integration complexity or extra technology licensing cost. Capped CRF is free across all NETINT VPU and video transcoder products.
Producing Capped CRF
Using the NETINT Quadra VPU series, the following command for H.264 capped CRF will optimize video quality and deliver a file or stream with a fully compliant VBV buffer. As noted previously, this command string, with the appropriate modification to the codec value, will work across the entire NETINT product line. For example, to output HEVC, change -c:v h264_ni_quadra_enc to -c:v h265_ni_quadra_enc.
Here’s the command string.
ffmpeg -y -i input.mp4 -c:v h264_ni_quadra_enc -xcoder-params "gopPresetIdx=5:RcEnable=0:crf=23:intraPeriod=120:lookAheadDepth=10:cuLevelRCEnable=1:v
Here’s a brief explanation of the encoding-related switches.
- -c:v h264_ni_quadra_enc -xcoder-params – Selects Quadra’s H.264 encoder and introduces the encoder parameters described below.
- gopPresetIdx=5 – this chooses the Group of Pictures (GOP) pattern, or the mixture of B-frame and P-frames within each GOP. You should be able to adjust this without impacting capped CRF performance.
- RcEnable=0 – this disables rate control. You must use this setting to enable capped CRF.
- crf=23 – this chooses the CRF value. You must include a CRF value within your command string to enable capped CRF.
- intraPeriod=120 – This sets the GOP size to 120 frames (four seconds at 30 fps), which we used for all tests. You can adjust this setting to your normal target without impacting CRF operation.
- lookAheadDepth=10 – This sets the lookahead to 10 frames. You can adjust this setting to your normal target without impacting CRF operation.
- cuLevelRCEnable=1 – this enables coding unit-level rate control. Do not adjust this setting without verifying output quality and VBV compliance.
- vbvBufferSize=1000 – This sets the VBV buffer size. You must set this to trigger capped CRF operation.
- bitrate=6000000 – This sets the bitrate. You must set this to trigger capped CRF operation. You can adjust this setting to your target without impacting CRF operation.
- tolCtbRcInter=0 – This defines the tolerance of CU-level rate control for P-frames and B-frames. Do not adjust this setting without verifying output quality and VBV compliance.
- tolCtbRcIntra=0 – This sets the tolerance of CU level rate control for I-frames. Do not adjust this setting without verifying output quality and VBV compliance.
- zeroCopyMode=0 – this enables or disables the libxcoder zero copy feature. Do not adjust this setting without verifying output quality and VBV compliance.
You can access additional information about these controls in the Quadra Integration and Programming Guide.
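Because the parameter string is long, it can help to assemble it programmatically. The Python sketch below joins the switches listed above into the colon-separated form passed to -xcoder-params and builds an FFmpeg argument list. Two things here are assumptions rather than details from the original command: the parameters are ordered as they appear in the list above, and output.mp4 is a hypothetical output name. The helper names are also ours. The last lines show the arithmetic behind the four-second GOP, assuming a 30 fps source.

```python
# Switches and values as listed above; ordering is an assumption.
XCODER_PARAMS = {
    "gopPresetIdx": 5,
    "RcEnable": 0,
    "crf": 23,
    "intraPeriod": 120,
    "lookAheadDepth": 10,
    "cuLevelRCEnable": 1,
    "vbvBufferSize": 1000,
    "bitrate": 6000000,
    "tolCtbRcInter": 0,
    "tolCtbRcIntra": 0,
    "zeroCopyMode": 0,
}


def build_xcoder_params(params):
    """Join key=value pairs with colons, the form used by -xcoder-params."""
    return ":".join(f"{key}={value}" for key, value in params.items())


def build_command(input_path, output_path, codec="h264_ni_quadra_enc"):
    """Return the FFmpeg argument list (output_path is a hypothetical name)."""
    return [
        "ffmpeg", "-y", "-i", input_path,
        "-c:v", codec,
        "-xcoder-params", build_xcoder_params(XCODER_PARAMS),
        output_path,
    ]


cmd = build_command("input.mp4", "output.mp4")
print(" ".join(cmd))

# GOP duration in seconds, assuming a 30 fps source
gop_seconds = XCODER_PARAMS["intraPeriod"] / 30
print(gop_seconds)
```

Passing codec="h265_ni_quadra_enc" to build_command switches the output to HEVC, as described earlier.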
Choosing the CRF Value and Bitrate Cap – H.264
Deploying capped CRF involves two significant decisions: choosing the CRF value and setting the bitrate cap. Choosing the CRF value is the most critical decision, so let’s begin there.
Table 1 shows the bitrate and VMAF quality of ten files encoded with the H.264 codec using the CRF values shown with a 6 Mbps cap and using CBR encoding with a 6 Mbps cap. The table presents the easy-to-encode files on top, showing clip-specific results and the average value for the category. The Delta from CBR shows the bitrate and VMAF differential from the CBR score. Then the table does the same for hard-to-encode clips, showing clip-specific results and the average value for the category. The bottom two rows present the overall average bitrate and VMAF values and the overall savings and quality differential from CBR.
Table 1. CBR and capped CRF bitrates and VMAF scores for H.264 encoded clips.
As mentioned, with CRF, lower values produce higher quality. In the table, CRF 19 produces the highest quality (and lowest bitrate savings), and CRF 27 delivers the lowest quality (and highest bitrate savings). What’s the right CRF value? The one that delivers the target VMAF score for your typical clips for your target audience.
For the test clips shown, CRF 19 produces an average quality of well over 95; as mentioned above, VMAF scores beyond 95 aren’t perceivable by the average viewer, so the extra bandwidth needed to deliver these files is wasted. Premium services should choose CRF values between 21 and 23 to achieve top-rung quality of around 95 VMAF. These deliver more significant bandwidth savings than CRF 19 while preserving the desired quality level. In contrast, commodity services should experiment with higher values like 25 to 27 to deliver slightly lower VMAF scores while achieving more significant bandwidth savings.
What bitrate cap should you select? CRF sets quality, while the bitrate cap sets the budget. In most cases, you should consider using your existing cap. As we’ve seen, with easy-to-encode clips, capped CRF should deliver about the same quality of experience with the potential for bitrate savings. For hard-to-encode clips, capped CRF should deliver the same QoE with the potential for some bitrate savings on easy-to-encode sections of your broadcast.
Note that the optimal CRF value will vary according to the complexity of your video files, as well as frame rate, resolution, and bitrate cap. If you plan to implement capped CRF with Quadra or any encoder, you should run similar tests on your standard test clips using your encoding parameters and draw your own conclusions.
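Once you have run those tests, the selection rule described above (use the value that meets your target VMAF with the greatest savings) can be expressed in a few lines. The Python sketch below is a minimal illustration: the function name is ours, and the VMAF figures are placeholders invented for the example, not values from Table 1 or from any NETINT test.

```python
def pick_crf(avg_vmaf_by_crf, target_vmaf):
    """Return the highest CRF whose average VMAF meets the target, else None.

    Higher CRF values save more bitrate, so the highest qualifying value
    gives the largest savings at the desired quality.
    """
    qualifying = [crf for crf, vmaf in avg_vmaf_by_crf.items()
                  if vmaf >= target_vmaf]
    return max(qualifying) if qualifying else None


# PLACEHOLDER averages for illustration only; substitute your own results.
example_results = {19: 97.0, 21: 96.0, 23: 95.0, 25: 93.5, 27: 92.0}

print(pick_crf(example_results, 95.0))  # 23
print(pick_crf(example_results, 93.0))  # 25
```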
Now let’s examine capped CRF and HEVC.
Choosing the CRF Value and Bitrate Cap – HEVC
Table 2 shows the results of HEVC encodes using CBR at 4.5 Mbps and the specified CRF values with a cap of 4.5 Mbps. With these test clips and encoding parameters, Quadra’s CRF values produce nearly the same results as they did for H.264, with CRF values of 21 to 23 appropriate for premium services and 25 to 27 good settings for UGC.
Table 2. CBR and capped CRF bitrates and VMAF scores for HEVC encoded clips.
Again, the cap is yours to set; we arbitrarily reduced the H.264 bitrate cap of 6 Mbps by 25% to determine the 4.5 Mbps cap for HEVC.
Capped CRF Performance
Note that as currently tested, capped CRF comes with a modest performance hit, as shown in Table 3. Specifically, in CBR mode, Quadra output twenty 1080p30 H.264-encoded streams. This dropped to sixteen using capped CRF, a reduction of 20%.
For HEVC, throughput dropped from twenty-three to eighteen 1080p30 streams, a reduction of about 22%. We performed all tests using CRF 21, with a 6 Mbps cap for H.264 and 4.5 Mbps for HEVC. Note that these are early days in the CRF implementation, and it may be that this performance delta is reduced or even eliminated over time.
Table 3. 1080p30 outputs produced using the techniques shown.
We installed the Quadra in a workstation powered by a 3.6 GHz AMD Ryzen 5 5600X 6-Core Processor running Ubuntu 18.04.6 LTS with 16 GB of RAM. As you can see in the table, we also tested output for the x264 codec in FFmpeg using the medium and veryfast presets, producing two and five 1080p30 outputs, respectively. For x265, we tested using the medium and ultrafast presets and the workstation produced one and three 1080p30 streams.
Even at the reduced throughput, Quadra’s CRF output dwarfs the CPU-only output. When you consider that the NETINT Quadra Video Server packs ten Quadra VPUs into a single 1RU form factor, you get a sense of how VPUs offer unparalleled density and the industry’s lowest cost per stream and power consumption per stream.
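The throughput figures above can be checked with simple arithmetic. The Python sketch below recomputes the reductions and compares Quadra with the CPU-only results using the stream counts stated in the text. The per-server estimate assumes that throughput scales linearly across ten VPUs, which is our assumption and not a measured result.

```python
def reduction_pct(before, after):
    """Percentage drop in stream count from 'before' to 'after'."""
    return 100 * (before - after) / before


h264_drop = reduction_pct(20, 16)   # CBR vs capped CRF, H.264
hevc_drop = reduction_pct(23, 18)   # CBR vs capped CRF, HEVC
print(f"H.264: {h264_drop:.1f}%  HEVC: {hevc_drop:.1f}%")

# Capped CRF streams on one Quadra vs the fastest CPU presets tested
x264_veryfast_ratio = 16 / 5
x265_ultrafast_ratio = 18 / 3
print(x264_veryfast_ratio, x265_ultrafast_ratio)

# ASSUMPTION: linear scaling across the ten VPUs in a 1RU server
h264_streams_per_server = 10 * 16
print(h264_streams_per_server)
```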
Bandwidth is one of the most significant costs for all live-streaming productions. In many applications, capped CRF with the NETINT Quadra delivers a real opportunity to reduce bandwidth cost with no perceived impact on viewer quality of experience.