Discover the unparalleled efficiency of Quadra VPUs for transcoding. Ideal for UGC platforms, Quadra offers significant savings in CAPEX and OPEX with top-notch quality.
Netflix delves into gaming, aiming to transform the cloud and mobile landscape. Experience enhanced entertainment offerings.
The decision behind free video encoding at api.video isn’t just a marketing gimmick. By developing their own infrastructure and utilizing state-of-the-art VPUs, the company has managed to slash encoding expenses by 99.33%. Yet, they’ve ensured that this cost-cutting doesn’t translate to reduced video quality, setting new standards for the video streaming world.
Learn why live-streaming platform Zapping built its own low-latency technology and CDN to stream Latin American content using NETINT Streaming Video Servers, helping to accelerate Zapping’s rapid expansion in the region. “Zapping is the Netflix of the live streaming here in Chile, in Latin America. We developed all our technology; the encoders, our low-latency, and the apps in each platform. We developed our own CDN…” Nacho Opazo, Zapping Co-founder and CTO.
FIGURE 1. Nacho Opazo, Zapping Co-founder and CTO, on a rare vacation away from the office.
Zapping is a live-streaming platform in Latin America that started in Chile and has since expanded into Brazil, Peru, and Costa Rica. Ignacio (Nacho) Opazo, the co-founder and CTO, has been the driving force behind the company’s technological innovations.
The verb zapping refers to the ability to switch content streams with minimal delay. Give him a minute and Nacho will gladly demonstrate their superior low latency in the hyper-responsive mobile app he designed and developed. He’s also responsible for Zapping’s content delivery network (CDN), custom low-latency technology, and user interfaces on smart TVs.
Zapping streams free channels available via terrestrial broadcast, as well as content from HBO, Paramount, Fox, TNT Sports, Globo, and many others. Though this includes a broad range of content types, from local news to daytime and primetime TV to premium movies, what really moves the needle in South America is sports, specifically soccer. It’s a competitive marketplace; in addition to terrestrial TV, other market entrants include DirecTV, Entel, and Movistar, along with free-to-air content in some markets.
Soccer coverage is a key driver for subscriptions and presented multiple challenges to Zapping, including latency, video quality, and bandwidth consumption. With aggressive expansion plans, Zapping also needed to focus on capital management and optimizing operating costs.
FIGURE 2. Innovative, feature-rich players and broad compatibility are key to Zapping’s outstanding customer experience.
The Challenges of Soccer Broadcasting
Latency is a critical issue for soccer coverage, and challenging because Zapping competes with services operating in different countries. As Nacho described, “Here in Chile, the soccer matches are premium. So you need to hire a cable operator, and you can hear your neighbor screaming if they have a cable operator with lower latency. Latency is one of the key questions we get asked about in social media. In Brazil, it is more complicated because some soccer matches are free to air. So, our latency has to be lower than free-to-air in Brazil. One potential solution here was to install a server with a low latency transcoder in the CDN of each soccer broadcaster to ensure that Zapping’s streams originate from as close to the original signal as possible.”
Zapping competed with these same services regarding quality, which is a key determinant of quality of experience (QoE). Soccer is incredibly fast-moving and presents a murderer’s row of compression challenges, from midfield shots of tiny players advancing and defending to finely detailed shots of undulating crowds and waving flags to close-ups of fouled players rolling in the grass. Zapping needed a transcoder to preserve detail and color accuracy without breaking the bandwidth bank.
Like latency, Zapping’s bandwidth problems vary by country. In all countries, soccer’s popularity stresses the internet in general. “Video files are huge, and when you have a soccer match, thousands of people come to your servers and saturate the region’s internet.”
Beyond general capacity, some countries have suboptimal infrastructures for high-bandwidth soccer matches, like low-speed inter-trunk connections. “In the beginning, we saw low bandwidth connections – like 10 Gbps trunks between ISPs, and we saturated that trunk with our service.” Problems like these convinced Zapping to create their own CDN to ensure high-speed delivery.
In Chile, Zapping found a different problem. “Here in Chile, we have a really good internet. We have a connection of one gigabyte to the users, one gigabyte per second, and fiber optic. But 80% of our viewers watch on Smart TVs that they don’t upgrade that often, and these devices don’t have good Wi-Fi connections. So, Wi-Fi is the problem in Chile.” While Zapping’s CDN was a huge help in avoiding bandwidth bottlenecks, the best general-purpose solution was to implement HEVC.
To summarize these requirements, Zapping needed a transcoding system affordable enough to install and operate in data centers around South America that delivered high-quality H.264 and HEVC output with exceptionally low latency.
From CPU to GPU to ASIC
Nacho considered all options to find the right transcoding system. “I started encoding with CPUs using Quick Sync from Intel, but my problem was getting more density for the rack unit. Intel enabled five sockets per a 1RU rack unit, which was really low. Though the video quality was good, the amount of power that you needed, and the amount of heat that you produced was really, really high.”
Nacho next tried NVIDIA GPUs, starting with the P2000 and moving to T4. Configured with an 80-core Intel CPU and two T4s, the NVIDIA-powered system could produce about 50 complete ladders per 1RU rack unit, an improvement, but still insufficient.
Then, Nacho learned about NETINT’s first-generation T408 technology. “I was looking to get more density with my servers and found a NETINT article that claimed that you could output 122 channels per rack unit.” Nacho ordered a unit and started testing. “I found that the power draw was really low, as was the latency, and the quality of both H.264 and HEVC is really good.”
Looking ahead, Nacho foresees the need for even more density. “Right now we’re trying the [second generation] NETINT Quadra processor. I need to get more dense. Brazil is a really big country. We need more power and more density in the rack.”
Nacho was sold on the hardware performance but had to integrate the NETINT transcoders into his encoding stack, which was a non-issue. “We control the encoders with FFmpeg, and converting over to the NETINT transcoders was really seamless for us. Really, really easy.”
Just as Nacho finalized his testing, NETINT started offering a server package that included ten T408s in a Supermicro server with all software pre-installed. These proved perfectly suited to Zapping’s technology and expansion plans.
According to Nacho, “The servers are really, really good. For us, buying the server is better because it’s ready to use. As we deploy our platform in Latin America, we send a server to each country. It’s as simple as sliding it into a rack, installing our software, and we’re ready to go. It’s really, really easy for us.”
Delivering Better Soccer Matches
FIGURE 3. Nacho will deploy the Quadra Video Server for the greatest density, lowest cost and latency, and highest quality H.264 and HEVC.
Armed with NETINT servers, Nacho proceeded to attack each of the challenges discussed above. “For the latency, we talk with the channel distributor and put a NETINT server inside the CDN of each broadcaster. And we can skip the satellite uplink and save one or two seconds of latency.”
Nacho originally implemented his own low-latency protocols but now is experimenting with low-latency HLS. “With LL HLS, we can get six seconds ahead from free to air. Let’s talk in about three months and see what that looks like.”
Nacho also implemented a “turbo mode” that toggles the viewer in and out of Zapping’s low-latency mode. Viewers prioritizing low latency can enable turbo mode at the risk of slightly lower quality and a greater likelihood of buffering issues. Viewers who prioritize video quality and minimal buffering over ultra-low latency can disable turbo mode. As Nacho explained, “If you have a bad connection, like bad Wi-Fi, you can turn off the low latency and watch the match in a 30-second buffer like the normal buffer of HLS.”
Nacho also aggressively converted to HEVC output. “For us, HEVC is really, really important. We get a 40% lower bit rate than H.264 with the same quality image. That’s full HD quality at 6 Mbps per second, which is really good compared to competitors using H.264 at 5 Mbps in full HD. And the user knows we’re delivering HEVC. We have that in our UX. The user can turn HEVC on and off and really see the difference. So it’s really, really important.”
Regarding the HEVC switch, Nacho explained, “If we know that your TV or device is HEVC compatible, we play HEVC by default. But there are so many set-top boxes, and some signal their codec compatibilities incorrectly. If we’re not sure, we turn off the HEVC by default, and the user can try it, and if it works, great; if not, they play H.264.”
After much experimentation, Nacho extended HEVC’s low-bitrate quality to other broadcasts as well. “For CNN or talk shows, we are trying a 600 kilobyte per second HEVC, and it looks really, really good, even on a big screen.”
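The default-codec rule Nacho describes can be sketched in a few lines. This is only an illustration of the logic in the quote, not Zapping's player code; the `HevcSupport` enum, the `choose_codec` function, and the idea of passing the viewer's toggle as an optional flag are all assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class HevcSupport(Enum):
    """Hypothetical result of a device capability check."""
    CONFIRMED = "confirmed"  # device is known to decode HEVC correctly
    UNKNOWN = "unknown"      # device did not signal, or its signal is not trusted


def choose_codec(support: HevcSupport, user_choice: Optional[bool] = None) -> str:
    """Return "hevc" or "h264" for a playback session.

    user_choice is the viewer's explicit HEVC switch (True or False), or None
    if the viewer has never touched it. An explicit choice always wins;
    otherwise HEVC is the default only when support is confirmed.
    """
    if user_choice is not None:
        return "hevc" if user_choice else "h264"
    return "hevc" if support is HevcSupport.CONFIRMED else "h264"


print(choose_codec(HevcSupport.CONFIRMED))       # known-good device
print(choose_codec(HevcSupport.UNKNOWN))         # unsure: fall back to H.264
print(choose_codec(HevcSupport.UNKNOWN, True))   # viewer opts in to try HEVC
```

Treating every uncertain device as H.264 by default keeps playback safe, while the toggle lets viewers on mislabeled devices still reach the lower-bitrate HEVC streams.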
The Live Streaming Netflix of Latin America
One of Zapping’s unique strengths is that it considers itself a technology company, along with being a content company. This aggressive approach has enabled Zapping to achieve significant success in Chile and to expand into Latin America.
As Nacho describes, “Zapping is the Netflix of the live streaming here in Chile, in Latin America. We developed all our technology; the encoders, our low-latency, and the apps in each platform. We developed our own CDN; I think it’s bigger than Akamai and Fastly here in Chile. We are taking the same steps as Netflix. That you make your platform, you make the UI, you make the encoding process and then you must deliver.”
Nacho is clear about how NETINT’s products have contributed to his success. “NETINT servers are an affordable, functional, and high-performant element of our success, providing unparalleled density along with excellent low-latency and H.264 and HEVC quality, all at extremely low power consumption. NETINT has helped accelerate our expansion while increasing our profitability.”
Innovative technologists like Nacho and Zapping choose and rely on equally innovative tools and building blocks to deliver critical functions and components of their services. We’re proud that Nacho has chosen NETINT servers as the technology of choice for expanding operations in Latin America, and look forward to a long and successful collaboration.
NETINT is proud to be included in the Streaming Media list of the Top 100 Companies in the Streaming Media Universe, which “set themselves apart from the crowd with their innovative approach and their contribution to the expansion and maturation of the streaming media universe.”
The list is compiled by members of Streaming Media Magazine’s inner circle and “foregrounds the industry’s most innovative and influential technology suppliers, service providers, platforms, and media and content companies, as acclaimed by our editorial team. Some are large and established industry standard-bearers, while others are comparably small and relatively new arrivals that are just beginning to make a splash.”
Commenting on the award, Alex Liu, NETINT CEO, said, “Over the last twelve months, video engineers have increasingly recognized the unique value that ASIC-based transcoders deliver to the live streaming, cloud gaming, and surveillance markets, including the lowest cost and power consumption per stream, and the highest density. Our entire company appreciates that insiders at Streaming Media share this assessment.”
Whether your business model is FAST or subscription-based premium content, your success depends upon your ability to deliver a high-quality viewing experience while relentlessly reducing costs. Transcoding is one of the most expensive production-related costs and the ultimate determinant of video quality, so obviously plays a huge role on both sides of this equation. This article identifies the most relevant metrics for ascertaining the true cost of transcoding and then uses these metrics to compare the relative cost of the available methods for live transcoding.
Economics of Transcoding: Cost Metrics
There are two potential cost categories associated with transcoding: capital costs and operating costs. Capital costs arise when you buy your own transcoding gear, while operating costs apply when you operate this equipment or use a cloud provider. Let’s discuss each in turn.
Economics of Transcoding: CAPEX
The simplest way to compare transcoders is to normalize capital and operating costs using the cost per stream or cost per ladder, which simplifies comparing disparate systems with different costs and throughput. The cost per stream applies to services inputting and delivering a single stream, while the cost per ladder applies to services inputting a single stream and outputting an encoding ladder.
We’ll present real-world comparisons once we introduce the available transcoding options, but for the purposes of this discussion, consider the simple example in Table 1. The top line shows that System B costs twice as much as System A, while line 2 shows that it also offers 250% of the capacity of System A. On a cost-per-stream basis, System B is actually cheaper.
TABLE 1: A simple cost-per-stream analysis.
The next few lines use this data to compute the number of required systems for each approach and the total CAPEX. Assuming that your service needs 640 simultaneous streams, the total CAPEX for System A dwarfs that of System B. Clearly, just because a particular system costs more than another doesn’t make it the more expensive option.
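To make the arithmetic concrete, here is a minimal sketch of the cost-per-stream and total-CAPEX calculation. The prices and stream counts are hypothetical placeholders chosen only to match the ratios described above (System B at twice the price and 250% of the capacity); they are not the figures from Table 1.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class System:
    name: str
    price_usd: int  # hypothetical purchase price per server
    streams: int    # hypothetical simultaneous streams per server

    @property
    def cost_per_stream(self) -> float:
        return self.price_usd / self.streams


def capex_for(system: System, required_streams: int) -> Tuple[int, int]:
    """Return (servers needed, total CAPEX), rounding servers up to whole units."""
    servers = math.ceil(required_streams / system.streams)
    return servers, servers * system.price_usd


system_a = System("A", price_usd=10_000, streams=40)
system_b = System("B", price_usd=20_000, streams=100)

for system in (system_a, system_b):
    servers, capex = capex_for(system, required_streams=640)
    print(f"System {system.name}: ${system.cost_per_stream:.0f}/stream, "
          f"{servers} servers, ${capex:,} total CAPEX")
```

With these placeholder numbers, System A needs 16 servers ($160,000) and System B needs 7 ($140,000), so the more expensive box is the cheaper deployment. Rounding up to whole servers matters: a requirement just past a multiple of a system's capacity adds an entire server.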
For the record, the throughput of a particular server is also referred to as density, and it obviously impacts OPEX charges. System B delivers over six times the streams from the same 1RU rack as System A, so is much more dense, which will directly impact both power consumption and storage charges.
Several factors complicate the otherwise simple analysis of cost per stream. First, you should analyze using the output codec or codecs, current and future. Many systems output H.264 quite competently but choke considerably with the much more complex HEVC codec. If AV1 may be in your future plans, you should prioritize a transcoder that outputs AV1 and compare cost per stream against all alternatives.
The second requirement is to use consistent output parameters. Some vendors quote throughput at 30 fps, some at 60 fps. Obviously, you need to use the same value for all transcoding options. As a rough rule of thumb, if a vendor quotes 60 fps, you can double the throughput for 30 fps, so a system that can output 8 1080p60 streams can likely output 16 1080p30 streams. Obviously, you should verify this before buying.
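The rule of thumb above amounts to assuming that throughput scales linearly with frame rate. A short helper makes that assumption explicit; the function name and the linear model are ours for illustration, and real hardware may not scale this cleanly.

```python
def normalize_streams(quoted_streams: int, quoted_fps: float, target_fps: float) -> int:
    """Estimate stream capacity at target_fps from a vendor figure at quoted_fps.

    Assumes capacity is proportional to total frames per second processed,
    which is only a rule of thumb and should be verified by testing.
    """
    if quoted_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Round down: a fraction of a stream is not usable capacity.
    return int(quoted_streams * quoted_fps // target_fps)


print(normalize_streams(8, quoted_fps=60, target_fps=30))   # 8 x 1080p60 -> 1080p30 estimate
print(normalize_streams(16, quoted_fps=30, target_fps=60))  # and back again
```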
If a vendor quotes in streams and you’re outputting encoding ladders, it’s more complicated. Encoding ladders involve scaling to lower resolutions for the lower-quality rungs. If the transcoder performs scaling on-board, throughput should be greater than systems that scale using the host CPU, and you can deploy a less capable (and less expensive) host system.
The last consideration involves the concept of “operating point,” or the encoding parameters that you would likely use for your production, and the throughput and quality at those parameters. To explain, most transcoders include encoding options that trade off quality vs throughput much like presets do for x264 and x265. Choosing the optimal setting for your transcoding hardware is often a balance of throughput and bandwidth costs. That is, if a particular setting saves 10% bandwidth, it might make economic sense to encode using that setting even if it drops throughput by 10% and raises your capital cost accordingly. So, you’d want to compute your throughput numbers and cost per stream at that operating point.
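The trade-off in that example can be checked with a back-of-the-envelope calculation. The sketch below compares the bandwidth saved over the planning period with the cost of the extra servers needed at the slower setting. Every input (server price, streams per server, annual bandwidth bill) is a hypothetical placeholder, and the model ignores power and housing for simplicity.

```python
import math


def servers_needed(required_streams: int, streams_per_server: float) -> int:
    return math.ceil(required_streams / streams_per_server)


def net_saving(required_streams: int, streams_per_server: int, server_price: float,
               annual_bandwidth_cost: float, bandwidth_saving: float,
               throughput_drop: float, years: int) -> float:
    """Bandwidth saved over `years` minus the extra CAPEX at the slower setting.

    bandwidth_saving and throughput_drop are fractions (0.10 means 10%).
    A positive result favors the slower, more efficient operating point.
    """
    base = servers_needed(required_streams, streams_per_server)
    slower = servers_needed(required_streams, streams_per_server * (1 - throughput_drop))
    extra_capex = (slower - base) * server_price
    bandwidth_saved = annual_bandwidth_cost * bandwidth_saving * years
    return bandwidth_saved - extra_capex


result = net_saving(required_streams=640, streams_per_server=100, server_price=20_000,
                    annual_bandwidth_cost=500_000, bandwidth_saving=0.10,
                    throughput_drop=0.10, years=5)
print(f"Net five-year saving at the slower setting: ${result:,.0f}")
```

With these placeholder inputs, one extra server buys back far more in bandwidth than it costs, which is why the operating point, not the spec-sheet maximum, should drive the cost-per-stream comparison.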
In addition, many transcoders produce lower throughput when operating in low latency mode. If you’re transcoding for low-latency productions, you should ascertain whether the quoted figures in the spec sheets are for normal or low latency.
For these reasons, completing a thorough comparison requires a two-step analysis. Use spec sheet numbers to identify transcoders that you’d like to consider and acquire them for further testing. Once you have them in your labs you can identify the operating point for all candidates, test at these settings, and compare them accordingly.
Economics of Transcoding: OPEX - Power
Now, let’s look at OPEX, which has two components: power and storage costs. Table 2 continues our example, looking at power consumption.
Unfortunately, ascertaining power consumption may be complicated if you’re buying individual transcoders rather than a complete system. That’s because while transcoding manufacturers often list the power consumption utilized by their devices, you can only run these devices in a complete system. Within the system, power consumption will vary by the number of units configured in the system and the specific functions performed by the transcoder.
Note that the most significant contributor to overall system power consumption is the CPU. Referring back to the previous section, a transcoder that scales onboard will require lower CPU contribution than a system that scales using the host CPU, reducing overall CPU consumption. Along the same lines, a system without a hardware transcoder uses the CPU for all functions, maxing out CPU utilization and likely consuming about the same energy as a system loaded with transcoders that collectively might consume 200 watts.
Again, the only way to achieve a full apples-to-apples comparison is to configure the server as you would for production and measure power consumption directly. Fortunately, as you can see in Table 2, stream throughput is a major determinant of overall power consumption. Even if you assume that systems A and B both consume the same power, System B’s throughput makes it much cheaper to operate over a five-year expected life, and much kinder to the environment.
TABLE 2. Computing the watts per stream of the two systems.
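As a rough sketch of the calculation behind a table like this, the snippet below computes watts per stream and five-year electricity cost. The wattage, stream counts, server counts, and price per kWh are hypothetical placeholders, not the values from Table 2, and both systems are assumed to draw the same power, as in the scenario above.

```python
HOURS_PER_YEAR = 24 * 365


def watts_per_stream(system_watts: float, streams: int) -> float:
    return system_watts / streams


def power_cost(system_watts: float, servers: int, usd_per_kwh: float, years: int) -> float:
    """Electricity cost for `servers` identical systems running 24/7 for `years`."""
    kwh = system_watts / 1000 * HOURS_PER_YEAR * years * servers
    return kwh * usd_per_kwh


# Hypothetical: both systems draw 500 W; A carries 40 streams, B carries 100.
for name, streams, servers in (("A", 40, 16), ("B", 100, 7)):
    wps = watts_per_stream(500, streams)
    cost = power_cost(500, servers, usd_per_kwh=0.15, years=5)
    print(f"System {name}: {wps:.1f} W/stream, ${cost:,.0f} over five years")
```

Because the denser system needs fewer servers for the same stream count, equal per-server power still produces a much smaller bill.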
Economics of Transcoding: Storage Costs
Once you purchase the systems, you’ll have to house them. While these costs are easiest to compute if you’re paying for a third-party co-location service, you’ll have to estimate costs even for in-house data centers. Table 3 continues the five-year cost estimates for our two systems, and the denser System B proves much cheaper to house as well as power.
TABLE 3: Computing the storage costs for the two systems.
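Housing costs follow the same pattern. The sketch below assumes a flat monthly co-location fee per rack unit, which is a simplification, since real contracts often bundle power and bandwidth; the fee and server counts are hypothetical, not the values from Table 3.

```python
def colocation_cost(servers: int, rack_units_per_server: int,
                    usd_per_ru_month: float, years: int) -> float:
    """Total housing cost, assuming a flat monthly fee per rack unit."""
    return servers * rack_units_per_server * usd_per_ru_month * 12 * years


# Hypothetical: 1RU servers at $100 per rack unit per month for five years.
cost_a = colocation_cost(servers=16, rack_units_per_server=1, usd_per_ru_month=100, years=5)
cost_b = colocation_cost(servers=7, rack_units_per_server=1, usd_per_ru_month=100, years=5)
print(f"System A: ${cost_a:,.0f}  System B: ${cost_b:,.0f}")
```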
Economics of Transcoding: Transcoding Options
These are the cost fundamentals. Now let’s explore them within the context of different encoding architectures.
There are three general transcoding options: CPU-only, GPU, and ASIC-based. There are also FPGA-based solutions, though these will probably be supplanted by cheaper-to-manufacture ASIC-based devices over time. Briefly:
- CPU-based transcoding, also called software-based transcoding, relies on the host central processing unit, or CPU, for all transcoding functions.
- GPU-based transcoding refers to Graphics Processing Units, which are developed primarily for graphics-related functions but may also transcode video. These are added to the server as add-in PCIe cards.
- ASICs are Application-Specific Integrated Circuits designed specifically for transcoding. These are added to the server as add-in PCIe cards or devices that conform to the U.2 form factor.
Economics of Transcoding: Real-World Comparison
NETINT manufactures ASIC-based transcoders and video processing units. Recently, we published a case study where a customer, Mayflower, rigorously and exhaustively compared these three alternatives, and we’ll share the results here.
By way of background, Mayflower’s use case needed to input 10,000 incoming simultaneous streams and distribute over a million outgoing simultaneous streams worldwide at a latency of one to two seconds. Mayflower hosts a worldwide service available 24/7/365.
Mayflower started with 80-core bare metal servers and tested CPU-based transcoding, then GPU-based transcoding, and then two generations of ASIC-based transcoding. Table 4 shows the net/net of their analysis, with NETINT’s Quadra T2 delivering the lowest cost per stream and the greatest density, which contributed to the lowest co-location and power costs.
RESULTS: COST AND POWER
TABLE 4. A real-world comparison of the cost per stream and OPEX associated with different transcoding techniques.
As you can see, the T2 delivered an 85% reduction in CAPEX with ~90% reductions in OPEX as compared to CPU-based transcoding. CAPEX savings as compared to the NVIDIA T4 GPU were about 57%, with OPEX savings around 70%.
Table 5 shows the five-year cost of the Mayflower T2-based solution using the cost per kWh in Cyprus of $0.335. As you can see, the total is $2,225,241, a number we’ll return to in a moment.
TABLE 5: Five-year cost of the Mayflower transcoding facility.
Just to close a loop, Tables 1, 2, and 3 compare the cost and performance of a Quadra Video Server equipped with ten Quadra T1U VPUs (Video Processing Units) with CPU-based transcoding on the same server platform. You can read more details on that comparison here.
Table 6 shows the total cost of both solutions. In terms of overall outlay, meeting the transcoding requirements with the Quadra-based System B costs 73% less than the CPU-based system. If that sounds like a significant savings, keep reading.
TABLE 6: Total cost of the CPU-based System A and Quadra T2-based System B.
Economics of Transcoding: Cloud Comparison
If you’re transcoding in the cloud, all of your costs are OPEX. With AWS, you have two alternatives: producing your streams with Elemental MediaLive or renting EC2 instances and running your own transcoding farm. We considered the MediaLive approach here, and it appears economically unviable for 24/7/365 operation.
Using Mayflower’s numbers, the CPU-only approach required 500 80-core Intel servers running 24/7. The closest CPU in the Amazon EC2 pricing calculator was the 64-core c6i.16xlarge, which, under the EC2 Instance Savings plan, with a 3-year commitment and no upfront payment, costs $1,125.84/month.
FIGURE 1. The annual cost of the Mayflower system if using AWS.
We used Amazon’s pricing calculator to roll these numbers out to 12 months and 500 simultaneous servers, and you see the annual result in Figure 1. Multiply this by five to get to the five-year cost of $33,775,056, which is 15 times the cost of the Quadra T2 solution, as shown in Table 5.
We ran the same calculation on the 13 systems required for the Quadra Video Server analysis shown in Tables 1-3, which was powered by a 32-core AMD CPU. Assuming a c6a.8xlarge CPU with a 3-year commitment and no upfront payment, this produced an annual charge of $79,042.95, or $395,214.75 for the five-year period, which is about 8 times more costly than the Quadra-based solution.
FIGURE 2: The annual cost of an AWS system per the example schema presented in tables 1-3.
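You can sanity-check the Mayflower cloud figure with simple multiplication. The sketch below uses only numbers quoted above (the monthly instance price, the 500-server count, and the $2,225,241 five-year cost from Table 5); the function name is ours. Straight multiplication lands within a couple of hundred dollars of the calculator's $33,775,056, a gap we assume comes from rounding inside Amazon's calculator.

```python
def five_year_cloud_cost(monthly_per_instance: float, instances: int, years: int = 5) -> float:
    """Total rental cost for `instances` running continuously for `years`."""
    return monthly_per_instance * 12 * instances * years


mayflower_cloud = five_year_cloud_cost(1125.84, instances=500)
quadra_five_year = 2_225_241  # five-year total from Table 5
ratio = mayflower_cloud / quadra_five_year

print(f"Five-year AWS cost: ${mayflower_cloud:,.0f}")
print(f"Multiple of the Quadra T2 solution: {ratio:.1f}x")

# The second example starts from an annual figure rather than a monthly one.
quadra_server_cloud = 79_042.95 * 5
print(f"Five-year AWS cost for the 13-system example: ${quadra_server_cloud:,.2f}")
```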
Cloud services are an effective means for getting services up and running, but are vastly more expensive than building your own encoding infrastructure. Service providers looking to achieve or enhance profitability and competitiveness should strongly consider building their own transcoding systems. As we’ve shown, building a system based on ASICs will be the least expensive option.
In August, NETINT held a symposium on Building Your Own Live Streaming Cloud. The on-demand version is available for any video engineer seeking guidance on which encoder architecture to acquire, the available software options for transcoding, where to install and run your encoding servers, and progress made on minimizing power consumption and your carbon footprint.
Stef van der Ziel, our keynote speaker, has been in the streaming industry since 1994, and as founder of Jet-Stream, oversaw the development of Jet-Stream Cloud, a European-based streaming platform. He discussed the challenges associated with creating your own encoding infrastructure, how to choose the best transcoding technology, and the cost savings available when you build your own platform.
Stef started by recounting the evolution and significance of transcoding in the streaming industry. To help set the stage, he described the streaming process, starting with a feed from a source like a camera. This feed is encoded and then transcoded into various qualities. This is followed by origin creation, packaging, and, finally, delivery via a CDN.
Stef emphasized the distinction between encoding and transcoding, noting that the latter is mission-critical too. If errors occur during transcoding, the entire stream can fail, leading to poor quality or buffering issues for viewers.
He then related that quality and viewer experience are paramount for transcoding services, regardless of whether they are cloud-based or on-premises. However, cost management is equally crucial.
Beyond the direct costs of transcoding, incorrect settings can lead to increased bandwidth and storage costs. Stef noted the often-overlooked human operational costs associated with managing a streaming platform, especially in the realm of transcoding. Expertise is essential, necessitating either an in-house team or hiring external experts.
Stef observed that while traffic prices have decreased significantly over the years, transcoding costs have remained relatively high. However, he noted a current trend of decreasing transcoding costs, which he finds exciting.
Lastly, in line with the theme of sustainable streaming, Stef emphasized the importance of green practices at every step of the streaming process. He mentioned that Jet-Stream has practiced green streaming since 2004 and that the intense computational demands of transcoding and analytics make them resistant to green practices.
CHOOSING TRANSCODING OPTIONS
In discussing transcoding options, Stef related that CPU-based encoding can deliver very good quality, but that it’s costly in terms of CPU and energy usage. He noted that GPU-based encoding delivers lower quality than CPU-based encoding and is less cost- and power-efficient than ASICs.
FIGURE 1. Stef found CPU and ASIC-based transcoding quality superior to GPU-based transcoding.
The real game-changer, according to Stef, is ASIC-based encoding. ASICs not only offer superior quality but also minimal latency, a crucial factor for specific low-latency use cases.
Compared to software transcoding, ASICs are also much more power efficient. For instance, while CPU-based transcoding could consume anywhere from 2,800 to 9,000 watts for transcoding 80 OTT channels to HD, ASIC-based hardware transcoding required only 308 watts for the same task. This translates to an energy saving of at least 89%.
Beyond energy efficiency, ASICs also shine in terms of scalability. Stef explained that the power constraints of CPU encoding might limit the capacity of a single rack to 200 full HD channels. In contrast, a rack populated with ASIC-based transcoders could handle up to 2,400 channels concurrently. This capability means increased density, optimized use of rack space, and overall heightened efficiency.
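The percentages behind these claims are easy to reproduce. This short sketch uses only the figures Stef quoted (2,800 to 9,000 watts for CPU, 308 watts for ASIC, 200 versus 2,400 channels per rack); the helper name is ours.

```python
def saving_fraction(baseline: float, improved: float) -> float:
    """Fraction saved when moving from `baseline` to `improved`."""
    return 1 - improved / baseline


low = saving_fraction(2_800, 308)   # against the most efficient CPU figure
high = saving_fraction(9_000, 308)  # against the least efficient CPU figure
density_gain = 2_400 / 200          # channels per rack, ASIC vs CPU

print(f"Energy saving: {low:.0%} to {high:.1%}")
print(f"Rack density: {density_gain:.0f}x")
```

The 89% figure is the floor, computed against the best-case CPU draw; against the 9,000-watt case the saving is closer to 97%, and rack density improves twelvefold.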
Not surprisingly, given these insights, Stef positioned ASIC-based transcoding as a clear frontrunner over CPU- and GPU-based encoding methods.
OTHER FEATURES TO CONSIDER
Once you’ve chosen your transcoding technology, and implemented basic transcoding functions, you need to consider additional features for your encoding facility. Drawing from his experience with Jet-Stream’s own products and services, Stef identified some to consider.
- Containerize operation in Kubernetes containers so any crash, however infrequent, is self-contained and easily replaceable, often without viewers’ noticing.
- Stack multiple machines to build a microcloud and implement automatic scaling and pooling.
- Combine multiple technologies like decoding, filtering, origin, and edge serving, into a single server. That way, a single server can provide a complete solution in many different scenarios.
BEYOND THE BASICS
Beyond these basics, Stef also explained the need to add a flexible and capable interface to your system and to add new features continually, as Jet-Stream does. For example, you may want to burn in a logo or add multi-language audio to your stream, particularly in Europe. You may want or need to support subtitles and offer speech-to-text transcription.
If you’re supporting multiple channels with varying complexity, you may need different encoding profiles tuned for each content type. Another option might be capped CRF encoding to minimize bandwidth costs, which is now standard on all NETINT VPUs and transcoders. On the distribution side, you may need your system to support multiple CDNs for optimized distribution in different geographic regions and auto-failover.
Finally, as your service grows, you’ll need interfaces for health and performance status. Some of the performance indicators that Jet-Stream systems track include bandwidth per stream, viewers per stream, total bandwidth, and many others.
The key point is that you should start with a complete list of necessary features for your system and estimate the development and implementation costs for each. Knowledge of sophisticated products and services like those offered by Jet-Stream will help you understand what’s essential. But you really need a clear-eyed view of the development cost and time before you undertake creating your own encoding infrastructure.
COST AND ENERGY SAVINGS
Fortunately, it’s clear that building your own system can be a huge cost saver. According to Stef, on AWS, a typical full HD channel would cost roughly 2,400 euros per month. By creating his own encoding infrastructure, Jet-Stream reduced this down to 750 euros per month.
FIGURE 2. Running your own system can deliver significant savings over AWS.
Obviously, the savings scale as you grow, so “if you do this times 12 months, times five years, times 80 channels, you’re saving almost 8 million euros.” If you run the same math on energy consumption, you’ll save 22,000 euros on energy costs alone.
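Stef's multiplication is straightforward to verify using the two monthly prices he quoted. The sketch below uses only those figures; the function name is ours.

```python
def total_saving(cloud_per_month: int, own_per_month: int,
                 channels: int, years: int) -> int:
    """Savings in euros from running channels on your own infrastructure."""
    return (cloud_per_month - own_per_month) * 12 * years * channels


saving = total_saving(cloud_per_month=2_400, own_per_month=750, channels=80, years=5)
print(f"Five-year saving for 80 channels: {saving:,} euros")
```

This works out to 7,920,000 euros, which matches the "almost 8 million" in the quote.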
By running the transcoding setup on-premises, the cost savings can even be doubled. On-premises is a popular choice to bring more control over core streaming processes back in house.
Overall, Stef’s keynote effectively communicated that while creating your own encoding infrastructure will involve significant planning and development time and cost, the financial reward can be very substantial.
The goal of our recent Build Your Live Streaming Cloud symposium was to help live video engineers learn how to build and house their own transcoding infrastructure while minimizing power consumption and carbon footprint. Accordingly, we invited Barbara Lange from the Greening of Streaming to speak at the symposium. This article relates the key points of her talk, particularly describing the short-term goals of the Low Energy Sustainable Streaming (LESS) Accord.
By way of background, Barbara is a Volunteer Secretariat for the Greening of Streaming and the principal and CEO of Kibo121, a consultancy dedicated to guiding the media tech sector towards sustainability. Barbara described the Greening of Streaming as a member organization formed roughly two years ago. Its primary focus is on the end-to-end energy efficiency of the technical supply chain that supports streaming services.
The organization has an international membership and is dedicated to addressing the energy implications of the streaming sector. Their mission is to provide the global internet streaming industry with a platform to enhance engineering practices and promote collaboration throughout the supply chain. One core belief is that as streaming increases in scope, understanding the true energy costs, backed by real-world data, is paramount. Barbara mentioned that the organization’s monthly membership meetings are now open to the public, with the next meeting scheduled for October 11 at 11:00 Eastern.
Barbara then described the organization’s structure, highlighting its nine current working groups, which focus on diverse pursuits like defining terminology, organizing industry outreach, and identifying best practices. One notable initiative was the measurement of energy consumption during an English Premier League soccer match. The organization also explores power consumption in audio streaming, compression/decompression, and the standardization of energy data.
A newly formed group is dedicated to understanding the energy costs associated with end-user devices. Barbara emphasized the importance of collaboration with academic and other industry groups to avoid duplication of effort and to ensure consistent and effective communication across the industry.
With this as background, Barbara focused on the LESS Accord. She began by addressing a common misconception, which is that contrary to some media reports, there’s almost no direct correlation between internet traffic, measured in gigabytes, and energy consumption, measured in kilowatt-hours. This realization emerged from discussions within Working Group Six, which is responsible for examining compression-related issues. This group initiated the LESS Accord.
The LESS Accord’s mission statement is to define best practices for employing compression technologies in streaming video workflows. The goal is to optimize energy efficiency while ensuring a consistently high-quality viewing experience for users. These guidelines target energy reduction throughout the entire streaming process, from the initial encoding for distribution to the decoding and display on consumer devices for all video delivery services.
As Barbara reported, over the past six months, the group has actively engaged with industry professionals, engineers, and experts. They’ve sought insights and suggestions on how to enhance energy efficiency across all workflow and system stages. The essence of the Accord is to foster a collaborative environment where various, sometimes contrasting, initiatives from recent years can be harmonized.
The ultimate goal is to refine testing objectives and pinpoint organizations that can form project groups. Barbara detailed the first of four projects designated in the LESS Accord’s mission statement.
PROJECT ONE: INTELLIGENT DISTRIBUTION MODEL SHIFTING
Project one involves determining the most energy-efficient distribution model at any given time and enabling content delivery networks (CDNs) to seamlessly transition between these models. The three distribution models to be considered are:
- Unicast: The dominant model in today’s internet streaming.
- Peer-to-peer: Typically used for video on demand distribution.
- Net layer multicast: Often deployed for IPTV.
While each model has traditionally served a specific purpose, the group believes that all three could be viable options in various contexts. The hypothesis is that if these models can be provisioned almost spontaneously, there should be an underlying heuristic that facilitates the shift from one model to another. If energy efficiency is the primary concern, this shift could allow the CDN to meet that objective.
The main goal of this project is to design a workflow that incorporates energy measurements for the involved systems. The aim is to discern when an operator should transition from one model to another, with energy consumption of the entire system being the primary driver, without compromising the end user’s experience.
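To make the idea of an energy-driven switching heuristic concrete, here is a minimal Python sketch. It is not the LESS Accord's method, which has yet to be defined. The `ModelEstimate` type, the `switch_margin` threshold, and all energy figures are hypothetical and assume that an operator can already estimate whole-system energy per model.

```python
from dataclasses import dataclass


@dataclass
class ModelEstimate:
    name: str          # "unicast", "peer-to-peer", or "multicast"
    energy_kwh: float  # hypothetical whole-system estimate for the next interval
    meets_qoe: bool    # whether the model can preserve the viewer experience


def pick_distribution_model(estimates, current, switch_margin=0.05):
    """Return the name of the model to use for the next interval.

    Only models that preserve the end user's experience are eligible.
    The switch happens only when another model saves more than
    `switch_margin` (an assumed fraction) of the current model's energy,
    so the CDN does not flap between models on small differences.
    """
    eligible = [e for e in estimates if e.meets_qoe]
    if not eligible:
        return current
    best = min(eligible, key=lambda e: e.energy_kwh)
    current_est = next((e for e in eligible if e.name == current), None)
    if current_est is None:
        # The current model no longer meets the quality bar.
        return best.name
    if best.energy_kwh < current_est.energy_kwh * (1 - switch_margin):
        return best.name
    return current


# Placeholder numbers for illustration only.
estimates = [
    ModelEstimate("unicast", 12.0, True),
    ModelEstimate("peer-to-peer", 9.5, False),  # cheapest, but fails the QoE check
    ModelEstimate("multicast", 10.0, True),
]
print(pick_distribution_model(estimates, current="unicast"))
```

With these placeholder values the sketch selects multicast. Peer-to-peer is excluded despite its lower energy estimate because it does not meet the experience constraint.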
PROJECT TWO: THE "GOOD ENOUGH" CONCEPT
Barbara then described the second project, which involves potential energy savings through codec choices and optimization. The central question is whether energy can be conserved by allowing consumers to opt for a streaming experience that prioritizes energy efficiency.
The concept suggests introducing a “green button” on streaming media player devices or applications. By pressing this button, users would choose an experience optimized for energy conservation. Drawing a parallel, Barbara mentioned that many televisions come equipped with an “ECO” mode, which many users tend to disable or overlook. Project two will explore whether consumers might be more inclined to select the energy-efficient option if the energy consumption differences between modes were better communicated.
Taking the idea further, this project will explore consumer behavior if the devices defaulted to this ECO or green mode, and users had the choice to upgrade to a “gold mode” for a potentially enhanced quality. Or, if the default setting prioritized energy efficiency, would this lead to a more energy-conserving streaming system?
The project aims to explore these questions, especially considering that many users currently avoid ECO modes, possibly due to concerns about service quality. In short, this project seeks to understand user behavior and preferences in the context of energy-efficient streaming.
PROJECT THREE: ENERGY MEASUREMENT THROUGHOUT WORKFLOWS
Barbara then described the third project, which she acknowledged as particularly intricate. The central challenge is to measure energy consumption at every stage of the streaming workflow. This initiative originated from Working Group Four, which has been exploring methods to monitor and probe systems to determine the energy costs associated with each step of the process.
The overarching question is: how much energy is required to deliver a stream to the consumer? While answering this question would be invaluable for economic, marketing, and feedback purposes, it’s a complex endeavor.
The proposed approach involves tracking energy consumption from start to finish in the streaming process. When a video file is created on a computer and encoding begins, an energy reading in kilowatt-hours could be taken. This process would be repeated at each subsequent production, delivery, and playback stage. The idea is to tag the video file with “energy breadcrumbs” or metadata that gets updated as the file progresses through the workflow. By the end, these breadcrumbs would provide a comprehensive view of the energy costs associated with the entire streaming process.
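One way to picture the "energy breadcrumbs" idea is as a small metadata record that travels with the asset and gains an entry at each stage. The Python sketch below is purely illustrative. The class names, stage names, and kilowatt-hour readings are assumptions, and no metadata format has been defined by the project.

```python
from dataclasses import dataclass, field


@dataclass
class EnergyBreadcrumb:
    stage: str  # e.g. "encode", "package", "deliver", "playback"
    kwh: float  # energy attributed to this stage for this asset


@dataclass
class AssetEnergyTrail:
    asset_id: str
    breadcrumbs: list = field(default_factory=list)

    def record(self, stage, kwh):
        """Append one reading as the asset passes through a stage."""
        if kwh < 0:
            raise ValueError("energy reading must be non-negative")
        self.breadcrumbs.append(EnergyBreadcrumb(stage, kwh))

    def total_kwh(self):
        """Energy accumulated across every recorded stage."""
        return sum(b.kwh for b in self.breadcrumbs)

    def by_stage(self):
        """Total energy per stage, in case a stage reports more than once."""
        totals = {}
        for b in self.breadcrumbs:
            totals[b.stage] = totals.get(b.stage, 0.0) + b.kwh
        return totals


# Placeholder readings for illustration only.
trail = AssetEnergyTrail("example-asset")
trail.record("encode", 0.5)
trail.record("deliver", 0.25)
trail.record("playback", 0.125)
print(trail.total_kwh(), trail.by_stage())
```

The hard part the project faces is obtaining trustworthy readings at each stage. The record-keeping itself, as the sketch suggests, is comparatively simple.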
Barbara emphasized the ambitious nature of this project, noting that while it’s uncertain if they can fully realize this vision, they are committed to exploring it. She believes that this project, if successful, could have the most significant impact in terms of understanding energy consumption in the streaming sector.
PROJECT FOUR: TRANSITIONING WORKFLOWS FOR ENERGY EFFICIENCY
Barbara introduced the fourth project, which will explore how to adapt various technologies to transition existing workflows to hardware environments that are more energy-efficient. Some initial areas of exploration include:
- Optimization between different silicon environments: Examining how different hardware platforms can be more energy-efficient.
- Immersion cooling: Comparing traditional air cooling systems with alternative cooling methods in streaming environments. This includes processes like encoding, packaging, caching, and even playback in consumer electronics.
- Deploying tasks to renewable energy infrastructures: Specifically, relocating non-time-sensitive encoding tasks to infrastructures powered by surplus renewable energy. An exciting development in this area is the interest shown by the Scottish Enterprise which aims to test the relocation of non-critical transcoding workloads to a wind-powered facility in Scotland.
Barbara emphasized that all these projects were established during a Greening of Streaming event in June and are currently in progress. She invited interested parties to join these projects and announced a member meeting, which was held on September 13. The next one is scheduled for October 11.
Additionally, at IBC in September, the Greening of Streaming plans to present these projects to a broader audience, kick off the work in the fourth quarter, and continue into the next year. By the NAB event in April 2024, the organization hopes to discuss the projects in-depth and share test results.
Barbara Lange - Empowering a Greener Tomorrow: The LESS Accord and its Energy Savings Drive
For those with limited time, here’s what you need to know: Capped CRF delivers higher quality video during hard-to-encode regions than CBR, similar quality during all other scenes, and improved quality of experience at the same cost or lower than CBR. NETINT VPUs are the first hardware video encoders to adopt Capped CRF across the three most popular codecs in use today, AV1, HEVC, and H.264.
CAPPED CRF OVERVIEW
Briefly, capped CRF is a smart bitrate control technique that combines the benefits of CRF encoding with a bitrate cap. Unlike variable bitrate encoding (VBR) and constant bitrate encoding (CBR), which target specific bitrates, capped CRF targets a specific quality level, which is controlled by the CRF value. You also set a bitrate cap, which is applied if the encoder can’t meet the quality level below the bitrate cap.
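For readers who want to try the technique in software, the sketch below builds an FFmpeg command line that uses the libx264 encoder as a stand-in. It does not show NETINT's encoder parameters, which are not covered here. With libx264, `-crf` sets the quality target and `-maxrate` with `-bufsize` applies the cap. The file names, the default values, and the choice of a buffer twice the cap are assumptions.

```python
import shlex


def capped_crf_command(src, dst, crf=23, cap_kbps=4500):
    """Build an FFmpeg command line for capped CRF encoding with libx264."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-crf", str(crf),                # quality target
        "-maxrate", f"{cap_kbps}k",      # bitrate cap
        "-bufsize", f"{2 * cap_kbps}k",  # VBV buffer; 2x the cap is an assumed choice
        "-c:a", "copy",                  # pass audio through untouched
        dst,
    ]


# Hypothetical file names; the command is only printed, not executed.
cmd = capped_crf_command("input.mp4", "output.mp4")
print(shlex.join(cmd))
```

To run the encode, pass the list to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed.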
On easy-to-encode videos, the CRF value sets the quality level, which it can usually achieve below the bitrate cap. In these cases, capped CRF typically delivers bitrate savings over CBR-encoded footage while delivering similar quality. For harder-to-encode footage, the bitrate cap usually controls, and capped CRF delivers close to the same quality and bitrate as CBR.
The value proposition is clear: lower bitrates and good quality during easy scenes, and similar to CBR in bitrate and quality for harder scenes. I’m not addressing VBR because NETINT’s focus is live streaming, where CBR usage dominates. If you’re analyzing capped CRF for VOD, you would compare against 2-pass VBR as well as potentially CBR.
One last detail. CRF values have an inverse relationship to quality and bitrate; the higher the CRF value, the lower the quality and bitrate. In general, video engineers select a CRF value that delivers their target quality level. For premium content, you might target an average VMAF score of 95. For user-generated content or training videos, you might target 93 or even lower. As you’ll see, the lower the quality score, the greater the bandwidth savings.
We show 1080p results in Table 1, which is divided between easy-to-encode and hard-to-encode content. We encoded the CBR clips to 4.5 Mbps and applied the same cap for capped CRF encoding.
Table 1. 1080p results using Quadra VPU and capped CRF encoding.
You see that in CBR mode, Quadra VPUs do not reach the target rate as accurately as when using capped CRF mode. This won’t degrade viewer quality of experience since the VMAF scores exceed 95, so undershooting the target simply saves bandwidth with no visual quality detriment.
In this comparison, bitrate savings are minimal, particularly at CRF 19 and 21, because the capped CRF clips in the hard-to-encode content have a higher bitrate than their CBR counterparts (4,419 and 4,092 to 3,889). Not surprisingly, CRF 19 and 21 deliver little bandwidth savings and slightly higher quality than CBR.
At CRF 23, things get interesting, with an overall bandwidth savings of 16.1% with a negligible quality delta from CBR. With a VMAF score of around 95, CRF 23 might be the target for engineers delivering premium content. Engineers targeting slightly lower quality can choose CRF 27 and achieve a bitrate savings of 43%, and an efficient 2.4 Mbps bit rate for hard-to-encode footage. At CRF 27, Quadra VPUs encoded the hard-to-encode Football clip at 3,999 kbps with an impressive VMAF score of 93.39.
Note that as with H.264 and HEVC, AV1 capped CRF does reduce throughput. Specifically, a single Quadra VPU installed in a 32-core workstation outputs 23 simultaneous streams using CBR encoding. This dropped to 18 for capped CRF, a reduction of 22%.
Many engineers encoding with AV1 are delivering UHD content, so we ran similar tests with the Quadra and 4K30 8-bit content with a CBR target and bitrate cap of 16 Mbps. We used four clips, ranging from a 4K version of the high-motion Football clip to much less dynamic content like Netflix’s Meridian clip and Blender Foundation’s Sintel.
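The 22% figure follows directly from the two stream counts, as this short Python check shows. The variable names are illustrative.

```python
cbr_streams = 23         # simultaneous streams with CBR encoding
capped_crf_streams = 18  # simultaneous streams with capped CRF

# Relative drop in throughput when switching from CBR to capped CRF.
reduction = (cbr_streams - capped_crf_streams) / cbr_streams
print(f"Throughput reduction: {reduction:.1%}")
```

The exact value is about 21.7%, which rounds to the 22% quoted above.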
Table 2. 4K results for the Quadra VPU and capped CRF encoding.
In CBR mode, the Quadra VPU hit the bitrate target much more accurately at 4K than at 1080p, so even at CRF 19, the VPU delivered a 13% bitrate savings with a VMAF score of 96.23. Again, CRF 23 delivered a VMAF score very close to 95, with 45% savings over CBR. Impressively, at CRF 23, Quadra delivered an overall VMAF score of 94.87 for these 4K clips at 7.78 Mbps, and that’s with the Football clip weighing in at 14.3 Mbps.
Of course, these savings directly relate to the cap and CBR target. It’s certainly fair to argue that 16 Mbps is excessive for 4K AV1-encoded content, though Apple recommends 16.8 Mbps for 8-bit 4K content with HEVC.
The point is, when you encode with CBR, you’re limiting quality to control bandwidth costs. With capped CRF, you can set the cap higher than your CBR target, knowing that all content contains easy-to-encode regions that will balance out the impact of the higher cap and deliver similar or lower bandwidth costs. With these comparative settings, capped CRF delivers higher quality video during hard-to-encode regions than CBR, similar quality during all other scenes, and improved quality of experience at the same cost or lower than CBR.
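A toy model illustrates why a cap set above the CBR target can still lower average bandwidth. In this Python sketch, each scene consumes whatever bitrate the CRF target requires, up to the cap. The per-scene demands, the equal scene lengths, and the 6,000 kbps cap are hypothetical values chosen for illustration, not measurements.

```python
def average_bitrate_kbps(scene_demands_kbps, cap_kbps):
    """Average bitrate when each scene uses what the CRF target needs, up to the cap."""
    delivered = [min(demand, cap_kbps) for demand in scene_demands_kbps]
    return sum(delivered) / len(delivered)


# Hypothetical bitrates (kbps) needed to reach the CRF quality target,
# for three easy scenes and one hard scene of equal length.
scene_demands = [1500, 2000, 2500, 7000]

cbr_target = 4500  # CBR spends roughly this on every scene
higher_cap = 6000  # capped CRF cap set above the CBR target

capped_avg = average_bitrate_kbps(scene_demands, higher_cap)
print(f"CBR average: {cbr_target} kbps")
print(f"Capped CRF average with a {higher_cap} kbps cap: {capped_avg:.0f} kbps")
```

In this example the hard scene gets 6,000 kbps rather than 4,500 kbps, yet the average falls to 3,000 kbps because the easy scenes need far less than the CBR target.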