Understanding the Economics of Transcoding

Whether your business model is FAST (free ad-supported streaming TV) or subscription-based premium content, your success depends upon your ability to deliver a high-quality viewing experience while relentlessly reducing costs. Transcoding is one of the most expensive production-related costs and the ultimate determinant of video quality, so it plays a huge role on both sides of this equation. This article identifies the most relevant metrics for ascertaining the true cost of transcoding and then uses these metrics to compare the relative cost of the available methods for live transcoding.

Economics of Transcoding: Cost Metrics

There are two potential cost categories associated with transcoding: capital costs and operating costs. Capital costs arise when you buy your own transcoding gear, while operating costs apply when you operate this equipment or use a cloud provider. Let’s discuss each in turn.

Economics of Transcoding: CAPEX

The simplest way to compare transcoders is to normalize capital and operating costs as the cost per stream or cost per ladder, which makes it easy to compare disparate systems with different prices and throughput. The cost per stream applies to services inputting and delivering a single stream, while the cost per ladder applies to services inputting a single stream and outputting a full encoding ladder.

We’ll present real-world comparisons once we introduce the available transcoding options, but for the purposes of this discussion, consider the simple example in Table 1. The top line shows that System B costs twice as much as System A, while line 2 shows that it also offers 250% of the capacity of System A. On a cost-per-stream basis, System B is actually cheaper.

TABLE 1: A simple cost-per-stream analysis.

The next few lines use this data to compute the number of required systems for each approach and the total CAPEX. Assuming that your service needs 640 simultaneous streams, the total CAPEX for System A dwarfs that of System B. Clearly, just because a particular system costs more than another doesn’t make it the more expensive option.
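
To make the arithmetic concrete, here is a minimal sketch of the cost-per-stream and total-CAPEX calculation described above. The prices and stream counts are hypothetical placeholders, not the actual Table 1 figures; plug in your own quotes and measured throughput.

```python
# Hypothetical prices and throughput, for illustration only -- not the actual
# Table 1 figures. Substitute your own quotes and measured stream counts.
from math import ceil

required_streams = 640  # simultaneous streams the service must deliver

systems = {
    # name: (price per server in USD, simultaneous streams per server)
    "System A": (10_000, 8),
    "System B": (20_000, 20),  # costs 2x, delivers 2.5x the streams
}

for name, (price, streams_per_server) in systems.items():
    cost_per_stream = price / streams_per_server
    servers_needed = ceil(required_streams / streams_per_server)
    total_capex = servers_needed * price
    print(f"{name}: ${cost_per_stream:,.0f}/stream, {servers_needed} servers, "
          f"total CAPEX ${total_capex:,}")
```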

For the record, the throughput of a particular server is also referred to as density, and it obviously impacts OPEX charges. System B delivers over six times the streams from the same 1RU of rack space as System A, so it is much denser, which directly impacts both power consumption and storage (rack space) charges.

Details Matter

Several factors complicate the otherwise simple analysis of cost per stream. First, you should run the analysis using the output codec or codecs you plan to deploy, both current and future. Many systems output H.264 quite competently but struggle considerably with the much more complex HEVC codec. If AV1 is in your future plans, you should prioritize a transcoder that outputs AV1 and compare cost per stream across all alternatives.

The second requirement is to use consistent output parameters. Some vendors quote throughput at 30 fps, some at 60 fps, so you need to use the same value for all transcoding options. As a rough rule of thumb, if a vendor quotes 60 fps, you can double the throughput for 30 fps, so a system that can output eight 1080p60 streams can likely output sixteen 1080p30 streams. Obviously, you should verify this before buying.

If a vendor quotes in streams and you're outputting encoding ladders, it's more complicated. Encoding ladders involve scaling to lower resolutions for the lower-quality rungs. If the transcoder performs scaling on-board, throughput should be greater than that of systems that scale using the host CPU, and you can deploy a less capable (and less expensive) host system.

The last consideration involves the concept of “operating point,” or the encoding parameters that you would likely use for your production, and the throughput and quality at those parameters. To explain, most transcoders include encoding options that trade off quality vs throughput much like presets do for x264 and x265. Choosing the optimal setting for your transcoding hardware is often a balance of throughput and bandwidth costs. That is, if a particular setting saves 10% bandwidth, it might make economic sense to encode using that setting even if it drops throughput by 10% and raises your capital cost accordingly. So, you’d want to compute your throughput numbers and cost per stream at that operating point.
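
One way to reason about that trade-off is to compare the delivery dollars a slower, higher-quality operating point saves against the extra CAPEX it causes by lowering throughput. The sketch below uses made-up traffic, CDN-rate, and CAPEX assumptions purely to show the shape of the calculation.

```python
# Illustrative break-even check for a higher-quality (slower) operating point.
# Every input below is an assumption; replace with your own traffic, CDN rates,
# and measured throughput penalty.

monthly_egress_tb = 5_000      # delivered traffic per month, in TB
cdn_cost_per_tb = 3.00         # delivery cost, USD per TB
bandwidth_savings = 0.10       # bitrate reduction from the slower operating point
throughput_penalty = 0.10      # fraction of per-server throughput lost
baseline_capex = 500_000       # transcoding CAPEX at the faster operating point, USD
service_life_months = 60       # five-year life

# Delivery dollars saved over the service life by the lower bitrate
bandwidth_saved = (monthly_egress_tb * cdn_cost_per_tb *
                   bandwidth_savings * service_life_months)

# Extra servers (and CAPEX) needed to recover the lost throughput
extra_capex = baseline_capex * (1 / (1 - throughput_penalty) - 1)

print(f"Bandwidth saved over 5 years:      ${bandwidth_saved:,.0f}")
print(f"Extra CAPEX from lower throughput: ${extra_capex:,.0f}")
print("Slower operating point pays off" if bandwidth_saved > extra_capex
      else "Faster operating point pays off")
```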

In addition, many transcoders produce lower throughput when operating in low latency mode. If you’re transcoding for low-latency productions, you should ascertain whether the quoted figures in the spec sheets are for normal or low latency.

For these reasons, completing a thorough comparison requires a two-step analysis. Use spec sheet numbers to identify transcoders that you’d like to consider and acquire them for further testing. Once you have them in your labs you can identify the operating point for all candidates, test at these settings, and compare them accordingly.

Economics of Transcoding: OPEX - Power

Now, let’s look at OPEX, which has two components: power and storage costs. Table 2 continues our example, looking at power consumption.

Unfortunately, ascertaining power consumption may be complicated if you're buying individual transcoders rather than a complete system. That's because while transcoder manufacturers often list the power consumed by their devices, you can only run these devices in a complete system. Within the system, power consumption will vary with the number of units configured and the specific functions performed by the transcoder.

Note that the most significant contributor to overall system power consumption is the CPU. Referring back to the previous section, a transcoder that scales onboard will require a lower CPU contribution than one that scales using the host CPU, reducing overall power consumption. Along the same lines, a system without a hardware transcoder uses the CPU for all functions, maxing out CPU utilization and likely consuming about the same energy as a system loaded with transcoders that might collectively consume 200 watts.

Again, the only way to achieve a true apples-to-apples comparison is to configure the server as you would for production and measure power consumption directly. Fortunately, as you can see in Table 2, stream throughput is a major determinant of overall power consumption. Even if you assume that Systems A and B consume the same power, System B's throughput makes it much cheaper to operate over a five-year expected life, and much kinder to the environment.

TABLE 2. Computing the watts per stream of the two systems.
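
The watts-per-stream math behind Table 2 reduces to a few lines. In the sketch below, the wattages, stream counts, and electricity rate are placeholders rather than the table's actual values.

```python
# Illustrative watts-per-stream and five-year power cost calculation.
# Wattages, throughput, and the electricity rate are assumptions, not the
# actual Table 2 values.

hours = 24 * 365 * 5           # five years of 24/7 operation
usd_per_kwh = 0.15             # blended electricity rate
required_streams = 640

systems = {
    # name: (watts drawn under full transcoding load, streams per server)
    "System A": (700, 8),
    "System B": (700, 20),
}

for name, (watts, streams) in systems.items():
    watts_per_stream = watts / streams
    fleet_kw = watts_per_stream * required_streams / 1000
    power_cost = fleet_kw * hours * usd_per_kwh
    print(f"{name}: {watts_per_stream:.1f} W/stream, "
          f"5-year power cost ${power_cost:,.0f}")
```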

Economics of Transcoding: Storage Costs

Once you purchase the systems, you'll have to house them. While these costs are easiest to compute if you're paying for a third-party co-location service, you'll have to estimate them even for an in-house data center. Table 3 continues the five-year cost estimates for our two systems, and the denser System B proves much cheaper to house as well as to power.

TABLE 3: Computing the storage costs for the two systems.

Economics of Transcoding: Transcoding Options

Those are the cost fundamentals; now let's explore them in the context of different encoding architectures.

There are three general transcoding options: CPU-only, GPU, and ASIC-based. There are also FPGA-based solutions, though these will probably be supplanted by cheaper-to-manufacture ASIC-based devices over time. Briefly,

  • CPU-based transcoding, also called software-based transcoding, relies on the host central processing unit, or CPU, for all transcoding functions.
  • GPU-based transcoding relies on Graphics Processing Units (GPUs), which are developed primarily for graphics-related functions but can also transcode video. These are added to the server as add-in PCIe cards.
  • ASICs are Application-Specific Integrated Circuits designed specifically for transcoding. These are added to the server as add-in PCIe cards or devices that conform to the U.2 form factor.

Economics of Transcoding: Real-World Comparison

NETINT manufactures ASIC-based transcoders and video processing units. Recently, we published a case study where a customer, Mayflower, rigorously and exhaustively compared these three alternatives, and we’ll share the results here.

By way of background, Mayflower’s use case needed to input 10,000 incoming simultaneous streams and distribute over a million outgoing simultaneous streams worldwide at a latency of one to two seconds. Mayflower hosts a worldwide service available 24/7/365.

Mayflower started with 80-core bare metal servers and tested CPU-based transcoding, then GPU-based transcoding, and then two generations of ASIC-based transcoding. Table 4 shows the net/net of their analysis, with NETINT’s Quadra T2 delivering the lowest cost per stream and the greatest density, which contributed to the lowest co-location and power costs.

RESULTS: COST AND POWER

TABLE 4. A real-world comparison of the cost per stream and OPEX associated with different transcoding techniques.

As you can see, the T2 delivered an 85% reduction in CAPEX and roughly 90% reductions in OPEX as compared to CPU-based transcoding. CAPEX savings compared to the NVIDIA T4 GPU were about 57%, with OPEX savings of around 70%.

Table 5 shows the five-year cost of the Mayflower T2-based solution using the Cyprus electricity cost of $0.335 per kWh. As you can see, the total is $2,225,241, a number we'll return to in a moment.

TABLE 5: Five-year cost of the Mayflower transcoding facility.

Just to close a loop, Tables 1, 2, and 3 compare the cost and performance of a Quadra Video Server equipped with ten Quadra T1U VPUs (Video Processing Units) with CPU-based transcoding on the same server platform. You can read more details on that comparison here.

Table 6 shows the total cost of both solutions. In terms of overall outlay, meeting the transcoding requirements with the Quadra-based System B costs 73% less than the CPU-based system. If that sounds like a significant savings, keep reading. 

TABLE 6: Total cost of the CPU-based System A and Quadra T2-based System B.

Economics of Transcoding: Cloud Comparison

If you're transcoding in the cloud, all of your costs are OPEX. With AWS, you have two alternatives: producing your streams with Elemental MediaLive or renting EC2 instances and running your own transcoding farm. We considered the MediaLive approach here, and it appears economically unviable for 24/7/365 operation.

Using Mayflower's numbers, the CPU-only approach required 500 80-core Intel servers running 24/7. The closest CPU in the Amazon EC2 pricing calculator was the 64-core c6i.16xlarge, which, under the EC2 Instance Savings plan with a 3-year commitment and no upfront payment, costs $1,125.84/month.

FIGURE 1. The annual cost of the Mayflower system if using AWS.

We used Amazon's pricing calculator to roll these numbers out to 12 months and 500 simultaneous servers, and you see the annual result in Figure 1. Multiply this by five to get the five-year cost of $33,775,056, which is roughly 15 times the cost of the Quadra T2 solution shown in Table 5.

We ran the same calculation on the 13 systems required for the Quadra Video Server analysis shown in Tables 1-3, which was powered by a 32-core AMD CPU. Assuming a c6a.8xlarge instance with a 3-year commitment and no upfront payment, this produced an annual charge of $79,042.95, or about $395,215 for the five-year period, which is roughly 8 times more costly than the Quadra-based solution.
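
Rolling those instance rates up to five-year totals is simple multiplication; here is a minimal sketch using the figures quoted above (the small differences from the article's totals come from rounding inside the AWS pricing calculator).

```python
# Roll the quoted EC2 rates up to five-year totals. Rates are those quoted in
# the text; the AWS pricing calculator output differs slightly due to rounding.

months = 12 * 5  # five-year term

cpu_farm = 1_125.84 * 500 * months   # 500 x c6i.16xlarge at $1,125.84/month
quadra_equivalent = 79_042.95 * 5    # 13 x c6a.8xlarge, $79,042.95/year total

print(f"500-server CPU farm over 5 years:  ${cpu_farm:,.0f}")          # ~$33.8M
print(f"13-server equivalent over 5 years: ${quadra_equivalent:,.0f}") # ~$395K
```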

FIGURE 2: The annual cost of an AWS system per the example schema presented in tables 1-3.

Cloud services are an effective means for getting services up and running, but are vastly more expensive than building your own encoding infrastructure. Service providers looking to achieve or enhance profitability and competitiveness should strongly consider building their own transcoding systems. As we’ve shown, building a system based on ASICs will be the least expensive option.

In August, NETINT held a symposium on Building Your Own Live Streaming Cloud. The on-demand version is available for any video engineer seeking guidance on which encoder architecture to acquire, the available software options for transcoding, where to install and run your encoding servers, and progress made on minimizing power consumption and your carbon footprint.

ON-DEMAND: Building Your Own Live Streaming Cloud

Demystifying the live-streaming setup


Stef van der Ziel, our keynote speaker, has been in the streaming industry since 1994, and as founder of Jet-Stream, oversaw the development of Jet-Stream Cloud, a European-based streaming platform. He discussed the challenges associated with creating your own encoding infrastructure, how to choose the best transcoding technology, and the cost savings available when you build your own platform.

Stef started by recounting the evolution and significance of transcoding in the streaming industry. To help set the stage, he described the streaming process, starting with a feed from a source like a camera. This feed is encoded and then transcoded into various qualities. This is followed by origin creation, packaging, and, finally, delivery via a CDN.

Stef emphasized the distinction between encoding and transcoding, noting that the latter is mission-critical too. If errors occur during transcoding, the entire stream can fail, leading to poor quality or buffering issues for viewers.

He then related that quality and viewer experience are paramount for transcoding services, regardless of whether they are cloud-based or on-premises. However, cost management is equally crucial.

Beyond the direct costs of transcoding, incorrect settings can lead to increased bandwidth and storage costs. Stef noted the often-overlooked human operational costs associated with managing a streaming platform, especially in the realm of transcoding. Expertise is essential, necessitating either an in-house team or hiring external experts.

Stef observed that while traffic prices have decreased significantly over the years, transcoding costs have remained relatively high. However, he noted a current trend of decreasing transcoding costs, which he finds exciting.

Lastly, in line with the theme of sustainable streaming, Stef emphasized the importance of green practices at every step of the streaming process. He mentioned that Jet-Stream has practiced green streaming since 2004 and that the intense computational demands of transcoding and analytics make these functions especially challenging to green.


CHOOSING TRANSCODING OPTIONS

In discussing transcoding options, Stef related that CPU-based encoding can deliver very good quality, but that it's costly in terms of CPU and energy usage. He noted that the quality of GPU-based encoding was lower than that of CPU-based encoding, and that GPUs were less cost- and power-efficient than ASICs.

FIGURE 1. Stef found CPU and ASIC-based transcoding quality superior to GPU-based transcoding.

The real game-changer, according to Stef, is ASIC-based encoding. ASICs not only offer superior quality but also minimal latency, a crucial factor for specific low-latency use cases.

Compared to software transcoding, ASICs are also much more power efficient. For instance, while CPU-based transcoding could consume anywhere from 2,800 to 9,000 watts for transcoding 80 OTT channels to HD, ASIC-based hardware transcoding required only 308 watts for the same task. This translates to an energy saving of at least 89%.

Beyond energy efficiency, ASICs also shine in terms of scalability. Stef explained that the power constraints of CPU encoding might limit the capacity of a single rack to 200 full HD channels. In contrast, a rack populated with ASIC-based transcoders could handle up to 2,400 channels concurrently. This capability means increased density, optimized use of rack space, and overall heightened efficiency.

Not surprisingly, given these insights, Stef positioned ASIC-based transcoding as a clear frontrunner over CPU- and GPU-based encoding methods.

OTHER FEATURES TO CONSIDER

Once you’ve chosen your transcoding technology, and implemented basic transcoding functions, you need to consider additional features for your encoding facility. Drawing from his experience with Jet-Stream’s own products and services, Stef identified some to consider.

  • Containerize operations in Kubernetes so any crash, however infrequent, is self-contained and easily recoverable, often without viewers noticing.
  • Stack multiple machines to build a microcloud and implement automatic scaling and pooling.

  • Combine multiple technologies, like decoding, filtering, origin, and edge serving, into a single server. That way, a single server can provide a complete solution in many different scenarios.


BEYOND THE BASICS

Beyond these basics, Stef also explained the need to add a flexible and capable interface to your system and to add new features continually, as Jet-Stream does. For example, you may want to burn in a logo or add multi-language audio to your stream, particularly in Europe. You may want or need to support subtitles and offer speech-to-text transcription.

If you’re supporting multiple channels with varying complexity, you may need different encoding profiles tuned for each content type. Another option might be capped CRF encoding to minimize bandwidth costs, which is now standard on all NETINT VPUs and transcoders. On the distribution side, you may need your system to support multiple CDNs for optimized distribution in different geographic regions and auto-failover.

Finally, as your service grows, you’ll need interfaces for health and performance status. Some of the performance indicators that Jet-Stream systems track include bandwidth per stream, viewers per stream, total bandwidth, and many others.

The key point is that you should start with a complete list of necessary features for your system and estimate the development and implementation costs for each. Knowledge of sophisticated products and services like those offered by Jet-Stream will help you understand what’s essential. But you really need a clear-eyed view of the development cost and time before you undertake creating your own encoding infrastructure.

COST AND ENERGY SAVINGS

Fortunately, it's clear that building your own system can be a huge cost saver. According to Stef, on AWS, a typical full HD channel would cost roughly 2,400 euros per month. By creating its own encoding infrastructure, Jet-Stream reduced this to 750 euros per month.

FIGURE 2. Running your own system can deliver significant savings over AWS.

Obviously, the savings scale as you grow, so “if you do this times 12 months, times five years, times 80 channels, you’re saving almost 8 million euros.” If you run the same math on energy consumption, you’ll save 22,000 euros on energy costs alone.
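
Stef's savings math is easy to reproduce: the per-channel monthly delta multiplied by months, years, and channel count. The per-channel costs are the figures he quoted; the rest is straightforward arithmetic.

```python
# Reproduce the savings arithmetic from Stef's example.
aws_cost_per_channel_month = 2_400   # euros per channel per month on AWS
own_cost_per_channel_month = 750     # euros per channel per month on own infrastructure
channels = 80
months = 12 * 5                      # five years

savings = (aws_cost_per_channel_month - own_cost_per_channel_month) * channels * months
print(f"Five-year savings across {channels} channels: ~EUR {savings:,}")  # ~EUR 7,920,000
```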

By running the transcoding setup on-premises, the cost savings can even be doubled. On-premises is a popular choice to bring more control over core streaming processes back in house.

Overall, Stef’s keynote effectively communicated that while creating your own encoding infrastructure will involve significant planning and development time and cost, the financial reward can be very substantial.


ON-DEMAND: Stef van der Ziel - Demystifying the live-streaming setup

Choosing a Co-Location Facility

Edgio is the result of the merger between Limelight Networks and EdgeCast in 2022, which produced a company with over 20 years of experience choosing and installing their own equipment into co-location facilities.

With customers like Disney, ESPN, Amazon, and Verizon, Edgio has had to manage both explosive growth and exceptionally high expectations.

So, there’s no better source to help you learn to choose a co-location provider than Kyle Faber, Head of CDN Product Delivery Management at Edgio. He’s got experience, and as you’ll see below, the pictures to prove it.

Kyle starts with a description of the math involved in deciding whether co-location is the right direction for your organization and then works through must-have and nice-to-have co-location features. He covers the value of certifications and the importance of redundancy and temperature management, explores connectivity, support, and cost considerations, and finishes with a look at sustainability. It's a deep and comprehensive look at choosing a co-location provider, and anyone facing this decision will find the information invaluable.


NAVIGATE THE COMPLEXITIES OF PRIVATE COLOCATION DECISIONS

Kyle started by addressing the considerations video engineers should prioritize when contemplating the shift to private co-location. In the context of modern public cloud computing platforms, he asserted that the decision to opt for private colocation requires a higher level of scrutiny due to the advanced capabilities of cloud offerings. While some enterprises rely solely on public cloud solutions for their production stack, there are compelling reasons to explore private colocation options.


He outlined his talk as follows:

  • First, he detailed a methodology for considering your financial break-even.
  • Then, he identified the “must have” features that a co-location provider must offer.
  • Then he related the nice-to-have, but not essential features that are potentially negotiable based on your organization’s goals.
  • He concluded with insight into how to balance the cloud vs. co-location decision, sharing that “it’s not a zero-sum game.”

As you’ll see, throughout the talk, Kyle provided practical insights to help video engineers navigate the complexities of private colocation decisions. He emphasized understanding the factors influencing these choices and making informed decisions based on an organization’s unique circumstances.

UNDERSTANDING THE MATH AND BREAKEVEN PRINCIPLES


Kyle started the economic discussion with the concept of the economics of minimum load and its relevance to private co-location decisions for video engineers. Using an everyday analogy, Kyle drew parallels between choosing to buy a car for daily use versus opting for ride-sharing services. He noted that the expenses associated with car ownership accumulate rapidly, but they eventually stabilize.

The convenience of controlling usage and trip frequency often leads to a reduced cost per ride compared to ride-sharing services over time. This analogy illustrated the dynamics of yearly co-location contracts, where minimum load drives efficiencies and potential gains.

Kyle then shifted to a scenario involving short-term heavy needs, like vacation car rentals. He noted that car rentals offer flexibility for unpredictable schedules without the commitment of ownership. This aligns with the flexibility provided by bare metal service providers, who offer diverse options within predefined parameters. This approach maintains efficiency while operating within certain boundaries.

Concluding his analogy, Kyle compared on-demand and public cloud offerings to ride-sharing services. He emphasized their ease of access, requiring just a few clicks to summon a driver or server, without concerns regarding operational aspects like insurance, maintenance, and updates.

By illustrating these relatable scenarios, Kyle underscored the importance of understanding the economics of minimum load in the context of private co-location decisions, specifically catering to the considerations of video engineers.

NAVIGATE THE ECONOMICS OF MINIMUM LOAD


Kyle next elaborated on the strategic approach required to navigate the economics of minimum load in the context of private co-location decisions. He emphasized the significance of aligning different models with specific data center demands.

Drawing from personal experiences, Kyle illustrated the concept using relatable scenarios. He contrasted his friend’s experience of living near a rail line in Seattle, which made car ownership unnecessary, with his own situation in Scottsdale, Arizona, where car ownership was essential due to logistical challenges.

Translating this to the business realm, Kyle pointed out that various companies have unique server requirements. Some prioritize flexible load management over specialized hardware needs and prefer to maintain a lean staff without extensive server administration roles. For Edgio, a content delivery network, private co-location globally was the optimal choice to meet their specific requirements.

Kyle then began a cost analysis, acknowledging that while the upfront cost of private co-location might seem daunting compared to public cloud prices, the cumulative server hour costs can accumulate rapidly. He referenced AWS’s substantial revenue from convenience as an example. He highlighted the necessity of considering hidden costs, including human capital requirements and logistical factors.

Addressing executive leaders, Kyle cautioned against assuming that software developers skilled with code are also adept at running data centers. He emphasized the importance of having dedicated data center and server administration experts to maximize cost savings and avoid potential disasters.

Looking toward the future, Kyle advised mid-sized companies to consider their future needs and focus on maintaining nimbleness. He shared his insights into the challenges of hardware logistics and the value of proper tracking and clarity to identify breakeven points. In this comprehensive overview, Kyle provided practical insights into the economics of minimum load, offering a pragmatic perspective on private co-location decisions for video engineers.

MUST-HAVE CO-LOCATION FEATURES


With the economics covered, Kyle shifted to identifying the must-have features in any co-location service, suggesting that certifications play a crucial role in evaluating co-location providers. ISO 9000 and SOC 2 Type 1 and Type 2 were cited as common minimum standards, with additional regional and industry-specific variations. Kyle recommended requesting certifications from potential vendors and conducting thorough research to understand the significance of these certifications.

Kyle explained that by obtaining certifications, you can move beyond basic questions about construction methods, power backup systems, and operational standards. Instead, you can focus on more nuanced inquiries, like power sources, security standards for visitors, and the training and responsiveness of remote hands teams. This transition allows for a more informed assessment of vendors’ capabilities and suitability for specific needs.

THE SIGNIFICANCE OF ON-SITE VISITS


Kyle underscored the significance of on-site visits in the colocation decision-making process, sharing three images that highlighted the insights gained from physical visits to data center facilities. The first image depicted service cabling that entered a data center. While the front of the building seemed pristine, the back revealed potential issues lurking in the shadows. Kyle stressed that some problems can only be identified through close inspection.

The second image showed a fiber distribution panel, showcasing the low level of professionalism evident in the data center’s installations. This reinforced the idea that visual assessments can reveal the quality of a facility’s infrastructure.

The third image illustrated a unique scenario. During construction, a new fiber channel was being laid, but the basement entry of the fiber trench was left unsealed. An overnight rainstorm resulted in the trench filling with water. Because the basement access hole was uncapped, water flowed downhill into a room with valuable equipment. This real-life example served as a reminder of the importance of thorough inspection and due diligence in the colocation industry.

These visuals underscore the importance of physically visiting data centers to identify potential challenges and make informed decisions.

REDUNDANCY AND TEMPERATURE MANAGEMENT


Kyle also shared that temperature management is particularly important in data centers. For example, Edgio emphasizes cooling speed, temperature regulation, and high-density heat rejection technology. It's not merely about achieving lower temperatures; it's about effectively managing and dissipating heat.

Kyle explained that even a slight temperature fluctuation can trigger far-reaching consequences, so maintaining a precise temperature of 76 degrees Fahrenheit is paramount. The utilization of advanced heat rejection technology ensures that any deviations from this optimal point can be promptly corrected, guaranteeing peak performance for their installations.


Paradoxically, economic success complicates temperature maintenance. Kyle reported that over the past eight years, Edgio achieved a 30% improvement in server power efficiency, coupled with a 760% surge in server density. Since the laws of physics remain steadfast, this density surge brings elevated heat generation within a smaller space.

CONNECTIVITY, SUPPORT, AND COST CONSIDERATIONS


Kyle's discussion then shifted to connectivity, support, and cost considerations, with a focus on where to place each factor in your decision-making scorecard.

Emphasizing the critical role of connectivity in businesses, Kyle noted that vendors often claim constant uptime and availability, and usually deliver this, so they differentiate themselves through their access to the wider internet. When choosing a co-location provider, all organizations should reflect on their unique requirements. For instance, he suggests that businesses intending to connect with a CDN like Edgio might require a local data center partner that facilitates data transformation and transcoding but might not need the extensive infrastructure for global data distribution.

Kyle then addressed the significance of remote support, especially during initial installations where a swift response to issues is crucial. While tools like iDRAC and remote Out-of-Band server access provide control, Kyle highlighted the importance of real-time assistance during other critical moments, such as identifying server issues.

Addressing costs, Kyle acknowledged their pivotal role in decision-making, a sentiment particularly relevant given the current technology landscape. He urged a balance between cost-effectiveness and quality, drawing parallels between daily personal choices and those made in professional spheres. He referenced Terry Pratchett's boot theory of economics, emphasizing the inevitability of change and the need for proactive lifecycle management. “Even the best boots will not last forever,” Kyle paraphrased, “and you need to plan lifecycle management.”

A FEW WORDS ABOUT SUSTAINABILITY


Kyle urged all participants and readers to consider sustainability, transcending its status as a mere buzzword. “Sustainability is more than a buzzword,” he declared, “It is a commitment.”

He illuminated the staggering energy appetite of data centers, exemplified by Amazon's permits for generators in Virginia capable of producing a remarkable 4.6 gigawatts of backup power – enough to illuminate New York City for a day. Kyle underscored the industry's responsibility to reevaluate energy sources, citing the rising importance of Environmental, Social, and Governance (ESG) movements. He noted that organizations are now compelled to report their environmental impact to stakeholders and investors, underscoring the need for transparency.

When considering colocation facilities, Kyle recommended evaluating their sustainability reports, which reveal critical information from energy-sourcing practices to governance approaches. By aligning operational needs with global responsibilities, businesses can make conscientious choices that resonate with their core values and forge meaningful partnerships with data center providers.

GET INTIMATELY ACQUAINTED WITH THE UNPREDICTABLE


While you should perform a comprehensive needs analysis and service comparison to choose your provider, Kyle also highlighted that data centers are intimately acquainted with the unpredictable. Construction activities, often beyond the data center provider’s control, persistently surround these facilities.

The photo above, taken a mile away from a facility, exemplifies the unforeseen challenges. A construction crew, possibly misinformed or negligent, drove an auger into the ground at an incorrect location, inadvertently ensnaring cabling, and yanking dozens of meters of fiber from the earth.

The incident’s specifics remain unclear, yet the lesson is evident – despite meticulous planning, unpredictability is an integral facet of this landscape. As Kyle summarized, “It’s a stark reminder that despite our best plans, unpredictability has to be part of this landscape, so always be prepared for the unexpected.”

NO ONE-SIZE-FITS-ALL SOLUTION


In closing, Kyle addressed the intricate decisions surrounding ownership, rental, and on-demand data center services, emphasizing that there's no one-size-fits-all solution. He presented the choice between owning servers, renting them, or opting for on-demand cloud services as a complex tapestry woven from factors such as an organization's unique average minimum load and strategic objectives.

Kyle cautioned that navigating this intricate landscape demands a nuanced perspective. The decision requires a well-thought-out plan that not only accommodates an organization’s goals and growth but also anticipates the evolving trends of the industry. This approach ensures that the chosen path resonates seamlessly with an organization’s aspirations, offering stability for the journey ahead.

GO FROM A PURE OPEX MODEL TO A CAPEX MODEL


Before wrapping up, Kyle answered one question from the audience: “How does someone begin to approach a transition? Is it even possible to go from a pure OPEX model to a CAPEX model? Any suggestions, ideas, insights?”

Kyle noted that when you assess an OPEX model, you’re essentially looking at linear costs. These costs offer a clear breakdown of your system expenses, which can be projected into the future.

While there might be some pricing fluctuations as public cloud providers compete, you can treat entire segments as a transition unit. It might not be feasible to buy just one server and place it in isolation, but you can transition comprehensive sections in one concerted effort.

So, you might build a small encoding farm, allowing for a gradual shift while maintaining flexibility across various cloud instances like AWS, Azure, or GCP. This phased approach grants greater control, cost benefits, and a smoother transition into the new paradigm.

ON-DEMAND: Kyle Faber - Choosing a Co-Location Facility

From Cloud to Local Transcoding For Minimum Latency and Maximum Quality


Over the last ten years or so, most live productions have migrated towards a workflow that sends a contribution stream from the venue into the cloud for transcoding and delivery. For live events that need absolute minimum latency and maximum quality, it may be time to rethink that workflow, particularly if you’ve got multiple sharable inputs at the venue.

So says Bart Snoeks, Account & Partnership Director of THEO Technologies (“THEO”). By way of background, THEO invented and has commercially implemented the High-Efficiency Streaming Protocol (HESP), an adaptive HTTP-based video streaming protocol that enables sub-second end-to-end latency. You can see how HESP compares to other low-latency protocols in the table shown in Figure 1, from the HESP Alliance website – the organization focused on promoting and further advancing HESP.

Figure 1. HESP compared to other low latency protocols.

THEO has productized HESP as a real-time streaming service called THEOlive, which targets applications like live sports and betting, casino igaming, live auctions, and other events that require high-quality video at exceptionally low latency with delivery at scale. For example, in the case of in-play betting, cutting latency from 8 to 10 seconds (HLS) to under one second expands the betting window during the critical period just before the event.

When streaming casino games, ultra-low latency promotes fluent interactions between the players and ensures that all players see the turn of the cards in real time. When latency is lower, players can bet more quickly, increasing the number of hands that can be played.

According to Snoeks, a live streaming workflow that sends a contribution stream to the cloud for transcoding will always increase latency and can degrade quality because re-transcoding is needed. It's especially poorly suited for stadium venues with multiple camera locations that want to enhance the attendee experience with multiple live feeds. In those latency-critical use cases, you are actually adding network latency with a round trip to and from the cloud. Instead, it makes much more sense to create your encoding ladder and package on-site, pulling the streams directly from the origin to a private CDN for delivery.

Let’s take a step back and examine these two workflows.

Live Streaming Workflows

As stated at the top, most live-streaming productions encode a single contribution stream on-site and send that into the cloud for transcoding to a full ladder, packaging, and delivery. You see this workflow in Figure 2.

Figure 2. Encoding a contribution stream on-site to deliver to the cloud for transcoding, packaging, and delivery

This schema has multiple advantages. First, you’re sending a single stream to the cloud, lowering bandwidth requirements. Second, you’re centralizing your transcoding assets in a single location in the cloud, which typically enables better utilization.

According to Snoeks, however, this workflow will add 200 to 500 milliseconds of latency at a minimum, depending on the encoding speed, quality, and contribution protocol. In addition, though high-quality contribution encoders can minimize generational loss from the contribution stream, lower-quality transcoders can noticeably degrade the quality of the final output. You also need a contribution encoder for each camera, which can jack up hardware costs in high-volume igaming and similar applications.

Instead, for some specific use cases, you should consider the workflow shown in Figure 3. Here, you transcode on-site and send the full encoding ladder to a public CDN for external delivery and to a private CDN or equivalent for local viewing. This decreases latency to a minimum and produces absolute top quality as you avoid the additional transcoding step.

Figure 3. Encoding and packaging the encoding ladder on site and transmitting the streams to a public CDN for external viewers and a private CDN for local viewers.

This schema is particularly useful for venues that want to enhance the in-stadium experience with multiple camera feeds. Imagine a stock car race where an attendee only sees his driver on the track once every minute or so. Encoding on-site might allow attendees to watch the camera view from inside their favorite driver’s car with near real-time latency. It might let golf fans follow multiple groups while parked at a hole or following their favorite player.

If you're encoding input from many cameras, say in a casino or even a racetrack environment, the cost of on-site encoding might be less than the cost of the individual contribution encoders. So, you get the best of all worlds: lower cost per stream, lower latency, higher quality, and a better in-person experience where applicable.

If you’re interested in learning about your transcoding options, check out our symposium Building Your Own Live Streaming Cloud, where you can hear from multiple technology experts discussing transcoding options like CPU-only, GPU, and ASIC-based transcoding and their respective costs, throughput, and density.

If you’re interested in learning more about HESP, THEO in general, or THEOlive, watch for an upcoming episode of Voices of Video, where I interview Pieter-Jan Speelman, CTO of THEO Technologies. We’ll discuss HESP’s history and evolution, the power of THEOlive real-time streaming technology, and how to use it in your live production stack. Make sure you don’t miss it!

Now ON-DEMAND: Symposium on Building Your Live Streaming Cloud

From Cloud to Control. Building Your Own Live Streaming Platform

Cloud services are an effective way to begin live streaming. Still, once you reach a particular scale, it’s common to realize that you’re paying too much and can save significant OPEX by deploying transcoding infrastructure yourself. The question is, how to get started?

NETINT’s Build Your Own Live Streaming Platform symposium gathers insights from the brightest engineers and game-changers in the live-video processing industry on how to build and deploy a live-streaming platform.

In just three hours, we’ll cover the following:

  • Hardware options for live transcoding and encoding to cut costs by as much as 80%.
  • Software options for producing, delivering, and playing your live video streams.
  • Co-location selection criteria to achieve cloud-like performance with on-premise affordability.

You’ll also hear from two engineers who will demystify the process of assembling a live-streaming facility, how they identified and solved key hurdles, along with real costs and performance data.

Cloud? Or your own hardware?

It’s clear to many that producing live streams via a public cloud like AWS can be vastly more expensive than owning your hardware. (You can learn more by reading “Cloud or On-Premises? The Streaming Dilemma” and “How to Slash CAPEX, OPEX, and Carbon Emissions Using the NETINT T408 Video Transcoder”). 

To quote serial entrepreneur David Heinemeier Hansson, who recently migrated two SaaS services from the cloud to on-premises, “Don’t let the entrenched cloud interests dazzle you into believing that running your own setup is too complicated. Everyone and their dog did it to get the internet off the ground, and it’s only gotten easier since.”

For those who have only operated in the cloud, there's fear of the unknown: fear of buying hardware transcoders, selecting the right software, and choosing the best colocation service. So, we decided to fight fear with education and host a symposium to educate streaming engineers on all these topics.

“Building Your Own Live Streaming Cloud” will uncover how owning your encoding stack can slash operating costs and boost performance with minimal CAPEX.

Learn to select the optimal transcoding hardware, transcoding and packaging software, and colocation facilities. We’ll also discuss strategies to reduce carbon emissions from your transcoding engine. 

This FREE virtual event takes place on August 17th, from 11:00 AM – 2:15 PM EST.

Five issues tackled by nine experts:

Transcoding Hardware Options:

Learn the pros and cons of CPU, GPU, and ASIC-based transcoding via detailed throughput and cost examples shared by Kenneth Robinson, Manager of Field Application Engineers at NETINT Technologies. Then Ilya Mikhaelis, Streaming Backend Tech Lead at Mayflower, will describe his company’s journey from CPU to GPU to ASICs, covering costs, power consumption, latency, and density metrics.

Software Options:

Jan Ozer from NETINT will identify the three categories of transcoding software: multimedia frameworks, media servers, and other tools. Then, you’ll hear from experts in each category, starting with Romain Bouqueau, founder of Motion Spell, who will discuss the capabilities of the GPAC multimedia framework. Barry Owen, Chief Solutions Architect at Wowza, will discuss Wowza Streaming Engine’s suitability for private clouds. Lastly, Adrian Roe, Director at Id3as, developer of Norsk, will demonstrate Norsk’s simple, scripting-based operation, and extensive production and transcoding features.

Housing Options:

Once you select your hardware and software, the next step is finding the right co-location facility to house your live streaming infrastructure. Kyle Faber, with experience in building Edgio’s video streaming infrastructure, will guide you through the essential factors to consider when choosing a co-location facility.

Minimizing the Environmental Impact:

As responsible streaming professionals, it’s essential to address the environmental impact of our operations. Barbara Lange, Secretariat of Greening of Streaming, will outline actionable steps video engineers can take to minimize power consumption when acquiring and deploying transcoding servers.

Pulling it All Together:

Stef van der Ziel, founder of live-streaming pioneer Jet-Stream, will share lessons learned from his experience in creating both Jet-Stream’s private cloud and cloud transcoding solutions for customers. In his closing talk, Stef will demystify the process of choosing hardware, software, and a hosting facility, bringing all the previous discussions together into a cohesive plan.

Full Agenda:

11:00 am. – 11:10 am EST

Introduction (10 minutes):
Mark Donnigan, Head of Strategic Marketing at NETINT Technologies
Welcome, overview, and what you will learn.

 

11:10 am. – 11:40 am EST

Choosing transcoding hardware (30 minutes):
Kenneth Robinson, Manager of Field Application Engineers at NETINT Technologies
You have three basic approaches to transcoding, CPU-only, GPU, and ASICs. Kenneth outlines the pros and cons of each approach with extensive throughput and CAPEX and OPEX examples for each.

 

11:40 am. – 12:00 pm EST

From CPU to GPU to ASIC: Our Transcoding Journey (20 minutes):
Ilya Mikhaelis, Streaming Backend Tech Lead at Mayflower
Charged with supporting very high-volume live transcoding operations, Ilya started with libx264 software transcoding, which consumed massive power but yielded low stream density per server. Then he experimented with GPUs and other hardware and ultimately transitioned to an ASIC-based solution with much lower power consumption and much higher stream density per server. Ilya will detail the costs, power consumption, and density of all options, providing both data and an invaluable evaluation framework.

 

12:00 pm. – 12:10 pm EST

Choosing your live production software (10 minutes): 
Jan Ozer, Senior Director of Video Technology at NETINT Technologies
The core of every live streaming system is transcoding and packaging software. This comes in many shapes and sizes, from open-source software like FFmpeg and GPAC, to streaming servers like Wowza, and production systems like Norsk. Jan discusses these multiple options so you can cohesively and affordably build your own live-streaming ecosystem.

 

12:10 pm. – 1:10 pm EST

Speed Round (60 minutes):
20-minute presentations from GPAC, Wowza, and NORSK.
Speakers from GPAC, Wowza, and NORSK discussing the features, functions, operational paradigms, and cost structure of their live software offering.

Speakers include:

  • Adrian Roe, CEO at id3as, Product: Norsk, Title of Talk: Make Live Easy with NORSK SDK
  • Romain Bouqueau, Founder and CEO, Motion Spell (home for GPAC Licensing), Product: GPAC, Title of Talk: Deploying GPAC for Transcoding and Packaging
  • Barry Owen, Chief Solutions Architect at Wowza, Title of Talk: Start Streaming in Minutes with Wowza Streaming Engine



1:10 pm. – 1:40 pm EST

Choosing a co-location facility (30 minutes): 
Kyle Faber, Senior Director of Product Management at Edgio.
Once you’ve chosen your hardware and software, you need a place to install them. If you don’t have your own connected data center, you may consider a colocation facility. In his talk, Kyle addresses the key factors to consider when choosing a co-location facility for your live streaming infrastructure.

 

1:40 pm. – 1:55 pm EST

How to Greenify Your Encoding Stack (15 minutes):
Barbara Lange, Secretariat of Greening of Streaming.
Learn how video streaming companies can work to significantly reduce their energy footprint and contribute to a greener streaming industry. Implement hardware and infrastructure optimization using immersion cooling and data center design improvements to maximize energy efficiency in your streaming infrastructure.

 

1:55 pm. – 2:15 pm EST

Closing Keynote (20 minutes):
Stef van der Ziel, Founder Jet-Stream
Jet-stream has delivered streaming solutions since its launch in 1994 and offers its own live streaming platform. One focus has been creating custom transcoding solutions for customers seeking to create their own private cloud for various applications. In his closing talk, Stef will demystify the process of choosing hardware, software, and a hosting facility and wrap a pretty bow around all previous presentations.

Co-location for Optimized, Sustainable Live Streaming Success

Choosing a co-location facility

If you decide to buy and run your own transcoding servers rather than using a public cloud, you must choose where to host the servers. If you have a well-connected data center, that's an option. But if you don't, you'll want to consider a co-location facility, or co-lo.

A co-location facility is a data center that rents space to third parties for servers and other computing hardware. This rented space typically includes the physical area for the hardware (often measured in rack units or cabinets) and the necessary power, cooling, and security.

While prices vary greatly, in the US you can expect to pay between $50 and $200 per month per RU, with prices ranging from $60 to $250 per month per RU in Europe, $80 to $300 per month per RU in South America, and $70 to $280 per month per RU in Asia.
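
To put those per-RU rates in context, here is a quick sketch of the annual co-location bill for a given footprint; the rack-unit count and rate are placeholder assumptions chosen from within the ranges above.

```python
# Rough annual co-location cost for a given footprint. The rate and RU count
# are placeholder assumptions, not quotes from any specific facility.
rack_units = 13           # e.g., thirteen 1RU transcoding servers
usd_per_ru_month = 125    # assumed mid-range US rate

annual_colo_cost = rack_units * usd_per_ru_month * 12
print(f"{rack_units} RU at ${usd_per_ru_month}/RU/month: ${annual_colo_cost:,}/year")
```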

Co-location facilities will provide a high-bandwidth internet connection, redundant power supplies, and sophisticated cooling systems to ensure optimal performance and uptime for hosted equipment. They also include robust physical security measures, including surveillance cameras, biometric access controls, and round-the-clock security personnel.

At a high level, businesses use co-location facilities to leverage economies of scale they couldn’t achieve on their own. By sharing the infrastructure costs with other tenants, companies can access high-level data center capabilities without a significant upfront investment in building and maintaining their facility.

Choosing a Co-lo for Live Streaming

Choosing a co-lo facility for any use involves many factors. However, live streaming demands require a focus on a few specific capabilities. We discuss these below to help you make an informed decision and maximize the efficiency and cost-effectiveness of your live-streaming operations.

Network Infrastructure and Connectivity

Live streaming requires high-performance and reliable network connections. If you’re using a particular content delivery network, ensure the link to the CDN is high performing. Beyond this, consider a co-lo with multiple (and redundant) high-speed connections to multiple top-tier telecom and cloud providers, which can ensure your live stream remains stable, even if one of the connections has issues.

Multiple content distribution providers can also reduce costs by enabling competitive pricing. If you need to connect to a particular cloud provider, perhaps for content management, analytics, or other services, make sure these connections are also available.

Geographic Location and Service

Choosing the best location or locations is a delicate balance. From a pure quality of experience perspective, facilities closer to your target audience can reduce latency and ensure a smoother streaming experience. However, during your launch, cost considerations may dictate a single centralized location that you can supplement over time with edge servers near heavy concentrations of viewers.

During the start-up phase and any expansion, you may need access to the co-lo facility to update or otherwise service existing servers and install new ones. That’s simpler to perform when the facility is closer to your IT personnel.

If circumstances dictate choosing a facility far from your IT staff, consider choosing a provider with the necessary managed services. While the services offered will vary considerably among the different providers, most locations provide hardware deployment and management services, which should cover you for expansion and maintenance.

Similarly, live streaming operations usually run round-the-clock, so you need a facility that offers 24/7 technical support. A highly responsive, skilled, and knowledgeable support team can be crucial in resolving any unexpected issues quickly and efficiently.

Scalability

Your current needs may be modest, but your infrastructure needs to scale as your audience grows. The chosen co-lo facility (or facilities) should have ample space and resources to accommodate future growth and expansion. Check whether they have flexible plans allowing upgrades and scalability as needed.

Redundancy and Disaster Recovery

In live streaming, downtime is unacceptable. In volatile coastal or mountain regions, check for guarantees that the data center can withstand specific types of disasters, like floods and hurricanes.

When disaster strikes, the co-location facility should have redundant power supplies, backup generators, and efficient cooling systems to prevent potential hardware failures. Check for procedures to protect equipment, backup data, and other steps to minimize the risk and duration of loss of service. For example, some facilities offer disaster recovery services to help customers restore disrupted environments. Walk through the various scenarios that could impact your service and ensure that the providers you consider have plans to minimize disruption and get you up and running as quickly as possible.

Security and Compliance

Physical and digital security should be a primary concern, particularly if you’re streaming third-party premium content that must remain protected. Ensure the facility uses modern security measures like CCTV, biometric access, fire suppression systems, and 24/7 on-site staff. Digital security should include robust firewalls, DDoS mitigation services, and other necessary precautions.

Environmental Sustainability

Environmental sustainability is an essential requirement for most companies today, and we believe all companies should work to reduce their carbon footprints. ASIC-based transcoding is already the most power-efficient of all transcoding alternatives; choosing a co-location facility committed to energy efficiency and renewable energy sources will further lower your energy costs and align with your company’s environmental goals.

Remember, the co-location facility is an extension of your live-streaming business. With the proper infrastructure, you can ensure high-quality, reliable live streams that satisfy your audience and grow your business. Take the time to visit potential facilities, ask questions, and thoroughly evaluate before deciding.

Cloud services are an effective way to begin live streaming. Still, once you reach a particular scale, it’s common to realize that you’re paying too much and could save significant OPEX by deploying transcoding infrastructure yourself. The question is how to get started.

NETINT’s Build Your Own Live Streaming Platform symposium gathers insights from the brightest engineers and game-changers in the live-video processing industry on how to build and deploy a live-streaming platform.

In just three hours, we’ll cover the following:

  • Hardware options for live transcoding and encoding to cut costs by as much as 80%.
  • Software options for producing, delivering, and playing your live video streams.
  • Co-location selection criteria to achieve cloud-like performance with on-premise affordability.

You’ll also hear from two engineers who will demystify the process of assembling a live-streaming facility, explain how they identified and solved key hurdles, and share real cost and performance data.

Denser / Leaner / Greener - Symposium on Building Your Live Streaming Cloud

Build Your Own Streaming Infrastructure – Software

Build Your Own Streaming Infrastructure - Article by Jan Ozer from NETINT Technologies

My assumption is that you’re currently using a cloud-based service like AWS for your live streaming and are seeking to reduce costs by buying your own transcoding hardware, installing the necessary software, and hosting the server on-premises or in a co-location facility. This article covers the software side.

To begin, let’s acknowledge that AWS and other cloud services have created a well-featured and highly integrated ecosystem for live streaming and distribution. The downside is the cost.

To illustrate the potential savings, I’ll refer to this article, which compared the cost of producing 21 H.264 ladders and 27 HEVC ladders via AWS MediaLive and by encoding with NETINT’s recently launched Logan Video Server. As you can see in the table, MediaLive costs around $400K for H.264 and $1.8 million for HEVC, as compared to $11,140 in both cases for the co-located server.

Streaming Infrastructure - Table from article 'cloud or on-prem'
Table 1. Five-year cost comparison: AWS MediaLive pricing compared to the NETINT server.
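
For a rough sense of scale, those totals work out to roughly 97% savings for H.264 and 99% for HEVC over five years. Here’s a quick back-of-the-envelope check; the inputs are the rounded totals quoted above, so treat the output as approximate:

  # Approximate five-year savings, using the rounded totals quoted above.
  medialive_h264=400000     # ~$400K for the H.264 ladders via MediaLive
  medialive_hevc=1800000    # ~$1.8M for the HEVC ladders via MediaLive
  colo_server=11140         # co-located server, either codec

  awk -v cloud="$medialive_h264" -v colo="$colo_server" \
    'BEGIN { printf "H.264 savings: ~%.1f%%\n", (1 - colo/cloud) * 100 }'
  awk -v cloud="$medialive_hevc" -v colo="$colo_server" \
    'BEGIN { printf "HEVC savings:  ~%.1f%%\n", (1 - colo/cloud) * 100 }'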

While there are less expensive options available inside and outside of AWS, whenever you pay for hardware by the minute or hour of production, you’re vastly overpaying compared to owning that hardware yourself. Sure, you say, but the cloud is so easy compared to running your own hardware.

If that’s a concern, here are some comforting words from David Heinemeier Hansson, co-owner and CTO of software developer 37signals, the developer of the project management platform Basecamp and the email service Hey. Recently, Hansson wrote Why we’re leaving the cloud, a blog post detailing his company’s decision to do just that. Here’s the relevant quote.

Up until very recently, everyone ran their own servers, and much of the progress in tooling that enabled the cloud is available for your own machines as well. Don’t let the entrenched cloud interests dazzle you into believing that running your own setup is too complicated. Everyone and their dog did it to get the internet off the ground, and it’s only gotten easier since.

My wife has chihuahuas, and given their difficulties with potty training, I seriously doubt they could do it, but you get the point. To paraphrase FDR, all you have to fear is fear itself. The bottom line is that running your own live streaming service should cost relatively little CAPEX, will save significant OPEX, and won’t be nearly as challenging as you might fear.

Let’s look at your options for the software required to run your homegrown system.

Transcoding and Packaging Software

Figure 1 shows the minimum software and infrastructure needed for a live-streaming service. Presumably, you’ve already got the live production covered, and since AWS doesn’t offer a player, you have that piece addressed as well. You’ll need a content delivery network to deliver your streaming video, but you can continue to use CloudFront or another CDN. The software that you absolutely have to replace is the live transcoding and packaging component.

Here you have three options: multimedia frameworks, media servers, and “other.” Let’s discuss each in turn.

Multimedia Frameworks

Multimedia frameworks are software libraries, tools, and APIs that provide a set of functionalities and capabilities for multimedia processing, manipulation, and streaming. The best-known framework is FFmpeg, followed by GStreamer and GPAC; all are available as open source.

Build Your Own Streaming Infrastructure - Software- diagram-2
Figure 1. Netflix uses GPAC for its packaging, a significant technology endorsement for GPAC and for multimedia frameworks in general.

Multimedia frameworks excel in projects at both ends of the complexity spectrum. For simple projects, like transcoding an input stream to an encoding ladder, you can create a script that inputs the stream, transcodes, and hands the packaged output streams off to a CDN in a matter of minutes. You can use the script to process thousands of simultaneous jobs, all at no charge.
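
For example, here’s a minimal sketch of such a script using FFmpeg. It assumes an RTMP input and produces a three-rung H.264 ladder packaged as HLS; the input URL, resolutions, bitrates, and output path are all illustrative placeholders you’d replace with your own values:

  #!/bin/bash
  # Minimal three-rung H.264 ladder from a live RTMP input, packaged as HLS.
  # Input URL, bitrates, and output paths are illustrative placeholders.
  ffmpeg -i rtmp://localhost/live/stream \
    -filter_complex "[0:v]split=3[v1][v2][v3];[v2]scale=-2:720[v720];[v3]scale=-2:360[v360]" \
    -map "[v1]"   -c:v:0 libx264 -b:v:0 5000k -map 0:a -c:a:0 aac -b:a:0 128k \
    -map "[v720]" -c:v:1 libx264 -b:v:1 3000k -map 0:a -c:a:1 aac -b:a:1 128k \
    -map "[v360]" -c:v:2 libx264 -b:v:2 1000k -map 0:a -c:a:2 aac -b:a:2 96k \
    -f hls -hls_time 6 -hls_list_size 10 \
    -master_pl_name master.m3u8 \
    -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2" \
    /var/www/hls/stream_%v.m3u8

From there, you point your CDN at the directory (or origin server) holding these playlists, and in production you’d likely swap libx264 for a hardware encoder; the structure of the script stays the same.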

At the other end of the spectrum, these frameworks also excel at complex jobs with idiosyncratic custom requirements that likely aren’t available in a server or commercial software product. The development, maintenance, and modification costs are considerable, but you get maximum feature flexibility if you’re willing to pay that cost.

What you don’t get with these tools is a user interface or simple configuration options – you start with a blank slate and must program in all desired features. What could be as simple as checking a checkbox in a streaming media server could require dozens or even thousands of lines of code in a multimedia framework.

Which takes us to streaming media servers.

Streaming Media Servers

The next category of products is streaming media servers, which includes Wowza Streaming Engine, Nimble Streamer, and two open-source servers, Red5 and Ant Media Server. These servers tend to excel for most productions in the middle of the complexity spectrum and offer multiple advantages over multimedia frameworks.

There are several reasons why you might choose a streaming server over a multimedia framework, starting with simplified setup and configuration. Most streaming servers provide out-of-the-box streaming solutions with pre-configured settings and management interfaces that simplify the setup and configuration process. While not all of them offer GUIs, those that don’t provide simple option selection via configuration files.

Build Your Own Streaming Infrastructure - Software- diagram-3
Figure 2. Wowza Streaming Engine is a highly regarded streaming server.

As mentioned above, streaming servers often offer simpler access to advanced features that you’d have to craft by hand with a multimedia framework. They also offer better integration with third-party services like digital rights management (DRM) and content delivery networks. Between the simplified setup, easier access to features, and improved integration with other services, packaged servers can dramatically accelerate getting your live streaming service up and running.

Once you’re operational, you’ll appreciate management interfaces that monitor the health and performance of your streaming infrastructure, track viewer analytics, manage streaming workflows, and make real-time adjustments. If you’re in a dynamic demand environment, some streaming servers offer built-in scalability features and load balancing to spread the load over multiple hardware transcoding resources. You’d have to build all that by hand or with plug-ins if using a multimedia framework.

The two potential downsides of streaming servers are cost and customizability. You’ll have to pay a monthly fee for some versions of these servers, and you may find it complicated or nearly impossible to add what you might consider to be essential features.

Other Streaming-Capable Programs

Most companies building their own live-streaming infrastructures will implement either a multimedia framework or a streaming server, but there are other programs that incorporate the core encoding and packaging functions. One such program is Norsk from id3as. Norsk bills itself as “an SDK that enables developers to easily create amazing, dynamic live video workflows and deploy them at any scale.” As such, it combines both video production and streaming server-related functions.

You see this in Figure 3. The top portion shows that Norsk supports the typical codecs and packaging formats deployed by live-streaming producers. At the bottom of the figure, you see that Norsk also offers production-oriented features like multiple camera support, graphics and overlays, and transitions.

Build Your Own Streaming Infrastructure - Software- diagram-4
Figure 3. Norsk offers both production and server-related functions.

Interestingly, Norsk doesn’t have a GUI, instead offering a high-level API to simplify configuration and operation, with a Workflow Visualizer component to view the running state of the application. In this fashion, Norsk attempts to provide the configurability of multimedia frameworks with the ease of operation of scripting-driven streaming media servers.

Finding a program like Norsk that combines transcoding and packaging with other essential streaming-related functions makes a lot of sense; there’s one less vendor to onboard and one less product to learn and support. As remote production becomes more common, we expect more programs like Norsk to become available.

Those are your high-level options. If you’re interested in learning more about these and other programs that can drive encoding and packaging for your live transcoder, plan to attend our upcoming symposium; details will be available in the next couple of weeks.

NETINT Quadra vs. NVIDIA T4 – Benchmarking Hardware Encoding Performance

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer

This article is the second in a series about benchmarking hardware encoding performance. In the first article, available here, I delineated a procedure for testing hardware encoders. Specifically, I recommended this three-step procedure:

  1. Identify the most critical quality and throughput-related options for the encoder.
  2. Test across a range of configurations from high quality/low throughput to low quality/high throughput to identify the operating point that delivers the optimum blend of quality and throughput for your application.
  3. Compute quality, cost per stream, and watts per stream at the operating point to compare against other technologies.

After laying out this procedure, I applied it to the NETINT Quadra Video Processing Unit (VPU) to find the optimum operating point and the associated quality, cost per stream, and watts per stream. In this article, we perform the same analysis on the NVIDIA T4 GPU-based encoder.

About The NVIDIA T4

The NVIDIA T4 is powered by NVIDIA Turing Tensor Cores and draws 70 watts in operation. Pricing varies by reseller, with $2,299 around the median price, roughly 50% higher than the $1,500 quoted for the NETINT Quadra T1 VPU in the previous article.

In creating the command line for the NVIDIA encodes, I checked multiple NVIDIA documents, including a document entitled Video Benchmark Assumptions, a blog post entitled Turing H.264 Video Encoding Speed and Quality, and a document entitled Using FFmpeg with NVIDIA GPU Hardware Acceleration that requires a login. I readily admit that I am not an expert on NVIDIA encoding, but the point of this exercise is not absolute quality as much as the range of quality and throughput that the hardware enables. You should check these documents yourself and create your own version of the optimized command string.

While there are many configuration options that impact quality and throughput, we focused our attention on two: lookahead and presets. As discussed in the previous article, the lookahead buffer allows the encoder to look at frames ahead of the frame being encoded, so it knows what is coming and can make more intelligent decisions. This improves encoding quality, particularly at and around scene changes, and it can improve bitrate efficiency. But lookahead adds latency equal to the lookahead duration, and it can decrease throughput.
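
For reference, both options map directly to FFmpeg’s NVENC parameters. The command below is a simplified single-stream sketch of that mapping, not the exact string used in testing; the input file, bitrate, and output name are placeholders:

  # -preset selects the quality/speed tradeoff; -rc-lookahead sets the
  # lookahead buffer in frames. Swap h264_nvenc for hevc_nvenc to test HEVC.
  ffmpeg -y -i input_1080p30.mp4 \
    -c:v h264_nvenc -preset p4 -rc-lookahead 15 \
    -b:v 4500k -maxrate 9000k -bufsize 9000k \
    -c:a copy output_nvenc.mp4

For a full hardware transcode, you can also decode on the GPU by adding -hwaccel cuda -hwaccel_output_format cuda ahead of the input.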

Note that while the NVIDIA documentation recommends a lookahead buffer of twenty frames, I use 15 in my tests because, at 20, the hardware decoder kept crashing. I tested a 20-frame lookahead using software decoding, and the quality differential between 15 and 20 was inconsequential, so this shouldn’t impact the comparative results.

I also tested using various NVIDIA presets, which, like all encoding presets, trade off quality against throughput. To measure quality, I computed the VMAF harmonic mean and low-frame scores, the latter a measure of transient quality. For throughput, I tested the number of simultaneous 1080p30 files the hardware could process at 30 fps. I then divided the card’s price and power draw (in watts) by the stream count to determine cost per stream and watts per stream.
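
If you want to run a similar measurement yourself, FFmpeg’s libvmaf filter is one way to compute VMAF (the JSON log includes pooled scores such as the harmonic mean), and the per-stream metrics are simple division. In the sketch below, the file names and eight-stream count are hypothetical placeholders; the price and wattage are the T4 figures quoted earlier:

  # Per-frame VMAF with FFmpeg's libvmaf filter (distorted input first,
  # reference second); the JSON log also contains pooled scores.
  ffmpeg -i encoded_1080p30.mp4 -i source_1080p30.mp4 \
    -lavfi libvmaf=log_path=vmaf.json:log_fmt=json -f null -

  # Cost/stream and watts/stream at the chosen operating point.
  # Price and power draw are the T4 figures quoted above; the stream count is
  # a placeholder for the number of simultaneous 1080p30 streams sustained at 30 fps.
  price=2299; watts=70; streams=8
  awk -v p="$price" -v w="$watts" -v s="$streams" \
    'BEGIN { printf "cost/stream: $%.2f\nwatts/stream: %.2f W\n", p/s, w/s }'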

As you can see in Table 1, I tested with a lookahead value of 15 for selected presets 1-9, and then with a 0 lookahead for preset 9. Line two shows the closest x264 equivalent score for perspective.

In terms of the operating point for comparing to Quadra, I chose the lookahead 15/preset 4 configuration, which yielded twice the throughput of preset 2 with only a minor reduction in VMAF harmonic mean. We will consider low-frame scores in the final comparisons.

In general, the presets worked as they should, with higher quality and lower throughput at the left end, and the reverse at the right end, though LA15/P4 performance was an anomaly since it produced lower quality and higher throughput than LA15/P6. In addition, dropping the lookahead buffer did not produce the performance increase that we saw with Quadra, though it also did not produce a significant quality decrease.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Table 1
Table 1. H.264 options and results.

Table 2 shows the T4’s HEVC results. Though quality was again near the medium x265 preset with several combinations, throughput was very modest at 3 or 4 streams at that quality level. For HEVC, LA15/P4 stands out as the optimal configuration, with four times or better throughput than other combinations with higher-quality output.

In terms of expected preset behavior, LA15/P4 was again quite the anomaly, producing the highest throughput in the test suite with slightly lower quality than LA15/P6, which should deliver lower quality. Again, switching from LA 15 to LA 0 produced neither the expected spike in throughput nor a drop in quality, as we saw with the Quadra for both HEVC and H.264.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Table 2
Table 2. HEVC options and results.

Quadra vs. T4

Now that we have identified the operating points for Quadra and the T4, let us compare quality, throughput, CAPEX, and OPEX. You see the data for H.264 in Table 3.

Here, the stream count was the same, so Quadra’s advantage in cost per stream and watts per stream stems from its lower cost and more efficient operation. At their respective operating points, the Quadra’s VMAF harmonic mean quality was slightly higher, with a more significant advantage in the low-frame score, a predictor of transient quality problems.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Table 3
Table 3. Comparing Quadra and T4 at H.264 operating points.

Table 4 shows the same comparison for HEVC. Here, Quadra output 75% more streams than the T4, which widens its cost-per-stream and watts-per-stream advantages. VMAF harmonic mean scores were again very similar, though the T4’s low-frame score was substantially lower.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Table 4
Table 4. Comparing Quadra and T4 at HEVC operating points. 

Figure 5 illustrates the low frames and the low-frame differential between the two files. It is the result plot from the Moscow State University Video Quality Measurement Tool (VQMT), which displays the VMAF score, frame by frame, over the entire duration of the two video files analyzed, with Quadra in red and the T4 in green. The top window shows the VMAF comparison for the full duration of both files, while the bottom window is a close-up of the highlighted region of the top window, right around the most significant downward spike at frame 1590.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Picture 1
Figure 5. The downward green spikes represent the low-frame scores in the T4 encode.

As you can see in the bottom window in Figure 5, the low-frame region extends for 2-3 frames, which might be borderline noticeable to a discerning viewer. Figure 6 shows a close-up of the lowest-quality frame, Quadra on the left, T4 on the right, and the dramatic difference in VMAF score, 87.95 versus 57, is clearly justified. Not surprisingly, PSNR and SSIM measurements confirmed these low frames.

Hardware Encoding - Benchmarking Hardware Encoding Performance by Jan Ozer - Picture 2
Figure 6. Quality comparisons, NETINT Quadra on the left, T4 on the right.

It is useful to track low frames because if they extend beyond 2-3 frames, they become noticeable to viewers and can degrade viewer quality of experience. Mathematically, in a two-minute test file, the impact of even 10 – 15 terrible frames on the overall score is negligible. That is why it is always useful to visualize the metric scores with a tool like VQMT, rather than simply relying on a single score.

Summing Up

Overall, you should consider the procedure discussed in this and the previous article as the most important takeaway from these two articles. I am not an expert in encoding with NVIDIA hardware, and the results from a single or even a limited number of files can be idiosyncratic.

Do your own research, test your own files, and draw your own conclusions. As stated in the previous article, do not be impressed by quality scores without knowing the throughput, and expect that impressive throughput numbers may be accompanied by a significant drop in quality.

Whenever you test any hardware encoder, identify the most important quality/throughput configuration options, test over the relevant range, and choose the operating point that delivers the best combination of quality and throughput. This gives you the best chance of achieving a meaningful apples-to-apples comparison between different hardware encoders that incorporates quality, cost per stream, and watts per stream.