Norsk and NETINT: Elevating Live Streaming Efficiency

With the growing demand for high-quality viewing experiences and the heightened attention on cost efficiency and environmental impact, hardware acceleration plays an ever-more-crucial role in live streaming.

Here at NETINT, we want users to take full advantage of our transcoding hardware, so we’re pleased to announce that id3as NORSK now offers exceptionally efficient support for NETINT’s T408 and Quadra video processing unit (VPU) modules.

Using NETINT VPUs, users can leverage the Norsk low-code live streaming SDK to achieve higher throughput and greater efficiency compared to running software on CPUs in on-prem or cloud configurations. Combined with Norsk’s proven high-availability track record, this makes it easy to deliver exceptional services with maximum reliability and performance at a never-before-available OPEX.

Norsk also takes advantage of Quadra’s hardware acceleration and onboard scaling to achieve complex compositions like picture-in-picture and resizing directly on the card. Even better, Norsk’s built-in ability to “do the right thing” also means that it knows when it can take advantage of hardware acceleration and when it can’t.  

 

For example, if you’re running Norsk on the T408, decoding will take place on the card, but Norsk will automatically use the host CPU for functions like picture-in-picture and resizing that the T408 doesn’t natively support, before returning the enriched media to the card for encoding. Scaling and resizing functions are native to Quadra VPUs, so they are performed onboard without the host CPU.
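The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the decision, not Norsk's API: the capability table, the stage names, and the `place_stages` function are hypothetical and are derived solely from the description in this article.

```python
# Hypothetical capability table, based only on the description above:
# both cards decode and encode, but only Quadra resizes and composes on the card.
CARD_CAPABILITIES = {
    "T408": {"decode", "encode"},
    "Quadra": {"decode", "encode", "resize", "picture_in_picture"},
}


def place_stages(card, stages):
    """Map each pipeline stage to 'card' or 'host_cpu' for the given card."""
    supported = CARD_CAPABILITIES[card]
    return {
        stage: ("card" if stage in supported else "host_cpu")
        for stage in stages
    }


pipeline = ["decode", "resize", "picture_in_picture", "encode"]
t408_plan = place_stages("T408", pipeline)
quadra_plan = place_stages("Quadra", pipeline)
print(t408_plan)
print(quadra_plan)
```

With the T408 entry, the two composition stages land on the host CPU between an on-card decode and an on-card encode; with the Quadra entry, every stage stays on the card.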

 

“As founding members of Greening of Streaming, we’re keenly aware of the pressing need to focus on energy efficiency at every point of the video stack,” says Norsk CEO Adrian Roe. “By utilizing the Quadra and T408 VPU modules, users can reduce energy usage while achieving maximum performance even on compute-intensive tasks. With Norsk seamlessly running on NETINT hardware, live streaming services can consume as little energy as possible while delivering a fantastic experience to their customers.” 

“Id3as has proven expertise in helping its customers produce polished, high-volume, compelling productions, and as a product, Norsk makes that expertise widely accessible,” commented Alex Liu, NETINT founder and COO. “With Norsk’s deep integration with our T408 and Quadra products, this partnership makes NETINT’s proven ASIC-based technology available to any video engineer seeking to create high-quality productions at scale.” 

Both Norsk and NETINT will be at IBC in Amsterdam, September 15-18. You can request a meeting with Norsk or NETINT, or visit NETINT at booth 5.A86.

ON-DEMAND: Adrian Roe - Make Live Easy with NORSK SDK

Simplify Building Your Own Streaming Cloud with Wowza

Transcoding and packaging software is a key component of any live-streaming cloud, and one of the most functional and flexible programs available is the Wowza Streaming Engine. During the symposium, Barry Owen, Chief Solutions Architect at Wowza, detailed how to create a scalable streaming infrastructure using the Wowza Streaming Engine (WSE).

He started by discussing Wowza’s history, from its formation in 2005 to its recent acquisition of FlowPlayer. After defining the typical live streaming production pipeline, Barry detailed how WSE can serve as an origin server, transcoder, and packager, ensuring optimal viewer experience. He discussed WSE’s adaptability, including its ability to scale through GPU- and VPU-based transcoding, and emphasized WSE’s deployment options, which range from on-premises to cloud-based infrastructures. He then outlined Wowza’s infrastructure for distributing to audiences large and small.

Barry concluded by living up to the session title, getting WSE up and running in under five minutes using Docker in a demo that you can watch at the end of this article.

Start Streaming in Minutes with Wowza Streaming Engine

The focus of Barry’s talk was how to create a highly scalable streaming infrastructure with Wowza Streaming Engine (WSE). He began by recounting Wowza’s history. Established in 2005, the company launched its inaugural product, the Wowza Media Server, in 2007. This was later complemented by the Wowza Cloud, a SaaS solution, in 2013. Since its inception, Wowza has grown to support over 6,000 customers in 170 countries and boasts more than 35,000 streaming implementations. Their products are responsible for 38 million video transcoding hours each month. Recently, the company acquired FlowPlayer, adding a premier video player to its product lineup.

Barry emphasized Wowza’s commitment to providing streaming solutions that are reliable, scalable, and adaptable. He noted the importance of customization in the streaming sector and highlighted the company’s robust support team and services, which are designed to ensure customer success.

Wowza Streaming Engine Functionality

Barry then moved to the heart of his talk. He set it up by illustrating the streaming pipeline, which begins with video capture from sources like cameras, encoders, or mobile devices (Figure 1). Within this pipeline, WSE serves as a comprehensive media server that’s capable of functioning as an origin server, transcoder, and packager in a single system.

In this role, WSE offers real-time encoding and transcoding, producing multiple-bitrate streams for optimal viewer experience. It also performs real-time packaging into formats like HLS and DASH, facilitating compatibility across devices, and handles ancillary functions like adding DRM and captions, ad insertion, and metadata handling. Once processed, the stream is ready for delivery to a vast audience through one or multiple CDNs, depending on the desired scale and workflow.

Figure 1. The role WSE plays in the streaming pipeline.

Then Barry dug deeper into the capabilities of the Wowza Streaming Engine, emphasizing its comprehensive nature as an end-to-end media server. These capabilities include:

  • Input Protocols: The Streaming Engine can ingest almost any input protocol, including RTSP, RTMP, SRT, WebRTC, HLS, and more.
  • Transcoding: WSE offers just-in-time, real-time transcoding with minimal latency. It also supports features like compositing and overlays, preparing the stream for packaging.
  • Packaging: WSE supports commonly used formats like HLS and DASH, as well as more specialty formats such as WebRTC, RTSP, and MPEG-TS.
  • Delivery: Wowza supports both push and pull models for stream delivery. It can integrate with multiple CDN vendors, including its own, and allows syndication to platforms like Facebook and LinkedIn.
  • Extensibility: A significant feature of the Streaming Engine is its flexibility. It offers a complete Java API for custom processing and a REST API for system command and control. WSE’s user interface (Streaming Engine Manager) is built on this REST API, demonstrating its functionality.
  • Configuration and Control: This Streaming Engine Manager allows users to manage one or more Streaming Engine instances from one web interface. Advanced users can also programmatically edit configurations to integrate with their systems.

Barry underscored WSE’s adaptability, highlighting its ability to cater to custom workflows, from complex ad insertions to machine learning applications. He also mentioned the availability of GitHub libraries with examples and encouraged exploring the Streaming Engine Manager for system configuration and monitoring.

Deploying Wowza Streaming Engine
Figure 2. WSE deployment options.

Barry next discussed the deployment options for the Wowza Streaming Engine. These include:

  • On-Premises: WSE can be deployed on-premises, offering cost-effective and efficient solutions, especially in high-density scenarios or when you have access to your own data center.
  • Managed Hardware Platforms: WSE can be set up on platforms like Linode, providing access to bare metal in a managed environment.
  • Public Clouds: Pre-built images are available for major cloud platforms, allowing quick setup. Users can choose from marketplace images or standard ones, where they bring their own license key. Pre-configurations for common use cases are also provided.
  • Docker: Wowza offers Docker images, and Barry emphasized Docker’s significance in automating deployment, scaling, and ensuring high availability in modern infrastructure setups.

Barry emphasized WSE’s adaptability to various deployment needs, from traditional setups to modern cloud-based infrastructures.

Scaling Wowza Streaming Engine
Figure 3. Scaling stream processing with GPUs and VPUs (ASICs).

Barry shifted the discussion to scaling and stream processing, comparing the different approaches and addressing their pros and cons. For stream processing, WSE can use CPU-, GPU-, and VPU-based transcoding. Here’s a brief discussion of each option.

CPU-Based Transcoding:

Barry highlighted the traditional approach of using software CPU-based transcoding. The Wowza Streaming Engine can efficiently leverage the processing power of CPUs to handle video streams. This method is straightforward and can be scaled by adding more servers or opting for higher-capacity CPUs.

He shared that CPU-based transcoding offers a wide range of adaptability, allowing for various encoding and decoding combinations. Given that CPUs are a standard component in servers, there’s no need for specialized hardware. On the other hand, he pointed out CPUs aren’t the best option for achieving high density or low power consumption.

GPU-Based Transcoding:

Regarding GPU-based transcoding, Barry stated that GPUs can handle a significant number of streams, and take on the heavy lifting from the CPU, ensuring smoother operation. However, they are expensive, and not exclusively designed for video processing, which can lead to higher power consumption.

VPU-Based Transcoding:

Barry expressed considerable enthusiasm for the capabilities of Video Processing Units (VPUs), or ASIC-based transcoders. Unlike general-purpose CPUs and GPUs, VPUs are purpose-built for video processing which allows them to handle video streams with remarkable efficiency. In recent years, VPUs have emerged as a promising solution, especially when it comes to achieving high-density streaming. Barry noted that these units not only offer a competitive price per channel but also boast minimal power consumption.

The Evolution Towards Specialization:

Drawing from his insights, Barry seemed to suggest a trend in the streaming industry: a move towards more specialized solutions. While CPUs and GPUs have been stalwarts in the industry, the rise of VPUs indicates a shift towards tools and technologies tailored specifically for streaming. This specialization promises not only enhanced performance but also greater efficiency in terms of cost and energy consumption.

Distributing Your Streams

Barry concluded his talk by discussing the distribution options available from Wowza. He emphasized the importance of adaptability when it comes to scaling outputs, especially given the diverse audience sizes that streaming services might cater to. WSE offers multiple distribution options to ensure that content reaches its intended audience efficiently, regardless of its size.

On-Premises Scaling:

One of the primary methods Barry discussed was scaling on-premises. By simply adding more servers to the existing infrastructure, streaming services can handle a larger load. This method is particularly useful for organizations that already have a significant on-premises setup and are looking to leverage that infrastructure.

CDN (Content Delivery Network):

For those expecting a vast number of viewers, Barry recommended using a content delivery network, or CDN. CDNs are designed to handle large-scale content delivery, distributing the content across a network of servers to ensure smooth and efficient delivery to a global audience. By offloading the streaming to a CDN, services can ensure that their content reaches viewers without any hitches, even during peak times.

Hybrid Approaches:

Barry found the hybrid model particularly intriguing. This approach combines the strengths of both on-premises scaling and CDNs. For instance, an organization could use its on-premises setup for regular streaming to a smaller audience. However, during events or times when a larger audience is expected, they could “burst” to the cloud, leveraging the power of CDNs to handle the increased load. This model offers both cost efficiency and scalability, ensuring that services are not overextending their resources during regular times but are also prepared for peak events.
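As a rough illustration of the "burst" idea, the following Python sketch splits an expected audience between on-premises servers and a CDN. The function name, parameters, and viewer counts are hypothetical; they are not Wowza settings and were not part of Barry's talk.

```python
def plan_delivery(expected_viewers, on_prem_capacity):
    """Serve viewers on-prem up to capacity and send the overflow to a CDN."""
    if expected_viewers < 0 or on_prem_capacity < 0:
        raise ValueError("viewer counts must be non-negative")
    on_prem = min(expected_viewers, on_prem_capacity)
    return {"on_prem": on_prem, "cdn": expected_viewers - on_prem}


# A regular day fits on the existing servers; a large event bursts to the CDN.
regular_day = plan_delivery(expected_viewers=800, on_prem_capacity=1_000)
big_event = plan_delivery(expected_viewers=25_000, on_prem_capacity=1_000)
print(regular_day)
print(big_event)
```

In practice the trigger would come from monitoring or event scheduling rather than a fixed number, but the split itself is this simple.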

In essence, Barry underscored the importance of flexibility in scaling. The ability to choose between on-premises, CDN, or a hybrid approach ensures that streaming services can adapt to meet any audience size.

Figure 4. Options for distributing to various audience sizes.

Start Streaming in Minutes with WSE: The Demonstration

Figure 5. Click the image to run Barry’s demo.

Barry then ran a recorded demonstration to illustrate the simplicity of setting up the Wowza Streaming Engine using Docker – you can run this below. He ran the demo using Docker Desktop and Docker Compose, and the objective was to launch two containers: one for the Wowza Streaming Engine and another for its manager.

He began by starting the services with the command ‘docker compose up’. Since he recorded the demo on an M1 Mac, he noted that the process might be slightly slower due to the Rosetta translation layer. As the services initialized, Barry explained the YAML file he used to provision these services. The file contained configurations for both the Streaming Engine and its Manager, detailing aspects like image sources, environment variables, and port settings.

With the services up and running, Barry navigated to Docker Desktop to monitor the performance of the two launched services, observing metrics like CPU and memory usage. He then accessed the Streaming Engine Manager via a web browser. Barry highlighted the versatility of Docker Compose, mentioning that it can manage multiple service instances, which can be beneficial for scalability, high availability, or clustering.

Upon accessing the manager, Barry logged in to view the server’s health snapshot, providing insights into its status. He then navigated to a pre-configured application named ‘live’ to stream content. Using a live streaming program called Open Broadcaster Software on his system, Barry set it up to stream to the server, pointing out the server’s recognition of the incoming stream and its subsequent packaging.

Returning to the manager, Barry verified the incoming stream’s presence and details. He then extracted the HLS URL for the stream, which he opened in a Safari browser tab to demonstrate live playback. The stream played seamlessly, underscoring the efficiency and ease of the entire process.

The demo showcased how, in a matter of minutes, you can configure, initiate, and stream using the Wowza Streaming Engine. You can get started yourself by downloading a trial version of WSE.

ON-DEMAND:
Barry Owen, Start Streaming in Minutes with Wowza Streaming Engine

Simplify Building Your Own Streaming Cloud with GPAC

Romain Bouqueau is CEO of Motion Spell and one of the principal architects of the GPAC open-source software, one of the three software alternatives presented in the symposium. He spoke about the three challenges facing his typical customers: features, cost, and flexibility, and identified how GPAC delivers on each challenge.

Then, he illustrated these concepts with three impressive case studies: Synamedia/Quortex, Instagram, and Netflix. Overall, Romain made a strong case for GPAC as the transcoding/packaging element of your live streaming cloud.

Romain began his talk with an excellent summary of the situation facing many live-streaming engineers. “It’s a pleasure to discuss the challenges of building your own live-streaming cloud. Cloud services are convenient, but once you scale, you may realize that you’re paying too much and you are not as flexible as you’d like to be. I hope to convince you that the cost of customization that you have when using GPAC is actually an investment with a very interesting ROI if you make the right choices. That’s what we’re going to talk about.”

Figure 1. About Romain, GPAC, and Motion Spell.

Then, he briefly described his background as a principal architect of the GPAC open-source software, which he has contributed to for over 15 years. In this role, Romain is known for his advocacy of open source and open standards and as a media streaming entrepreneur. His primary focus has been on GPAC, a multimedia framework recognized for its emphasis on modularity and standards compliance.

He explained that GPAC offers tools for media content processing, inspection, packaging, streaming, playback, and interaction. Unlike many multimedia frameworks that cater to 2D TV-like experiences, GPAC is characterized by versatility, controlled latency, and the ability to support various scenarios, including hybrid broadcast broadband setups, interactivity, scripting, virtual reality, and 3D scenes.

Romain’s notable achievements include streamlining the ISO Base Media File Format (ISO BMFF) used in formats like MP4, CMAF, DASH, and HLS. His work earned recognition through a Technology and Engineering Emmy Award. To facilitate the wider use of GPAC, Romain established Motion Spell, which serves as a bridge between GPAC and its practical applications. Motion Spell provides consulting, support, and training, acting as the exclusive commercial licensor of GPAC.

During his introduction, Romain discussed challenges faced by companies in choosing between commercial solutions and open source for video encoding and packaging. He posited that many companies often lack the confidence and necessary skills to fully implement GPAC but emphasized that despite this, the implementation process is both achievable and simpler than commonly assumed.

He shared that his customers face three major challenges: features, cost, and flexibility. He addressed each in turn.

Features

Figure 2. The three challenges facing those building their live streaming cloud.

The first challenge Romain highlighted relates to features and capabilities. He advised the audience to create a comprehensive list that encompasses the needed capabilities, including codecs, formats, containers, DRMs, captions, and metadata management.

He also underscored the importance of seamless integration with the broader ecosystem, which involves interactions with external players, analytics probes, and specific content protocols. Romain noted that while some solutions offer user-friendly graphical interfaces, deeper configuration details often need to be addressed to accommodate diverse codecs, parameters, and use cases, especially at scale.

Highlighting Netflix’s usage of GPAC, Romain emphasized that GPAC is well-equipped to handle features and innovation, given its research and standardization foundation. He acknowledged that while GPAC is often a step ahead in the industry, it cannot implement everything alone. Thus, sponsorship and contributions from the industry are crucial for the continued development of this open-source software.

Romain explained that GPAC’s compatibility with the ecosystem is a result of its broad availability. Its role as a reference implementation, driven by standardization efforts, makes it a favored choice. Additionally, he mentioned that Motion Spell’s efforts have led to GPAC becoming part of numerous plugin systems across the industry.

Cost

The second challenge highlighted by Romain is cost optimization. He explained that costs are typically divided into Capital Expenditure (CAPEX) and Operational Expenditure (OPEX). He noted that GPAC, being written in the efficient C programming language, benefits from rigorous scrutiny from the open-source community, making it highly efficient. He acknowledged that while GPAC offers various features, each use case varies, leading to questions about resource allocation. Romain encouraged the audience to ask, for example, whether every channel really needs a CDN and whether all content really needs a premium encoder.

Regarding CAPEX, Romain mentioned integration costs associated with open-source software, emphasizing that some costs might be challenging to evaluate, such as error handling. He referenced the Synamedia/Quortex architecture as an example of efficient error management. Romain also addressed the misconception that open source implies free software, referencing a seminar he participated in that compared the costs of different options.

He shared an example of a broadcaster with a catalog of 100,000 videos and 500 concurrent streams. The CAPEX for packaging ranged from $100,000 to $200,000, depending on factors like developer rates and location, with running costs being relatively low compared to transcoding costs.

Romain revealed that, based on his research, open source consistently ranked as the most cost-efficient option or a close competitor across different use cases. He concluded that combining GPAC with Motion Spell’s professional services and efficient encoding appliances like NETINT’s aligns well with the industry’s efficiency challenges.

Flexibility

The final challenge discussed by Romain was flexibility, emphasizing the importance of moving swiftly in a fast-paced environment. He described how Netflix successfully transitioned from SVOD to AVOD, adapted from on-demand to live streaming, switched from H.264 to newer codecs, and consolidated multiple containers into one over short time frames, contributing to their profitability. Romain underlined the potential for others to achieve similar success using GPAC.

He introduced a new application within GPAC called “gpac”, designed to build customized media pipelines. In contrast to historical GPAC applications that offered fixed media pipelines, this new “gpac” application enables users to create tailored pipelines to address specific requirements. This includes transcoding, packaging, content protection, networking, and, in general, any feature you need for your private cloud.

The Synamedia/Quortex “just-in-time everything” paradigm

Figure 3. Motion Spell’s work with Quortex, which was acquired by Synamedia.

Romain then moved on to the Synamedia/Quortex use case, which illustrated how GPAC addresses the features challenge. He described Quortex’s innovative “just-in-time everything” paradigm for media pipelines.

Unlike the traditional 24/7 transcoder that is designed to never fail and requires backup solutions for seamless switching, Quortex divides the media pipeline into small components that can fail and be relaunched when necessary. This approach is particularly effective for live streaming scenarios, offering low latency.

Romain highlighted that the Quortex approach is highly adaptable as it can run on various instances, including cloud instances that are cost-effective but might experience interruptions. The system generates content on-demand, meaning that when a user wants to watch specific content on a device, it’s either cached, already generated, or created just-in-time. This includes packaging, transcoding, and other media processing tasks.
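The pattern described above, where content is served from cache if it exists and is otherwise generated on request by small components that may fail and be relaunched, can be sketched as follows. This is a minimal illustration under stated assumptions, not Quortex's or GPAC's implementation: the `JustInTimeOrigin` class, the segment key format, and the `flaky_worker` stand-in are all hypothetical.

```python
class JustInTimeOrigin:
    """Serve segments from cache, generating them on first request."""

    def __init__(self, generate, max_attempts=3):
        self._generate = generate        # callable: segment key -> bytes
        self._max_attempts = max_attempts
        self._cache = {}
        self.generated = 0               # number of successful generations

    def get(self, segment_key):
        if segment_key in self._cache:   # already generated: no processing
            return self._cache[segment_key]
        last_error = None
        for _ in range(self._max_attempts):
            try:
                data = self._generate(segment_key)
            except RuntimeError as err:  # worker failed: relaunch and retry
                last_error = err
                continue
            self._cache[segment_key] = data
            self.generated += 1
            return data
        raise RuntimeError(f"could not generate {segment_key}") from last_error


calls = {"n": 0}


def flaky_worker(key):
    """Hypothetical worker that is interrupted on its first run."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("worker interrupted")
    return f"segment:{key}".encode()


origin = JustInTimeOrigin(flaky_worker)
first = origin.get("ch1/720p/seg42")   # fails once, is relaunched, then cached
second = origin.get("ch1/720p/seg42")  # served from cache, no worker call
print(calls["n"], origin.generated)
```

Because each unit of work is small and repeatable, an interrupted worker costs one retry rather than a failover to a standby transcoder.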

Romain attributed the success of the development project to Quortex’s vision and talented teams, as well as the strategic partnership with Motion Spell. He also shared that after project completion, Synamedia acquired Quortex.

Instagram

Figure 4. GPAC helped Instagram cut compute times by 94%.

The second use case addressed the challenge of cost and involved Instagram, a member of the Meta Group. According to Romain, Instagram used GPAC’s MP4Box to reduce video compute times by an impressive 94%. This helped avert a capacity shortage that was otherwise expected within twelve months, preserving the platform’s ability to provide video uploads for all users.

Romain presented Instagram’s approach as noteworthy because it emphasizes the importance of optimizing costs based on content usage patterns. The platform decided to prioritize transmission and packaging of content over transcoding, recognizing that a significant portion of Instagram’s content is watched only a few times. In this scenario, the cost of transcoding outweighs the savings on distribution expenses. As Romain explained, “It made more sense for them to package and transmit most content instead of transcoding it, because most of Instagram’s content is watched only a few times. The cost of transcoding, in their case, outweighs the savings on the distribution cost.”
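The trade-off Romain describes is a break-even calculation: transcoding pays off only when the delivery savings across all views exceed the one-time transcoding cost. The Python sketch below shows the arithmetic. Every number in it is a made-up placeholder, not an Instagram figure, and the function names are hypothetical.

```python
import math


def break_even_views(size_gb_packaged, size_gb_transcoded,
                     delivery_cost_per_gb, transcode_cost):
    """Views needed before delivery savings repay the transcoding cost."""
    saved_per_view = (size_gb_packaged - size_gb_transcoded) * delivery_cost_per_gb
    if saved_per_view <= 0:
        return math.inf  # transcoding never pays for itself
    return transcode_cost / saved_per_view


def transcoding_pays_off(expected_views, **costs):
    """True when the expected audience is past the break-even point."""
    return expected_views > break_even_views(**costs)


# Placeholder costs: a 50 MB file that transcoding would shrink to 30 MB.
costs = dict(size_gb_packaged=0.05, size_gb_transcoded=0.03,
             delivery_cost_per_gb=0.02, transcode_cost=0.01)
threshold = break_even_views(**costs)
rarely_watched = transcoding_pays_off(5, **costs)
popular = transcoding_pays_off(1_000, **costs)
print(round(threshold), rarely_watched, popular)
```

With these placeholder values the break-even point is about 25 views, so a video watched only a handful of times is cheaper to package and deliver as is.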

According to Romain, this strategy aligns with the broader efficiency trend in the media tech industry. By adopting a combined approach, Instagram used lower quality and color profiles for less popular content, while leveraging higher quality encoders for content requiring better compression. This optimization was possible because Instagram controls its own encoding infrastructure, which underscores the value of open-source solutions in providing control and flexibility to organizations.

The computational complexity of GPAC’s packaging is close to a bit-for-bit copy, contributing to the 94% reduction in compute times. Romain felt that Instagram’s successful outcome exemplifies how open-source solutions like GPAC can empower organizations to make significant efficiency gains while retaining control over their systems.

Netflix

Figure 5. GPAC helped Netflix transition from SVOD to AVOD, from On-Demand to live, and from H.264 to newer codecs.

The final use case addresses the challenge of flexibility and involves a significant collaboration between GPAC, Motion Spell, and Netflix. According to Romain, this collaboration had a profound impact on Netflix’s video encoding and packaging platform, and contributed to an exceptional streaming experience for millions of viewers globally.

At the NAB Streaming Summit, Netflix and Motion Spell took the stage to discuss the successful integration of GPAC’s open-source software into Netflix’s content operations. During the talk, Netflix highlighted the ubiquity of ISO BMFF (the ISO Base Media File Format) in their workflows and emphasized their commitment to open standards and innovation. The alignment between GPAC and Netflix’s goals allowed them to leverage GPAC’s innovations for free, thanks to sponsorships and prior implementations.

Romain explained how Netflix’s transformation from SVOD to AVOD, from On-Demand to live, and from H.264 to newer codecs was facilitated by GPAC’s ease of integration and efficiency in operations. In this fashion, he asserted, the collaboration between Motion Spell and Netflix exemplifies the capacity of open-source solutions to drive innovation and adaptability.

Romain further described how GPAC’s rich feature set, rooted in research and standardization, offers capabilities beyond most publishers’ current needs. The unified “gpac” executable simplifies deployment, making it accessible for service implementation. Leveraging open-source principles, GPAC proves to be cost-competitive and easy to integrate. Motion Spell’s role in helping organizations maximize GPAC’s potential, as demonstrated with Netflix, underscores the practical benefits of the collaboration.

Romain summarized how GPAC’s flexibility empowers organizations to optimize and differentiate themselves rapidly. Examples like Netflix’s interactive Bandersnatch, intelligent previews, exceptional captioning, and accessibility enhancements showcase GPAC’s adaptability to evolving demands. Looking forward, Romain described how user feedback continues to shape GPAC’s evolution, ensuring its continued improvement and relevance in the media tech landscape.

With a detailed description of GPAC’s features and capabilities, underscored by very relevant case studies, Romain clearly demonstrated how GPAC can help live streaming publishers overcome any infrastructure-related challenge. And for those who would like to learn more, or need support or assistance integrating GPAC into their workflows, he invited them to contact him directly.

ON-DEMAND:
Romain Bouqueau, Deploying GPAC for Transcoding and Packaging

Simplify Building Your Own Streaming Cloud with NORSK SDK

Adrian Roe from id3as discussed Norsk, a technology designed to simplify the building of large-scale media workflows. id3as, originally a consultancy-led organization, works with major clients like DAZN and Nasdaq and is now pivoting to concentrate on Norsk, which it sells as an SDK. The technology underlying Norsk is responsible for delivering hundreds of thousands of live events annually and offers extensive expertise in low-latency and early-adoption technologies.

Adrian emphasized the company’s commitment to reliability, especially during infrastructure failures, and its initiatives in promoting energy-efficient streaming, including founding the Greening of Streaming organization. He also highlighted that about half of their deployments are cloud-based, suitable for fluctuating workloads, while the other half are on-premises or hybrid models, often driven by the need for high density at low cost and low energy consumption.

Encoding Infrastructure is Simpler and Cheaper than Ever Before

The focus of the symposium was creating your own encoding infrastructure, and Adrian next turned to how new technologies are simplifying this and making it more affordable. For example, Adrian mentioned that advancements like NETINT’s Quadra video processing units (VPUs) are changing the game, allowing some clients to consider shifting back to on-premises solutions.

Then, he described a recent server purchase to highlight the advancements in computing hardware capabilities. The server, which is readily available off-the-shelf and not particularly expensive, boasts impressive specs with 256 physical cores, 512 logical cores, and room for 24 U.2 cards like NETINT’s T408 or Quadra T1U.

Adrian then shared that during load testing, the server’s CPU profile was so extensive that it exceeded the display capacity of his screen, and he joked that it gave him an excuse to file an expense report for a new monitor. This anecdote emphasized the enormous processing capacity now available in relatively affordable hardware setups. The server occupies just 2U of rack space, and Adrian speculated that it could potentially deliver hundreds of channels in a fully loaded configuration, showcasing the leaps in efficiency and power in modern server hardware.

Figure 2. Infrastructure is getting cheaper and more capable.

Why Norsk?

Adrian then shifted his focus to Norsk. He emphasized that Norsk is designed to cater to large broadcasters and systems integrators who require more than just off-the-shelf solutions. These clients often need specialized functionalities, like the ability to make automated decisions such as switching to a backup source if the primary one fails, without the need for human intervention.

They may also require complex multi-camera setups and dynamic overlays of scores or graphics during live events. Norsk is engineered to simplify these historically challenging tasks, enabling clients to easily put together sophisticated streaming solutions.

Figure 3. Why Norsk in a nutshell.

He also pointed out that while some existing solutions may offer these features out of the box, creating such capabilities from scratch usually requires a significant engineering effort and demands professionals with advanced skills and a deep understanding of media technology, including intricate details of different video and container formats and how to handle them.

According to Adrian, Norsk eliminates these complexities, making it easier for companies to implement advanced streaming functionalities without the need for specialized knowledge. In short, Norsk fills the gap in the market for large broadcasters and systems integrators who require customized, automated decision-making capabilities in their streaming solutions.

Norsk In Action

Adrian then began demonstrating Norsk’s operation. He started by showing Figure 4 as an example of an output that Norsk might produce. This involved multiple inputs and overlays of scores or graphics that might need to update dynamically.

Figure 4. A typical production with multiple inputs and overlays that needed to change dynamically.

Figure 5 shows, in its entirety, the code Norsk uses to produce this output via its “low-code” approach. The top section of the code defines the inputs, outputs, and transformation nodes. In this example, Norsk ingests RTMP and SRT (plus a logo from a file) and publishes the output over WHEP, the WebRTC-HTTP Egress Protocol. However, with Norsk it is easy to accommodate any of the common formats; for example, to change the output to (Low Latency) HLS, you would simply replace the “whep” output with HLS, and you’d be done.

Figure 5. Norsk’s low-code approach to the production shown in Figure 4.

The next section of code directs how the media flows between the nodes. Compose takes the video from the various inputs, while the audio mixer combines the audio from inputs 1 and 2. Finally, the WHEP output subscribes to the outputs of the audio mixer and compose nodes. That’s all the code needed to create a complex picture-in-picture composition.
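To make the subscription idea concrete, here is a minimal Python sketch of the media flow just described. It is not Norsk code and does not use the Norsk SDK; the `Node` class, the `subscribe` method, and all node names are hypothetical and exist only to illustrate how outputs can declare which upstream nodes they take audio and video from.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical processing node (illustration only, not a Norsk SDK class)."""
    name: str
    sources: list = field(default_factory=list)  # (upstream node, media kind) pairs

    def subscribe(self, upstream, kinds):
        """Record that this node consumes the given media kinds from upstream."""
        for kind in kinds:
            self.sources.append((upstream, kind))


def describe(node):
    """Return one 'upstream -> node (kind)' string per subscription."""
    return [f"{up.name} -> {node.name} ({kind})" for up, kind in node.sources]


# Inputs, transformation nodes, and output, mirroring the example in the text.
rtmp = Node("rtmp_in")
srt = Node("srt_in")
logo = Node("logo_file")
compose = Node("compose")
mixer = Node("audio_mixer")
whep = Node("whep_out")

# Compose takes video from every input; the mixer takes audio from inputs 1 and 2.
for src in (rtmp, srt, logo):
    compose.subscribe(src, ["video"])
for src in (rtmp, srt):
    mixer.subscribe(src, ["audio"])

# The output subscribes to the composed video and the mixed audio.
whep.subscribe(compose, ["video"])
whep.subscribe(mixer, ["audio"])

for line in describe(whep):
    print(line)
```

In this sketch, swapping the output format would mean replacing only the `whep` node; the subscriptions themselves would stay the same, which is the property the low-code approach relies on.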

Adrian then went over the building blocks of how Norsk solutions can be constructed.  This started with an example of a pass-through setup where an RTMP input is published as a WebRTC output (Figure 6). With Norsk, all that’s needed is to specify that the output should get its audio and video from a particular input node, in this case, RTMP.

He then shared that Norsk is designed to be format-agnostic so that if the input node changes to another format like SRT or SDI, everything else in the setup will continue to function seamlessly. This ease of use allows for the quick development of sophisticated streaming solutions without requiring deep technical expertise.

Figure 6. A simple example of an RTMP input published as WebRTC.

Adrian then described how Norsk handles potential incompatibilities that might arise in a workflow. In the above example, he noted that WebRTC audio is typically delivered using the Opus codec, which RTMP does not support.

In these cases, Norsk automatically identifies the incoming audio codec (in this case, AAC) and transcodes it to Opus for compatibility with WebRTC. It also changes the encapsulation of the H.264 video for WebRTC compatibility. These automated adjustments showcase Norsk’s ability to simplify complex streaming workflows by making intelligent decisions to ensure compatibility and functionality.
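The decision described here can be sketched in a few lines of Python. This is an illustration of the general idea only, not Norsk’s implementation: the `OUTPUT_AUDIO_CODECS` table and the `plan_audio` function are hypothetical, and the codec lists are deliberately simplified assumptions.

```python
# Audio codecs each output type is assumed to accept (simplified for illustration).
OUTPUT_AUDIO_CODECS = {
    "webrtc": {"opus"},
    "hls": {"aac"},
}


def plan_audio(input_codec, output):
    """Decide whether audio can pass through or must be transcoded."""
    accepted = OUTPUT_AUDIO_CODECS[output]
    if input_codec in accepted:
        return "passthrough"
    # Pick a deterministic target from the codecs the output accepts.
    target = sorted(accepted)[0]
    return f"transcode {input_codec} -> {target}"


# AAC arriving over RTMP must be converted for a WebRTC output...
print(plan_audio("aac", "webrtc"))
# ...but can be passed through untouched to an HLS output.
print(plan_audio("aac", "hls"))
```

The point of the sketch is that the application code never mentions Opus; the mismatch is detected from the input and output types and resolved automatically.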

Figure 7. Norsk will automatically adjust your workflow to make it work; in this case, converting AAC to Opus and encapsulating the H.264 encoded video for WebRTC output.

Next in the quick tour of “building blocks,” Adrian showed how easy it is to build a source switcher, allowing the user to switch dynamically between two camera inputs (Figure 8). He explained how id3as’ low-code approach makes it easy and natural to extend this, for example, to handle an unknown number of sources that might come and go during a live event.

Figure 8. A simple production with two cameras, a source switcher, and WebRTC output.
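As a rough illustration of the switching logic, the following Python sketch tracks sources that come and go and falls back automatically when the active source disappears. It is not Norsk code; the `SourceSwitcher` class and the camera names are hypothetical, and the fallback policy (use the oldest remaining source) is an assumption chosen for simplicity.

```python
class SourceSwitcher:
    """Hypothetical switcher that tracks live sources and one active selection."""

    def __init__(self):
        self.sources = []   # connected source names, in arrival order
        self.active = None  # name of the source currently on air

    def add(self, name):
        """Register a newly connected source; the first one goes on air."""
        if name not in self.sources:
            self.sources.append(name)
        if self.active is None:
            self.active = name

    def remove(self, name):
        """Drop a disconnected source and fall back if it was on air."""
        if name in self.sources:
            self.sources.remove(name)
        if self.active == name:
            self.active = self.sources[0] if self.sources else None

    def switch_to(self, name):
        """Put a connected source on air."""
        if name not in self.sources:
            raise ValueError(f"unknown source: {name}")
        self.active = name


switcher = SourceSwitcher()
switcher.add("camera1")
switcher.add("camera2")
switcher.switch_to("camera2")
switcher.remove("camera2")  # camera2 fails, so the switcher falls back
print(switcher.active)
```

The same pattern covers the automated failover case mentioned earlier: a primary source failing is just a `remove` call that triggers the fallback without human intervention.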

According to Adrian, this simplicity allows engineers building solutions with Norsk to focus on the user experience they want to deliver to their customers as well as how to automate and simplify operations. They can focus on the intended result, not on the highly complex media technology required to deliver that result. This puts their logic into a very transparent context and simplifies building an application that delivers what’s intended.

Visualizing Productions

To better manage and control operations, Norsk supports visualizations via an OpenTelemetry API, which enables real-time data retrieval that can feed monitoring and decisioning systems. In addition to simple integration with these monitoring tools, Norsk includes the visualizer shown in Figure 9, which renders this data as an easy-to-understand flow of media between nodes. You’ll see two more examples of this below.

Figure 9. Norsk’s workflow visualization makes it simple to understand the media flow within an application.

Adrian then returned to the first picture-in-picture application shown to illustrate how the effect was created. You see that it’s very easy to position, size, and control each of the three elements, so the engineer can focus on the desired output, not anything to do with what the media itself looks like.

Figure 10. Integrating three production inputs into a picture-in-picture presentation in Norsk.

Adrian highlighted the convenience and flexibility of Norsk’s low-code approach by describing how the system handles dynamic updates and configurations using code. He emphasized that the entire process of making configuration changes, like repositioning embedded areas or switching sources, involves just a few lines of code. This approach allows users to easily build complex functionalities like a video mixer with minimal engineering effort.

Additionally, Adrian described how overlays are seamlessly integrated into the workflow. He explained that a browser overlay is treated as just another source which can be transformed and composed alongside other sources. By combining and outputting these elements, a sophisticated output with overlays can be achieved with minimal code.

Adrian emphasized that the features he demonstrated are sufficient to build a comprehensive live production system using Norsk like that shown in Figure 11. With Norsk’s low-code approach, he asserted, there are no additional complex calls required to achieve the level of sophistication demonstrated. With Norsk, he reiterated, engineers building media applications can focus on creating the desired user experience rather than dealing with intricate technical details.

Figure 11. Norsk enables productions like this with just a few lines of code.

Taking a big-picture view of how productions are created and refined, Adrian shared how the entire process of describing media requirements and building proofs of concept is streamlined with Norsk’s approach. With just a few lines of code, proofs of concept can be developed in a matter of hours or days. This leads to shorter feedback cycles with potential users, enabling quicker validation of whether the solution meets their needs. In this manner, Adrian noted, Norsk enables rapid feature development and quicker feature launches to the market.

Integrating Encoding Hardware

Adrian then shifted his focus to integrations with encoding hardware, noting that many customers have production hardware that utilizes transcoders and VPUs like those supplied by NETINT to achieve high-scale performance. However, the development teams might not have the same production setup for testing and development purposes. Norsk addresses this challenge by providing an easy way for developers to work productively on their applications without requiring the exact production hardware.

You see this in Figure 12, where developers configure different settings for different environments. For instance, in a production or QA environment, the output settings could be configured for 1080p at 60 frames per second with specific Quadra configurations.

In contrast, in a development environment, the output settings might be configured for the x264 codec outputting 720p with different parameters, such as the ultrafast preset and zerolatency tuning. This approach allows engineers to have a productive development experience while not requiring the same processing power or hardware as the production setup.

Figure 12. Norsk can use one set of transcoding parameters for development (on the right), and another for production.
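The per-environment configuration idea can be sketched as follows. This is a generic Python illustration rather than Norsk’s configuration format: the `ENCODE_PROFILES` dictionary, its key names, the `encode_settings` function, and the `APP_ENV` environment variable are all assumptions made for the example. Only the values (1080p at 60 fps on Quadra for production, 720p x264 with the ultrafast preset and zerolatency tuning for development) come from the description above.

```python
import os

# Hypothetical encoder profiles keyed by environment name.
ENCODE_PROFILES = {
    "production": {
        "encoder": "quadra",  # hardware encode on the NETINT card
        "width": 1920,
        "height": 1080,
        "fps": 60,
    },
    "development": {
        "encoder": "x264",    # software encode on a developer machine
        "width": 1280,
        "height": 720,
        "preset": "ultrafast",
        "tune": "zerolatency",
    },
}


def encode_settings(env=None):
    """Return the encoder profile for the given (or ambient) environment."""
    env = env or os.environ.get("APP_ENV", "development")
    return ENCODE_PROFILES[env]


print(encode_settings("production"))
print(encode_settings("development"))
```

Because the rest of the application only asks for "the encode settings," the same workflow code can run on a laptop without a VPU and on production servers with one.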

Adrian then described how Norsk maximizes the acceleration hardware capabilities of third-party transcoders to optimize performance, sharing that with the NETINT cards, Norsk outperformed FFmpeg. As a general principle, when using hardware transcoders, it’s more efficient to keep the processing on the hardware as much as possible to avoid unnecessary data transfers.

Adrian provided a comparison between scenarios where hardware acceleration is used and scenarios where it’s not. In one example, he showed how a NETINT T408 was used for hardware decoding, but some manipulations like picture-in-picture and resizing weren’t natively supported by the hardware. In this case, Norsk pulled the content to the CPU, performed the necessary manipulations, and then sent it back to the hardware for encoding (Figure 13).

Figure 13. Working with the T408, Norsk had to scale and overlay via the host CPU.

In contrast, with a Quadra card that does support onboard scaling and overlay, Norsk performed these functions on the hardware, using exactly the same application code as for the T408 version (Figure 14). This way, Adrian emphasized, Norsk maximized the efficiency of the hardware transcoder and optimized overall system performance.

Figure 14. Norsk was able to scale and overlay on the Quadra using the same code as on the T408.
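The T408 and Quadra scenarios differ only in where each step runs, which the following Python sketch tries to capture. It is a simplified model, not Norsk’s scheduler: the `CARD_CAPABILITIES` table, `place_steps`, and `transfers` are hypothetical names, and the capability sets reflect only what the text states (decode and encode on both cards, scaling and overlay on Quadra only).

```python
# Operations each card is assumed to run onboard, per the description above.
CARD_CAPABILITIES = {
    "T408": {"decode", "encode"},
    "Quadra": {"decode", "scale", "overlay", "encode"},
}

# The same application pipeline is used for both cards.
PIPELINE = ["decode", "scale", "overlay", "encode"]


def place_steps(card, steps):
    """Assign each step to the card when supported, otherwise to the host CPU."""
    caps = CARD_CAPABILITIES[card]
    return [(step, "card" if step in caps else "host CPU") for step in steps]


def transfers(placement):
    """Count how often media crosses between the card and the host CPU."""
    return sum(
        1 for (_, a), (_, b) in zip(placement, placement[1:]) if a != b
    )


for card in CARD_CAPABILITIES:
    placement = place_steps(card, PIPELINE)
    print(card, placement, "transfers:", transfers(placement))
```

In this model the T408 pipeline crosses the card boundary twice (once out to the host CPU and once back), while the Quadra pipeline never leaves the card, which is the efficiency difference the two figures illustrate.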

Adrian also noted that id3as offers trial licenses so users can experience Norsk’s capabilities firsthand. The trial license allows users to explore Norsk’s features and benefits, including how it leverages emerging hardware technologies to deliver high-density, high-availability, and energy-efficient media experiences. He noted that the trial software is fully capable, though no single session can exceed 20 minutes in duration.

Adrian then took a question from the audience, addressing Norsk’s support for SCTE-35. Adrian highlighted that Norsk is capable of SCTE-35 insertion to signal events such as ad insertion and program switching. Additionally, he noted that Norsk allows the insertion of tags into HLS and DASH manifest files, which can trigger specific events in downstream systems. This functionality enables seamless integration and synchronization with various parts of the media distribution workflow.

Adrian also mentioned that Norsk offers integration with digital rights management (DRM) providers. This means that after content is processed and formatted, it can be securely packaged to ensure that only authorized viewers have access to it. Norsk’s background in the broadcast industry has enabled it to incorporate these capabilities that are essential for delivering content to the right audiences while maintaining content protection and security.

For more information about Norsk, contact the company via their website or request a meeting. And if you’ll be at IBC, you can set up a meeting with them HERE.

ON-DEMAND: Adrian Roe, CEO at id3as | Make Live Easy with NORSK SDK

From Cloud to Control. Building Your Own Live Streaming Platform

Cloud services are an effective way to begin live streaming. Still, once you reach a particular scale, it’s common to realize that you’re paying too much and can save significant OPEX by deploying transcoding infrastructure yourself. The question is, how to get started?

NETINT’s Build Your Own Live Streaming Platform symposium gathers insights from the brightest engineers and game-changers in the live-video processing industry on how to build and deploy a live-streaming platform.

In just three hours, we’ll cover the following:

  • Hardware options for live transcoding and encoding to cut costs by as much as 80%.
  • Software options for producing, delivering, and playing your live video streams.
  • Co-location selection criteria to achieve cloud-like performance with on-premise affordability.

You’ll also hear from two engineers who will demystify the process of assembling a live-streaming facility, how they identified and solved key hurdles, along with real costs and performance data.

Cloud? Or your own hardware?

It’s clear to many that producing live streams via a public cloud like AWS can be vastly more expensive than owning your hardware. (You can learn more by reading “Cloud or On-Premises? The Streaming Dilemma” and “How to Slash CAPEX, OPEX, and Carbon Emissions Using the NETINT T408 Video Transcoder”). 

To quote serial entrepreneur David Heinemeier Hansson, who recently migrated two SaaS services from the cloud to on-premises hardware, “Don’t let the entrenched cloud interests dazzle you into believing that running your own setup is too complicated. Everyone and their dog did it to get the internet off the ground, and it’s only gotten easier since.”

For those who have only operated in the cloud, there’s fear of the unknown: fear of buying hardware transcoders, selecting the right software, and choosing the best colocation service. So, we decided to fight fear with education and host a symposium to educate streaming engineers on all these topics.

“Building Your Own Live Streaming Cloud” will uncover how owning your encoding stack can slash operating costs and boost performance with minimal CAPEX.

Learn to select the optimal transcoding hardware, transcoding and packaging software, and colocation facilities. We’ll also discuss strategies to reduce carbon emissions from your transcoding engine. 

This FREE virtual event takes place on August 17th, from 11:00 AM – 2:15 PM EST.

Five issues tackled by nine experts:

Transcoding Hardware Options:

Learn the pros and cons of CPU, GPU, and ASIC-based transcoding via detailed throughput and cost examples shared by Kenneth Robinson, Manager of Field Application Engineers at NETINT Technologies. Then Ilya Mikhaelis, Streaming Backend Tech Lead at Mayflower, will describe his company’s journey from CPU to GPU to ASICs, covering costs, power consumption, latency, and density metrics.

Software Options:

Jan Ozer from NETINT will identify the three categories of transcoding software: multimedia frameworks, media servers, and other tools. Then, you’ll hear from experts in each category, starting with Romain Bouqueau, founder of Motion Spell, who will discuss the capabilities of the GPAC multimedia framework. Barry Owen, Chief Solutions Architect at Wowza, will discuss Wowza Streaming Engine’s suitability for private clouds. Lastly, Adrian Roe, CEO at id3as, developer of Norsk, will demonstrate Norsk’s simple, scripting-based operation and extensive production and transcoding features.

Housing Options:

Once you select your hardware and software, the next step is finding the right co-location facility to house your live streaming infrastructure. Kyle Faber, with experience in building Edgio’s video streaming infrastructure, will guide you through the essential factors to consider when choosing a co-location facility.

Minimizing the Environmental Impact:

As responsible streaming professionals, it’s essential to address the environmental impact of our operations. Barbara Lange, Secretariat of Greening of Streaming, will outline actionable steps video engineers can take to minimize power consumption when acquiring and deploying transcoding servers.

Pulling it All Together:

Stef van der Ziel, founder of live-streaming pioneer Jet-Stream, will share lessons learned from his experience in creating both Jet-Stream’s private cloud and cloud transcoding solutions for customers. In his closing talk, Stef will demystify the process of choosing hardware, software, and a hosting facility, bringing all the previous discussions together into a cohesive plan.

Full Agenda:

11:00 am – 11:10 am EST

Introduction (10 minutes):
Mark Donnigan, Head of Strategic Marketing at NETINT Technologies
Welcome, overview, and what you will learn.

 

11:10 am – 11:40 am EST

Choosing transcoding hardware (30 minutes):
Kenneth Robinson, Manager of Field Application Engineers at NETINT Technologies
You have three basic approaches to transcoding: CPU-only, GPU, and ASICs. Kenneth outlines the pros and cons of each approach with extensive throughput, CAPEX, and OPEX examples.

 

11:40 am – 12:00 pm EST

From CPU to GPU to ASIC: Our Transcoding Journey (20 minutes):
Ilya Mikhaelis, Streaming Backend Tech Lead at Mayflower
Charged with supporting very high-volume live transcoding operations, Ilya started with libx264 software transcoding, which consumed massive power but yielded low stream density per server. Then he experimented with GPUs and other hardware and ultimately transitioned to an ASIC-based solution with much lower power consumption and much higher stream density per server. Ilya will detail the costs, power consumption, and density of all options, providing both data and an invaluable evaluation framework.

 

12:00 pm – 12:10 pm EST

Choosing your live production software (10 minutes): 
Jan Ozer, Senior Director of Video Technology at NETINT Technologies
The core of every live streaming system is transcoding and packaging software. This comes in many shapes and sizes, from open-source software like FFmpeg and GPAC, to streaming servers like Wowza, and production systems like Norsk. Jan discusses these multiple options so you can cohesively and affordably build your own live-streaming ecosystem.

 

12:10 pm – 1:10 pm EST

Speed Round (60 minutes):
20-minute presentations from GPAC, Wowza, and NORSK, with speakers discussing the features, functions, operational paradigms, and cost structure of their live software offerings.

Speakers include:

  • Adrian Roe, CEO at id3as, Product: Norsk, Title of Talk: Make Live Easy with NORSK SDK
  • Romain Bouqueau, Founder and CEO, Motion Spell (home for GPAC Licensing), Product: GPAC, Title of Talk: Deploying GPAC for Transcoding and Packaging
  • Barry Owen, Chief Solutions Architect at Wowza, Title of Talk: Start Streaming in Minutes with Wowza Streaming Engine



1:10 pm – 1:40 pm EST

Choosing a co-location facility (30 minutes): 
Kyle Faber, Senior Director of Product Management at Edgio.
Once you’ve chosen your hardware and software, you need a place to install them. If you don’t have your own connected data center, you may consider a colocation facility. In his talk, Kyle addresses the key factors to consider when choosing a co-location facility for your live streaming infrastructure.

 

1:40 pm – 1:55 pm EST

How to Greenify Your Encoding Stack (15 minutes):
Barbara Lange, Secretariat of Greening of Streaming.
Learn how video streaming companies can work to significantly reduce their energy footprint and contribute to a greener streaming industry. Implement hardware and infrastructure optimization using immersion cooling and data center design improvements to maximize energy efficiency in your streaming infrastructure.

 

1:55 pm – 2:15 pm EST

Closing Keynote (20 minutes):
Stef van der Ziel, Founder of Jet-Stream
Jet-Stream has delivered streaming solutions since its launch in 1994 and offers its own live streaming platform. One focus has been creating custom transcoding solutions for customers seeking to create their own private cloud for various applications. In his closing talk, Stef will demystify the process of choosing hardware, software, and a hosting facility and wrap a pretty bow around all previous presentations.

Co-location for Optimized, Sustainable Live Streaming Success

Choosing a co-location facility

If you decide to buy and run your own transcoding servers rather than use a public cloud, you must choose where to host them. If you have a well-connected data center, that’s an option. But if you don’t, you’ll want to consider a co-location facility or co-lo.

A co-location facility is a data center that rents space to third parties for servers and other computing hardware. This rented space typically includes the physical area for the hardware (often measured in rack units or cabinets) and the necessary power, cooling, and security.

While prices vary greatly, you can expect to pay between $50 and $200 per month per rack unit (RU) in the US, $60 to $250 in Europe, $80 to $300 in South America, and $70 to $280 in Asia.
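To turn these ranges into a budget estimate, a short Python sketch can multiply them out. The monthly rates are the ones quoted above; the `colo_cost_range` function, the 2U server size, and the five-year (60-month) horizon are assumptions chosen for illustration, and the result covers rack rental only.

```python
# Monthly price range per rack unit (USD), as quoted above.
RATES = {
    "US": (50, 200),
    "Europe": (60, 250),
    "South America": (80, 300),
    "Asia": (70, 280),
}


def colo_cost_range(region, rack_units, months):
    """Return the (low, high) rental cost for the given footprint and duration."""
    low, high = RATES[region]
    return low * rack_units * months, high * rack_units * months


# Assumed example: one 2U server hosted for five years.
for region in RATES:
    low, high = colo_cost_range(region, rack_units=2, months=60)
    print(f"{region}: ${low:,} to ${high:,}")
```

Under these assumptions, a 2U server in a US facility would cost between $6,000 and $24,000 to house over five years, before any bandwidth or managed-service charges.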

Co-location facilities will provide a high-bandwidth internet connection, redundant power supplies, and sophisticated cooling systems to ensure optimal performance and uptime for hosted equipment. They also include robust physical security measures, including surveillance cameras, biometric access controls, and round-the-clock security personnel.

At a high level, businesses use co-location facilities to leverage economies of scale they couldn’t achieve on their own. By sharing the infrastructure costs with other tenants, companies can access high-level data center capabilities without a significant upfront investment in building and maintaining their facility.

Choosing a Co-lo for Live Streaming

Choosing a co-lo facility for any use involves many factors. However, live streaming demands require a focus on a few specific capabilities. We discuss these below to help you make an informed decision and maximize the efficiency and cost-effectiveness of your live-streaming operations.

Network Infrastructure and Connectivity

Live streaming requires high-performance and reliable network connections. If you’re using a particular content delivery network, ensure the link to the CDN is high performing. Beyond this, consider a co-lo with multiple (and redundant) high-speed connections to multiple top-tier telecom and cloud providers, which can ensure your live stream remains stable, even if one of the connections has issues.

Multiple content distribution providers can also reduce costs by enabling competitive pricing. If you need to connect to a particular cloud provider, perhaps for content management, analytics, or other services, make sure these connections are also available.

Geographic Location and Service

Choosing the best location or locations is a delicate balance. From a pure quality of experience perspective, facilities closer to your target audience can reduce latency and ensure a smoother streaming experience. However, during your launch, cost considerations may dictate a single centralized location that you can supplement over time with edge servers near heavy concentrations of viewers.

During the start-up phase and any expansion, you may need access to the co-lo facility to update or otherwise service existing servers and install new ones. That’s simpler to perform when the facility is closer to your IT personnel.

If circumstances dictate choosing a facility far from your IT staff, consider choosing a provider with the necessary managed services. While the services offered will vary considerably among the different providers, most locations provide hardware deployment and management services, which should cover you for expansion and maintenance.

Similarly, live streaming operations usually run round-the-clock, so you need a facility that offers 24/7 technical support. A highly responsive, skilled, and knowledgeable support team can be crucial in resolving any unexpected issues quickly and efficiently.

Scalability

Your current needs may be modest, but your infrastructure needs to scale as your audience grows. The chosen co-lo facility (or facilities) should have ample space and resources to accommodate future growth and expansion. Check whether they have flexible plans allowing upgrades and scalability as needed.

Redundancy and Disaster Recovery

In live streaming, downtime is unacceptable. In volatile coastal or mountain regions, check for guarantees that the data center can withstand specific types of disasters, like floods and hurricanes.

When disaster strikes, the co-location facility should have redundant power supplies, backup generators, and efficient cooling systems to prevent potential hardware failures. Check for procedures to protect equipment, backup data, and other steps to minimize the risk and duration of loss of service. For example, some facilities offer disaster recovery services to help customers restore disrupted environments. Walk through the various scenarios that could impact your service and ensure that the providers you consider have plans to minimize disruption and get you up and running as quickly as possible.

Security and Compliance

Physical and digital security should be a primary concern, particularly if you’re streaming third-party premium content that must remain protected. Ensure the facility uses modern security measures like CCTV, biometric access, fire suppression systems, and 24/7 on-site staff. Digital security should include robust firewalls, DDoS mitigation services, and other necessary precautions.

Environmental Sustainability

An essential requirement for most companies today is environmental sustainability. ASIC-based transcoding is the most power-efficient of all transcoding alternatives. We believe that all companies should work to reduce their carbon footprints. Accordingly, choosing a co-location facility committed to energy efficiency and renewable energy sources will lower your energy costs and align with your company’s environmental goals.

Remember, the co-location facility is an extension of your live-streaming business. With the proper infrastructure, you can ensure high-quality, reliable live streams that satisfy your audience and grow your business. Take the time to visit potential facilities, ask questions, and thoroughly evaluate before deciding.

Denser / Leaner / Greener - Symposium on Building Your Live Streaming Cloud

Build Your Own Streaming Infrastructure – Software

Build Your Own Streaming Infrastructure - Article by Jan Ozer from NETINT Technologies

My assumption is that you’re currently using a cloud-based service like AWS for your live streaming and are seeking to reduce costs by buying your own transcoding hardware, installing the necessary software, and hosting the server on-premises or in a co-location facility. This article covers the software side.

To begin, let’s acknowledge that AWS and other cloud services have created a well-featured and highly integrated ecosystem for live streaming and distribution. The downside is the cost.

To illustrate the potential savings, I’ll refer to this article, which compared the cost of producing 21 H.264 ladders and 27 HEVC ladders via AWS MediaLive and by encoding with NETINT’s recently launched Logan Video Server. As you can see in the table, MediaLive costs around $400K for H.264 and $1.8 million for HEVC, as compared to $11,140 in both cases for the co-located server.

Streaming Infrastructure - Table from article 'cloud or on-prem'
Table 1. Five-year cost comparison: AWS MediaLive pricing compared to the NETINT Server.
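Taking the figures quoted above at face value, the size of the gap is easy to work out. The Python sketch below uses only the numbers from the text (the MediaLive values are approximate); the `compare` function and variable names are illustrative.

```python
# Five-year costs in USD as quoted in the text (MediaLive values are approximate).
MEDIALIVE_COST = {"H.264": 400_000, "HEVC": 1_800_000}
OWNED_SERVER_COST = 11_140


def compare(codec):
    """Return (absolute savings, cost ratio) of owning versus AWS MediaLive."""
    cloud = MEDIALIVE_COST[codec]
    savings = cloud - OWNED_SERVER_COST
    ratio = cloud / OWNED_SERVER_COST
    return savings, ratio


for codec in MEDIALIVE_COST:
    savings, ratio = compare(codec)
    print(f"{codec}: saves ${savings:,} over five years ({ratio:.1f}x the server cost)")
```

On these numbers, the cloud option costs roughly 36 times as much as the co-located server for H.264 and over 160 times as much for HEVC.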

While there are less expensive options available inside and outside of AWS, whenever you pay for hardware by the minute or hour of production, you’re vastly overpaying as compared to owning your own hardware. Sure, you say, but it’s so easy compared to running your own hardware.

If that’s a concern, here are some comforting words from David Heinemeier Hansson, co-owner and CTO of 37signals, the developer of the project management platform Basecamp and the email service Hey. Recently, Hansson wrote “Why we’re leaving the cloud,” a blog post that detailed his company’s decision to do just that. Here’s the relevant quote.

Up until very recently, everyone ran their own servers, and much of the progress in tooling that enabled the cloud is available for your own machines as well. Don’t let the entrenched cloud interests dazzle you into believing that running your own setup is too complicated. Everyone and their dog did it to get the internet off the ground, and it’s only gotten easier since.

My wife has chihuahuas, and given their difficulties with potty training, I seriously doubt they could do it, but you get the point. To paraphrase FDR, all you have to fear is fear itself. The bottom line is that running your own live streaming service should cost relatively little CAPEX, will save significant OPEX, and won’t be nearly as challenging as you might be fearing.

Let’s look at your options for the software required to run your homegrown system.

Transcoding and Packaging Software

Figure 1 shows the minimum software and infrastructure needed for a live-streaming service. Presumably, you’ve already got the live production covered, and since AWS doesn’t offer a player, you have that piece addressed as well. You’ll need a content delivery network to deliver your streaming video, but you can continue to use CloudFront or another CDN. The software that you absolutely have to replace is the live transcoding and packaging component.

Here you have three options: multimedia frameworks, media servers, and “other.” Let’s discuss each in turn.

Multimedia Frameworks

Multimedia frameworks are software libraries, tools, and APIs that provide a set of functionalities and capabilities for multimedia processing, manipulation, and streaming. The best-known framework is FFmpeg, followed by GStreamer and GPAC, and all three are available as open source.

Figure 1. Netflix uses GPAC for its packaging, a significant technology endorsement for GPAC and for multimedia frameworks in general.

Multimedia frameworks excel in projects at both ends of the complexity spectrum. For simple projects, like transcoding an input stream to an encoding ladder, you can create a script that inputs the stream, transcodes, and hands the packaged output streams off to a CDN in a matter of minutes. You can use the script to process thousands of simultaneous jobs, all at no charge.
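
As a rough illustration of how short such a script can be, here is a minimal Python sketch that builds an FFmpeg command line for a three-rung H.264 ladder packaged as HLS. The ladder values, the input URL, and the output directory are assumptions chosen for illustration, not recommendations. The snippet only assembles and prints the command; running it requires an FFmpeg build with libx264, and delivery to a CDN is out of scope here.

```python
import shlex

# Illustrative ladder: (resolution, video bitrate). These values are assumptions.
LADDER = [
    ("1920x1080", "5000k"),
    ("1280x720", "3000k"),
    ("640x360", "800k"),
]


def build_ffmpeg_command(input_url: str, output_dir: str) -> list[str]:
    """Build an FFmpeg argument list that transcodes one input to an HLS ladder."""
    cmd = ["ffmpeg", "-i", input_url]

    # Map the source video and audio once per rung.
    for _ in LADDER:
        cmd += ["-map", "0:v:0", "-map", "0:a:0"]

    # Per-rung video settings, addressed by output video stream index.
    for index, (size, bitrate) in enumerate(LADDER):
        cmd += [f"-c:v:{index}", "libx264",
                f"-s:v:{index}", size,
                f"-b:v:{index}", bitrate]

    # One audio setting applied to every audio output stream.
    cmd += ["-c:a", "aac", "-b:a", "128k"]

    # HLS packaging: pair each video rung with its audio stream and
    # write one variant playlist per rung plus a master playlist.
    stream_map = " ".join(f"v:{i},a:{i}" for i in range(len(LADDER)))
    cmd += [
        "-f", "hls",
        "-hls_time", "6",
        "-var_stream_map", stream_map,
        "-master_pl_name", "master.m3u8",
        f"{output_dir}/rung_%v/playlist.m3u8",
    ]
    return cmd


# Hypothetical input and output locations.
command = build_ffmpeg_command("rtmp://localhost/live/stream", "/var/www/hls")
print(shlex.join(command))
```

Keeping the ladder in a plain list means adding or removing a rung is a one-line change, which is the kind of flexibility that makes frameworks attractive for simple jobs.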

At the other end of the spectrum, these frameworks also excel at complex jobs with idiosyncratic custom requirements that likely aren’t available in a server or commercial software product. The development, maintenance, and modification costs are considerable, but you get maximum feature flexibility if you’re willing to pay that cost.

What you don’t get with these tools is a user interface or simple configuration options – you start with a blank slate and must program in all desired features. What could be as simple as checking a checkbox in a streaming media server could require dozens or even thousands of lines of code in a multimedia framework.

Which takes us to streaming media servers.

Streaming Media Servers

The next category is streaming media servers, which includes Wowza Streaming Engine, Nimble Streamer, and two open-source servers, Red5 and Ant Media Server. These servers tend to excel for most productions in the middle of the complexity spectrum and offer multiple advantages over multimedia frameworks.

There are several reasons why you might choose a streaming server over a multimedia framework, starting with simplified setup and configuration. Most streaming servers provide out-of-the-box streaming solutions with pre-configured settings and management interfaces that simplify the process. Not all offer GUIs, but those without one let you select options in simple configuration files.

Figure 2. Wowza Streaming Engine is a highly regarded streaming server.

As mentioned above, streaming servers often offer simpler access to advanced features that you’d have to craft by hand with a multimedia framework. They also offer better integration with third-party services like digital rights management (DRM) and content delivery networks. Between the simplified setup, easier access to features, and improved integration with other services, packaged servers can dramatically accelerate getting your live streaming service up and running.

Once you’re operational, you’ll appreciate management interfaces that monitor the health and performance of your streaming infrastructure, track viewer analytics, manage streaming workflows, and make real-time adjustments. If you’re in a dynamic demand environment, some streaming servers offer built-in scalability features and load balancing to manage the load over multiple hardware transcoding resources. You’d have to build all that by hand or with plug-ins if using a multimedia framework.
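
To give a sense of what building load balancing by hand involves, here is a minimal Python sketch of one common approach, least-loaded dispatch. The `Transcoder` class, its capacity numbers, and the resource names are hypothetical; a real system would also need health checks, failover, and a way to release capacity when a stream ends.

```python
from dataclasses import dataclass


@dataclass
class Transcoder:
    """A hypothetical transcoding resource with a fixed stream capacity."""
    name: str
    capacity: int
    active_jobs: int = 0

    @property
    def load(self) -> float:
        # Fraction of capacity currently in use.
        return self.active_jobs / self.capacity


def assign_job(pool: list[Transcoder]) -> Transcoder:
    """Assign one job to the least-loaded transcoder that has spare capacity."""
    candidates = [t for t in pool if t.active_jobs < t.capacity]
    if not candidates:
        raise RuntimeError("all transcoders are at capacity")
    chosen = min(candidates, key=lambda t: t.load)
    chosen.active_jobs += 1
    return chosen


# Two hypothetical resources with different capacities.
pool = [Transcoder("vpu-a", capacity=4), Transcoder("vpu-b", capacity=2)]
assignments = [assign_job(pool).name for _ in range(6)]
print(assignments)
```

Dispatching on fractional load rather than raw job count lets resources with different capacities fill up at the same relative rate. Even this small sketch hints at how much surrounding logic a packaged server handles for you.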

The two potential downsides of streaming servers are cost and customizability. You’ll have to pay a monthly fee for some versions of these servers, and you may find it complicated or nearly impossible to add what you might consider to be essential features.

Other Streaming-Capable Programs

Most companies building their own live-streaming infrastructures will implement either a multimedia framework or a streaming server, but there are other programs that incorporate the core encoding and packaging functions. One such program is Norsk from id3as. Norsk bills itself as “an SDK that enables developers to easily create amazing, dynamic live video workflows and deploy them at any scale.” As such, it combines both video production and streaming server-related functions.

You see this in Figure 3. The top portion shows that Norsk supports the typical codecs and packaging formats deployed by live-streaming producers. At the bottom of the figure, you see that Norsk also offers production-oriented features like multiple camera support, graphics and overlays, and transitions.

Figure 3. Norsk offers both production and server-related functions.

Interestingly, Norsk doesn’t have a GUI, instead offering a high-level API to simplify configuration and operation, with a Workflow Visualizer component to view the running state of the application. In this fashion, Norsk attempts to provide the configurability of multimedia frameworks with the ease of operation of scripting-driven streaming media servers.

Finding a program like Norsk that combines transcoding and packaging with other essential streaming-related functions makes a lot of sense; there’s one less vendor to onboard and one less product to learn and support. As remote production becomes more common, we expect more programs like Norsk to become available.

Those are your high-level options. If you’re interested in learning more about these and other programs that can drive encoding and packaging for your live transcoder, you should plan to attend our upcoming symposium; details will be available in the next couple of weeks.