Adrian Roe from id3as discussed Norsk, a technology designed to simplify the building of large-scale media workflows. id3as, originally a consultancy-led organization, works with major clients like DAZN and Nasdaq and is now pivoting to concentrate on Norsk, which it sells as an SDK. The technology underlying Norsk is responsible for delivering hundreds of thousands of live events annually and offers extensive expertise in low-latency and early-adoption technologies.
Adrian emphasized the company’s commitment to reliability, especially during infrastructure failures, and its initiatives in promoting energy-efficient streaming, including founding the Greening of Streaming organization. He also highlighted that about half of their deployments are cloud-based, suitable for fluctuating workloads, while the other half are on-premises or hybrid models, often driven by the need for high density at low cost and low energy consumption.
Simplify Building Your Own Streaming Cloud with Norsk SDK
Encoding Infrastructure is Simpler and Cheaper than Ever Before
The focus of the symposium was creating your own encoding infrastructure, and Adrian next focused on how new technologies were simplifying this and making it more affordable. For example, Adrian mentioned that advancements like NETINT’s Quadra video processing units (VPU) are changing the game, allowing some clients to consider shifting back to on-premises solutions.
Then, he described a recent server purchase to highlight the advancements in computing hardware capabilities. The server, which is readily available off-the-shelf and not particularly expensive, boasts impressive specs with 256 physical cores, 512 logical cores, and room for 24 U.2 cards like NETINT’s T408 or Quadra T1U.
Adrian then shared that during load testing, the server’s CPU profile was so extensive that it exceeded the display capacity of his screen, and he joked that it gave him an excuse to file an expense report for a new monitor. This anecdote emphasized the enormous processing capacity now available in relatively affordable hardware setups. The server occupies just 2U of rack space, and Adrian speculated that it could potentially deliver hundreds of channels in a fully loaded configuration, showcasing the leaps in efficiency and power in modern server hardware.
I think he used the second person — “it gives you an excuse to file an expense report for a new monitor” — but close enough.
Figure 2. Infrastructure is getting cheaper and more capable.
Adrian then shifted his focus to Norsk. He emphasized that Norsk is designed to cater to large broadcasters and systems integrators who require more than just off-the-shelf solutions. These clients often need specialized functionalities, like the ability to make automated decisions such as switching to a backup source if the primary one fails, without the need for human intervention.
They may also require complex multi-camera setups and dynamic overlays of scores or graphics during live events. Norsk is engineered to simplify these historically challenging tasks, enabling clients to easily put together sophisticated streaming solutions.
Figure 3. Why Norsk in a nutshell.
He also pointed out that while some existing solutions may offer these features out of the box, creating such capabilities from scratch usually requires a significant engineering effort and demands professionals with advanced skills and a deep understanding of media technology, including intricate details of different video and container formats and how to handle them.
According to Adrian, Norsk eliminates these complexities, making it easier for companies to implement advanced streaming functionalities without the need for specialized knowledge. In short, Norsk fills the gap in the market for large broadcasters and systems integrators who require customized, automated decision-making capabilities in their streaming solutions.
Norsk In Action
Adrian then began demonstrating Norsk’s operation. He started by showing Figure 4 as an example of an output that Norsk might produce. This involved multiple inputs and overlays of scores or graphics that might need to update dynamically.
Figure 4. A typical production with multiple inputs and overlays
that needed to change dynamically.
Figure 5 shows the code Norsk uses to produce this output in its entirety via its “low code” approach. Parsing through the code, in the top section, you see the inputs, outputs, and transformation nodes. In this example, Norsk ingests RTMP and SRT (and also a logo from the file) and publishes the output over WHEP, a WebRTC-HTTP Egress protocol. However, with Norsk it is easy to accommodate any of the common formats; for example, to change the output to (Low Latency) HLS, you would simply replace the “whep” output with HLS, and you’d be done.
Figure 5. Norsk’s low-code approach to the production shown in Figure 4.
The next section of code directs how the media flows between the nodes. Compose takes the video from the various inputs, while the audio mixer combines the audio from inputs 1 and 2. Finally, the WHEP output subscribes to the outputs of the audio mixer and compose nodes. That’s all the code needed to create a complex picture in picture.
Adrian then went over the building blocks of how Norsk solutions can be constructed. This started with an example of a pass-through setup where an RTMP input is published as a WebRTC output (Figure 6). With Norsk, all that’s needed is to specify that the output should get its audio and video from a particular input node, in this case, RTMP.
He then shared that Norsk is designed to be format-agnostic so that if the input node changes to another format like SRT or SDI, everything else in the setup will continue to function seamlessly. This ease of use allows for the quick development of sophisticated streaming solutions without requiring deep technical expertise.
Figure 6. A simple example of an RTMP input published as WebRTC.
Adrian then described how Norsk handles potential incompatibilities that might arise in a workflow. In the above example he noted that WebRTC supports only the Opus audio codec, which is not supported by RTMP.
In these cases, Norsk automatically identifies the incoming audio codec (in this case, AAC) and transcodes it to Opus for compatibility with WebRTC. It also changed the encapsulation of the H.264 video for WebRTC compatibility. These automated adjustments showcase Norsk’s ability to simplify complex streaming workflows by making intelligent decisions to ensure compatibility and functionality.
Norsk will automatically adjust your workflow to make it work
Figure 7. Norsk will automatically adjust your workflow to make it work;
in this case, converting AAC to Opus and encapsulating the H.264 encoded video
for WebRTC output.
Next in the quick tour of “building blocks,” Adrian showed how easy it is to build a source switcher, allowing the user to switch dynamically between two camera inputs (Figure 8). He explained how id3as’ low-code approach made it easy and natural to extend this (for example to handle an unknown number of sources that might come and go during a live event).
Figure 8. A simple production with two cameras, a source switcher, and WebRTC output.
According to Adrian, this simplicity allows engineers building solutions with Norsk to focus on the user experience they want to deliver to their customers as well as how to automate and simplify operations. They can focus on the intended result, not on the highly complex media technology required to deliver that result. This puts their logic into a very transparent context and simplifies building an application that delivers what’s intended.
To better manage and control operations, Norsk supports visualizations via an OpenTelemetry API, which enables real-time data retrieval and input into a decisioning system for monitoring. As well as simple integration with these monitoring tools, Norsk also includes the visualizer shown in Figure 9 that renders this data as an easy-to-understand flow of media between nodes. You’ll see another two examples of this below.
Figure 9. Norsk’s workflow visualization makes it simple
to understand the media flow within an application.
Adrian then returned to the first picture-in-picture application shown to illustrate how the effect was created. You see that it’s very easy to position, size, and control each of the three elements, so the engineer can focus on the desired output, not anything to do with what the media itself looks like.
Figure 10. Integrating three production inputs into a picture-in-picture presentation in Norsk.
Adrian highlighted the convenience and flexibility of Norsk’s low-code approach by describing how the system handles dynamic updates and configurations using code. He emphasized that the entire process of making configuration changes, like repositioning embedded areas or switching sources, involves just a few lines of code. This approach allows users to easily build complex functionalities like a video mixer with minimal engineering efforts.
Additionally, Adrian described how overlays are seamlessly integrated into the workflow. He explained that a browser overlay is treated as just another source which can be transformed and composed alongside other sources. By combining and outputting these elements, a sophisticated output with overlays can be achieved with minimal code.
Adrian emphasized that the features he demonstrated are sufficient to build a comprehensive live production system using Norsk like that shown in Figure 11. With Norsk’s low-code approach, he asserted, there are no additional complex calls required to achieve the level of sophistication demonstrated. With Norsk, he reiterated engineers building media applications can focus on creating the desired user experience rather than dealing with intricate technical details.
Figure 11. Norsk enables productions like this with just a few lines of code.
Taking a big-picture view of how productions are created and refined, Adrian shared how the entire process of describing media requirements and building proof of concepts is streamlined with Norsk’s approach. With just a few lines of code, proof of concepts can be developed in a matter of hours or days. This leads to shorter feedback cycles with potential users, enabling quicker validation of whether the solution meets their needs. In this manner, Adrian noted that Norsk enables rapid feature development and allows for quicker feature launches to the market.
Integrating Encoding Hardware
Adrian then shifted his focus to integrations with encoding hardware, noting that many customers have production hardware that utilizes transcoders and VPUs like those supplied by NETINT to achieve high-scale performance. However, the development teams might not have the same production setup for testing and development purposes. Norsk addresses this challenge by providing an easy way for developers to work productively on their applications without requiring the exact production hardware.
You see this in Figure 12, an example of where developers can configure different settings for different environments. For instance, in a production or QA environment, the output settings could be configured for 1080p at 60 frames with specific Quadra configurations.
In contrast, in a development environment, the output settings might be configured for the x264 codec outputting 720p with different parameters, like using the ultrafast preset and zero latency. This approach allows engineers to have a productive development experience while not requiring the same processing power or hardware as the production setup.
Figure 12. Norsk can use one set of transcoding parameters
for development (on the right), and another for production.
Adrian then described how Norsk maximizes the acceleration hardware capabilities of third-party transcoders to optimize performance, sharing that with the NETINT cards, Norsk outperformed FFmpeg. For example, when using hardware transcoders, it’s generally more efficient to keep the processing on the hardware as much as possible to avoid unnecessary data transfers.
Adrian provided a comparison between scenarios where hardware acceleration is used and scenarios where it’s not. In one example, he showed how a NETINT T408 was used for hardware decoding, but some manipulations like picture-in-picture and resizing weren’t natively supported by the hardware. In this case, Norsk pulled the content to the CPU, performed the necessary manipulations, and then sent it back to the hardware for encoding (Figure 13).
Figure 13. Working with the T408, Norsk had to scale and overlay via the host CPU.
In contrast, with a Quadra card that does support onboard scaling and overlay, Norsk performed these functions on the hardware, remarkably using the same exact application code as for the T408 version (Figure 14). This way, Adrian emphasized, Norsk maximized the efficiency of the hardware transcoder and optimized overall system performance.
Figure 14. Norsk was able to scale an overlay on the Quadra using the same code as on the T408.
Adrian also highlighted the practicality of using Norsk by offering trial licenses for users to experience its capabilities. The trial license allows users to explore Norsk’s features and benefits, showcasing how it leverages emerging hardware technologies in the market to deliver high-density, high-availability, and energy-efficient media experiences. He noted that the trial software was fully capable, though no single session can exceed 20 minutes in duration.
Adrian then took a question from the audience, addressing Norsk’s support for SCTE-35. Adrian highlighted that Norsk is capable of SCTE-35 insertion to signal events such as ad insertion and program switching. Additionally, he noted that Norsk allows the insertion of tags into HLS and DASH manifest files, which can trigger specific events in downstream systems. This functionality enables seamless integration and synchronization with various parts of the media distribution workflow.
Adrian also mentioned that Norsk offers integration with digital rights management (DRM) providers. This means that after content is processed and formatted, it can be securely packaged to ensure that only authorized viewers have access to it. Norsk’s background in the broadcast industry has enabled it to incorporate these capabilities that are essential for delivering content to the right audiences while maintaining content protection and security.