Streaming technologies have revolutionized the digital media landscape, transforming how content is distributed and delivered to audiences worldwide. One pioneering figure in this field is Alex Zambelli, whose career at Microsoft has been closely intertwined with the rise of streaming as the dominant digital media distribution method. Zambelli’s work with NBC Sports, particularly during the 2008 Beijing Olympics and subsequent events, was pivotal in advancing online streaming capabilities and earning industry recognition. This article, based on Jan Ozer‘s conversation with Alex during Voices of Video, explores Zambelli’s contributions to streaming technologies, the implementation of multi-view camera angles in Sunday Night Football, and key considerations in livestreaming from insights gained during Olympic events.
Evolution of Streaming Technologies
Alex Zambelli’s career at Microsoft has coincided with the transition from physical media to streaming as the dominant method of distributing digital media. Around 2007, streaming started gaining momentum, gradually overtaking CDs, DVDs, and Blu-rays. Zambelli’s focus on streaming technologies led him to work on Microsoft’s Silverlight, a competitor to Flash, which facilitated the creation of rich web experiences and premium media delivery, including digital rights management. This technology was a significant milestone in the evolution of streaming.
Zambelli’s collaboration with NBC Sports began with the 2008 Beijing Olympics, where they aimed to pioneer online streaming of all Olympics content. Initially, they utilized Windows Media and Silverlight, incorporating adaptive streaming capabilities. The subsequent transition to Microsoft’s Smooth Streaming technology for the 2010 Vancouver Olympics marked a significant advancement. This technology offered on-demand and live streams in high definition, providing viewers with an immersive and seamless experience. These groundbreaking endeavors earned Zambelli and the team recognition from the industry, including nominations for sports Emmys.
Multi-View Camera Angles in Sunday Night Football
The implementation of Smooth Streaming technology played a crucial role in enabling the seamless transition between camera angles in Sunday Night Football broadcasts. By utilizing a single manifest that contained all four camera angles, switching between views became as smooth as switching between bitrates in modern streaming protocols like DASH or HLS. This technology, developed by the broadcast team, allowed viewers to simultaneously watch multiple camera angles, enhancing the overall viewing experience.
Key Considerations in Livestreaming: Insights from Olympic Events
Livestreaming presents unique challenges compared to on-demand streaming due to its real-time nature. Issues such as packet loss, segment loss, blackouts, and ad insertions demand immediate attention and resolution. Unlike on-demand streaming, where there is some leeway to address content or delivery chain issues over time, livestreaming requires constant vigilance. Even a brief interruption or technical problem can significantly impact the viewer experience.
Successful livestreaming events often involve collaborative efforts from multiple companies, including Microsoft, NBC, Akamai, and iStreamPlanet. These events require dedicated teams ready to address and resolve any issues that arise in real time. The nature of livestreaming necessitates a higher level of focus and attention compared to on-demand streaming. It is crucial to prioritize and allocate sufficient resources to ensure the seamless execution of live events. The potential for unexpected issues or failures makes constant monitoring and immediate troubleshooting essential, as even a minor disruption can have significant consequences.
Voices of Video - Cloud Gaming being Real
VOICES OF VIDEO
Scalable distribution in the age of DRM: Key Challenges and Implications.
Watch the full conversation on YouTube: https://youtu.be/s_afoa71muM
Evolution of Video Codecs and Streaming Protocols
The evolution of video codecs and streaming protocols has played a vital role in shaping the streaming landscape. In the early 2000s, the popular video codecs for streaming were VC-1 (supported by Silverlight) and H.264 (supported by Flash). However, the introduction of HTML5 posed challenges for streaming solutions, as the HTML specification lacked the necessary APIs to provide the required level of control and functionality for streaming.
Silverlight and Flash emerged as proprietary plugins that advanced streaming technology beyond what HTML could offer at the time. They provided opportunities to overcome HTML’s limitations and introduced features such as media stream sources and content protection (DRM) to enhance the streaming experience. Silverlight’s media stream source concept, which later influenced HTML’s media source extensions, allowed developers to handle their own segment downloading and parsing, passing the video and audio streams to a media buffer for decoding and rendering. Content protection was a crucial aspect addressed by Silverlight and Flash, as HTML lacked a robust solution for DRM.
Around 2011-2012, Silverlight and Flash gradually phased out as HTML5 matured, offering the necessary APIs for implementing streaming protocols like DASH, HLS, and Smooth Streaming within the browser while incorporating DRM capabilities. HTML5 overcame initial growing pains and established itself as the predominant platform for streaming. By 2014-2015, HTML5 had evolved sufficiently to support basic streaming functionalities and content protection with DRM.
Optimizing Encoding Quality and Cost
Achieving optimal encoding quality while considering cost is a crucial concern for content creators and distributors. At Warner Brothers Discovery, the x264 and x265 codecs are commonly used for transcoding purposes, employing the slow or slower presets to achieve higher quality outputs. This approach balances encoding cost with desired video quality.
Recent discussions within the organization have prompted exploration into the idea of customizing presets based on specific resolutions and content complexities. The focus is on optimizing encoding efficiency by adjusting presets according to the intricacy of the content and the resolution being processed. Different resolutions have varying encoding requirements, and applying the very slow preset to all resolutions may result in unnecessary computational overhead for lower resolutions. Similarly, content complexity plays a role in determining the appropriate preset, as not all content requires the very slow preset. Customizing presets based on resolution and content characteristics allows for more efficient allocation of computational resources.
The popularity and viewership of specific content also factor into the choice of preset. Content with a larger audience may benefit from the slower preset due to potential CDN savings resulting from improved video quality. On the other hand, smaller-scale content with fewer viewers may not necessitate the same level of complexity in encoding. Balancing encoding quality and cost requires thoughtful consideration of these factors.
Adaptive Encoding Ladders: Variations, Frame Rates, and Device Considerations
Adaptive encoding ladders play a crucial role in delivering content based on source resolution and frame rate. At Warner Brothers Discovery, these encoding ladders consist of approximately six to eight different variations, allowing flexibility in content delivery. The source resolution determines the stopping point within the UHD ladder, minimizing the need for multiple permutations of the ladders themselves.
Variations in frame rates necessitate different encoding ladders. The introduction of high frame rates, especially with reality TV content, requires separate encoding ladders to preserve the temporal resolution. Encoding ladders also differ for SDR and HDR content, with distinctions made between HDR10 and Dolby Vision 5, offering specific encoding settings for each.
While currently the same encoding ladders are used for all devices, specific subsets of the ladder may be delivered to certain devices to accommodate their capabilities. Device differentiation is particularly important for high frame rates or resolutions above 1080p. By intentionally capping the manifest delivered to devices that cannot handle certain capabilities, compatibility and optimal viewing experiences can be ensured. Differentiating encoding ladders for various devices is essential for maintaining consistent quality across different devices.
VBR Control, Per-Title Encoding, and DRM Considerations in Video Encoding
Video encoding involves crucial considerations such as VBR control, per-title encoding, and DRM integration. At Warner Brothers Discovery, the x264 and x265 codecs employ a CRF (Constant Rate Factor) rate control with a bitrate and buffer cap for VBR (Variable Bit Rate) encoding. This approach ensures control over codec levels, peak rates, and overall encoding quality.
VBR control is achieved by using VBV (Video Buffering Verifier) buffer size and VBV max rate parameters. These parameters allow for setting the highest average bitrate for the video, while CRF brings the average bitrate below the specified max rate in most cases. This method enables per-title encoding, achieving CDN savings without compromising quality. Differentiating encoding ladders based on resolutions, frame rates, and HDR formats is essential to conform to content licensing agreements and compatibility requirements.
DRM has a significant impact on the encoding ladder. Licensing agreements often demand different security levels for various resolutions, necessitating the assignment of different encryption keys and playback policies to different security groups. The use of hardware-backed DRM, such as Widevine L1 and PlayReady SL3000, is often required for higher resolutions. The trend in the industry is moving towards increased use of DRM across the entire encoding ladder, with a focus on stricter requirements for HDR content. Content licensing agreements are evolving to require comprehensive DRM implementation for improved content protection.
Exploring Hardware and Software DRM: Implementation and Impact on Video Streaming
The choice between hardware and software DRM implementations has implications for video streaming security and performance. Hardware DRM involves integrating DRM clients into the secure video path of the system, tightly coupling with the hardware decoder. This ensures secure decoding and decryption of video streams, preventing unauthorized access to the content. Hardware-based DRM establishes a secure video path or secure media path, where the decrypted and decoded bits cannot be retrieved or accessed by applications. This level of security is achieved through close integration with the hardware decoder, ensuring protection throughout the entire decoding process.
On the other hand, software DRM performs decoding and decryption in software, introducing a potential vulnerability where the decoded bits could be compromised or accessed by unauthorized parties. Software DRM lacks the same level of hardware integration and security provided by hardware-based DRM.
The limitations of software-based DRM can impact the resolution of premium content when viewing it on certain platforms or browsers without hardware support. For example, Chrome’s support for Widevine DRM is limited to L3, the software-based implementation. This can result in inferior video quality compared to browsers like Edge or Safari, which support hardware DRM, allowing for a more secure video path and higher quality streaming.
Unifying Packaging Formats: HLS, DASH, and CMAF in Video Streaming
Standardizing packaging formats is crucial for compatibility and interoperability in video streaming. Warner Brothers Discovery and Hulu have been utilizing both HLS (HTTP Live Streaming) and DASH (Dynamic Adaptive Streaming over HTTP) for content distribution. HLS is predominantly used for Apple devices, while DASH is employed for other devices.
The commonality between HLS and DASH lies in their utilization of the CMAF (Common Media Application Format) standard. CMAF serves as a standardized version of fragmented MP4 (fMP4), specifying the necessary boxes and encryption application for fMP4 media segments used in HLS and DASH. CMAF is not a streaming protocol itself but encompasses two components.
Firstly, it defines a refined version of fMP4 for HLS and DASH, establishing a more precise set of guidelines for compatibility. Many existing HLS and DASH implementations using fMP4 media segments are already CMAF-compliant.
Secondly, CMAF specifies a hypothetical logical media presentation model, outlining the relationship between tracks, segments, fragments, and chunks. This model closely resembles HLS or DASH without explicitly using those terms. It provides a framework for addressing different levels of the media presentation.
HLS and DASH can be considered as the physical implementations of the logical media presentation model described by CMAF. The HLS-DASH interoperability specification, such as CTA 5005, heavily relies on CMAF, serving as a unifying model and describing how both HLS and DASH integrate with CMAF. This unification allows for similar concepts to be described across both formats, enhancing compatibility and simplifying the streaming ecosystem.
Exploring Hardware and Software DRM: Implementation and Impact on Video Streaming
The streaming industry faces challenges related to content publishing and compatibility across diverse platforms and devices. The Consumer Technology Association (CTA) plays a crucial role in addressing these challenges and streamlining content publishing processes. The CTA is actively working to enhance interoperability within the streaming industry, allowing publishers to focus primarily on content development rather than compatibility concerns.
The CTA’s WAVE initiative serves as a platform for fostering efforts to streamline content publishing and compatibility. One major challenge in the streaming landscape is the presence of numerous application development platforms. For example, within Warner Brothers Discovery, there are approximately a dozen or 16 different application development platforms utilized for their streaming service, with some overlap between certain platforms such as Android TV and Fire TV.
Developers often encounter the unique scenario of building multiple versions of the same application in various programming languages using different platform APIs. This complexity arises due to the diversity of devices and platforms requiring tailored applications. This situation is unparalleled compared to other industries where typically a web app, iOS app, and Android app cover the majority of development needs.
The multitude of application development platforms poses challenges in areas such as encoding and packaging. Determining device capabilities becomes arduous without a standardized specification or set of APIs that can provide consistent and reliable information across different platforms.
The standardization of device media capabilities detection APIs is a crucial step towards enhancing compatibility in the streaming industry. Efforts within the World Wide Web Consortium (W3C) to define these APIs in HTML are underway. However, it is important to note that not all platforms utilize HTML, necessitating the presence of similar APIs across all platforms. Once standardized APIs for media capabilities detection are established, developing a standardized method for signaling these capabilities to servers becomes essential. This facilitates targeting specific devices based on their capabilities and enables actions such as manifest filtering.
Standardization efforts are vital for simplifying content publishing and enhancing compatibility in the streaming industry. By establishing standardized specifications and APIs, the industry can overcome compatibility challenges and streamline the development and distribution of streaming content.
The Leverage Is Imperative
The evolution of streaming technologies has brought about significant advancements in digital media distribution and delivery. Pioneers like Alex Zambelli have played a crucial role in driving innovation and pushing the boundaries of what is possible in online streaming. The implementation of multi-view camera angles, considerations in livestreaming, advancements in video codecs and streaming protocols, and optimization of encoding quality and cost are key areas that shape the streaming landscape. Standardization efforts, hardware and software DRM implementations, and the role of organizations like the CTA further contribute to enhancing compatibility and simplifying content publishing in the streaming industry. As the streaming industry continues to evolve, leveraging these advancements and best practices is imperative to deliver high-quality, seamless streaming experiences to audiences worldwide.