<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	 xmlns:media="http://search.yahoo.com/mrss/" >

<channel>
	<title>Jameson Steiner &#8211; Bitmovin</title>
	<atom:link href="https://bitmovin.com/author/jameson-steiner/feed" rel="self" type="application/rss+xml" />
	<link>https://bitmovin.com</link>
	<description>Bitmovin provides adaptive streaming infrastructure for video publishers and integrators. Fastest cloud encoding and HTML5 Player. Play Video Anywhere.</description>
	<lastBuildDate>Fri, 06 Jan 2023 09:59:41 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://bitmovin.com/wp-content/uploads/2023/11/bitmovin_favicon.svg</url>
	<title>Jameson Steiner &#8211; Bitmovin</title>
	<link>https://bitmovin.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Video Tech Deep-Dive: Live Low Latency Streaming Part 3 &#8211; Low-Latency HLS</title>
		<link>https://bitmovin.com/live-low-latency-hls</link>
		
		<dc:creator><![CDATA[Jameson Steiner]]></dc:creator>
		<pubDate>Mon, 10 Aug 2020 09:50:09 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[live encoding]]></category>
		<category><![CDATA[low-latency]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=122639</guid>

					<description><![CDATA[<p>This blog post is the final piece of our Live Low-Latency Streaming series, where we previously covered the basic principles of low-latency streaming in OTT and LL-DASH. This final post focuses on latency when using Apple’s HTTP Live Streaming (HLS) protocol and how the latency time can be reduced. This article assumes that you are...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/live-low-latency-hls">Video Tech Deep-Dive: Live Low Latency Streaming Part 3 &#8211; Low-Latency HLS</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><img fetchpriority="high" decoding="async" class="aligncenter size-large wp-image-122660" src="https://bitmovin.com/wp-content/uploads/2020/08/Blog-live-low-latency-streaming-p.3-1-1024x512.jpg" alt="- Bitmovin" width="1024" height="512"><br />
<span style="font-weight: 400;">This blog post is the final piece of our Live Low-Latency Streaming series. The previous posts covered the basic principles of low-latency streaming in OTT and LL-DASH. This final post focuses on latency when using </span><a href="https://developer.apple.com/streaming/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Apple’s HTTP Live Streaming</span></a><span style="font-weight: 400;"> (HLS) protocol and how that latency can be reduced. This article assumes that you are already familiar with the basics of HLS and its manifest/playlist mechanics. You can view the first two posts below:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;"><a href="https://bitmovin.com/live-low-latency-streaming-p1/">Part 1 Fundamentals</a> </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;"><a href="https://bitmovin.com/live-low-latency-streaming-p2/">Part 2 Chunked delivery with CMAF in MPEG-DASH</a> </span></li>
</ul>
<h2><span style="font-weight: 400;">Why is latency high in HLS?</span></h2>
<p><span style="font-weight: 400;">In its current specification, HLS favors stream reliability over latency. Higher latency is accepted in exchange for stable playback without interruptions. In section</span> <a href="https://tools.ietf.org/html/rfc8216#section-6.3.3" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">6.3.3. Playing the Media Playlist File</span></a><span style="font-weight: 400;">, the HLS specification states that a playback client</span></p>
<blockquote><p><span style="font-weight: 400;">SHOULD NOT choose a segment that starts less than three target durations from the end of the playlist file</span></p></blockquote>
<p><img decoding="async" class="aligncenter wp-image-122641" src="https://bitmovin.com/wp-content/uploads/2020/08/Low-Latency-HLS_Earliest-stream-segment-to-join_linear-visual.jpg" alt="Low Latency HLS _Earliest stream segment to join_linear visual" width="720" height="346" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Low-Latency-HLS_Earliest-stream-segment-to-join_linear-visual.jpg?size=144x69&amp;lossy=2&amp;strip=1&amp;webp=1 144w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Low-Latency-HLS_Earliest-stream-segment-to-join_linear-visual-300x144.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Low-Latency-HLS_Earliest-stream-segment-to-join_linear-visual.jpg?size=432x208&amp;lossy=2&amp;strip=1&amp;webp=1 432w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Low-Latency-HLS_Earliest-stream-segment-to-join_linear-visual.jpg?lossy=2&amp;strip=1&amp;webp=1 512w" sizes="(max-width: 720px) 100vw, 720px" /><br />
<span style="font-weight: 400;">Honoring this requirement results in having a latency of at least 3 target durations. Given typical target durations for current HLS deployments of 10 or </span><a href="https://developer.apple.com/documentation/http_live_streaming/hls_authoring_specification_for_apple_devices#2969514" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">6 seconds</span></a><span style="font-weight: 400;">, we would end up with a latency of at least 30 or 18 seconds, which is far from low. Even if we choose to ignore the above requirement, the fact that segments are typically produced, transferred, and consumed in their entirety poses a high risk of buffer underruns and subsequent playback interruptions, as described in more detail in the first part of this blog series.</span><br />
<span style="font-weight: 400;">The HLS media playlist for the live stream depicted above would look something like this:</span><br />
</p>
<pre><img decoding="async" class="aligncenter wp-image-122644 size-full" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.23.21.png" alt="Low-Latency HLS _HLS media playlist call request_code screenshot" width="368" height="310" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.23.21-300x253.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.23.21.png?lossy=2&amp;strip=1&amp;webp=1 368w" sizes="(max-width: 368px) 100vw, 368px" /></pre>
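<p>The three-target-duration rule quoted above can be sketched in a few lines of Python. This is a minimal illustration, not a player implementation: the function name and the list-of-durations input are assumptions made for this example, and a real client would read the durations from the EXTINF entries of the playlist.</p>

```python
from typing import List, Optional


def newest_allowed_start_index(durations: List[float], target_duration: float) -> Optional[int]:
    """Return the index of the newest segment a client may start playback with.

    RFC 8216, section 6.3.3: a client SHOULD NOT choose a segment that starts
    less than three target durations from the end of the playlist file.
    """
    required_distance = 3 * target_duration
    distance_from_end = 0.0
    # Walk backwards from the live edge, accumulating segment durations.
    for index in range(len(durations) - 1, -1, -1):
        distance_from_end += durations[index]
        if distance_from_end >= required_distance:
            return index
    return None  # the playlist is too short to honor the rule


def minimum_latency_seconds(target_duration: float) -> float:
    """Lower bound on latency when the rule is honored."""
    return 3 * target_duration
```

<p>With six segments of 6 seconds each, the newest permitted starting point is the fourth segment (index 3), which puts playback at least 18 seconds behind the live edge.</p>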
<h2><span style="font-weight: 400;">Road to Low-Latency HLS</span></h2>
<p><span style="font-weight: 400;">In 2017, Periscope, the most popular platform for live streaming of user-generated content at the time, investigated streaming solutions to replace its hybrid RTMP- and HLS-based approach with a more scalable one. The requirement was to offer end-to-end latency similar to RTMP, but in a more cost-effective way, given that its use case was streaming to large audiences. Periscope presented </span><a href="https://medium.com/@periscopecode/introducing-lhls-media-streaming-eb6212948bef" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">their solution</span></a><span style="font-weight: 400;"> to the high-latency problem, which took Apple’s HLS protocol, made two fundamental changes, and called the result Low-Latency HLS (LHLS):</span></p>
<ol>
<li style="font-weight: 400;"><span style="font-weight: 400;">Segments are delivered using HTTP/1.1 Chunked Transfer Coding</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Segments are advertised in the HLS playlist before they are available</span></li>
</ol>
<p><span style="font-weight: 400;">If you read our previous blog posts about Low-Latency streaming, you might recognize these simple concepts as being the key ingredients for today’s OTT-based Low-Latency streaming approaches, like LL-DASH. Periscope’s work likely sparked and influenced the following developments around low-latency streaming such as LL-DASH and a </span><a href="https://github.com/video-dev/hlsjs-rfcs/blob/a6e7cc44294b83a7dba8c4f45df6d80c4bd13955/proposals/0001-lhls.md" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">community-driven initiative</span></a><span style="font-weight: 400;"> for defining modifications to HLS aiming to reduce streaming latency that started at the end of 2018. </span><br />
<span style="font-weight: 400;">The core of the community proposal for LHLS was the same as the aforementioned concepts. Segments should be loaded in chunks using HTTP chunked transfer encoding (CTE), and early availability of incomplete segments should be signaled using a new </span><span style="font-weight: 400;">#EXT-X-PREFETCH</span><span style="font-weight: 400;"> tag in the playlist. In the example below, the client can already load and consume the currently available data of </span><span style="font-weight: 400;">6.ts</span><span style="font-weight: 400;"> and continue to do so as the chunks become available over time. Furthermore, the request for the segment </span><span style="font-weight: 400;">7.ts</span><span style="font-weight: 400;"> can be made early on to save network round-trip time, even though its production has not started yet. It is also worth mentioning that the LHLS proposal preserves full backward compatibility, allowing standard HLS clients to consume such streams. This was the gist of the proposed implementation; you can find the full proposal in the </span><a href="https://github.com/video-dev/hlsjs-rfcs/blob/a6e7cc44294b83a7dba8c4f45df6d80c4bd13955/proposals/0001-lhls.md" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">hlsjs-rfcs GitHub repository</span></a><span style="font-weight: 400;">.</span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-122645 size-full" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.24.30.png" alt="Low-Latency HLS _LHLS modification proposal_code screenshot" width="365" height="363" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.24.30-150x150.png?lossy=2&amp;strip=1&amp;webp=1 150w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.24.30-300x298.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.24.30.png?lossy=2&amp;strip=1&amp;webp=1 365w" sizes="(max-width: 365px) 100vw, 365px" /><br />
<span style="font-weight: 400;">Individuals across several companies in the media industry came together to work on this proposal in the hope that Apple, being the driving force behind HLS, would also join in and work the proposal into the official HLS specification. However, things turned out very differently than expected: Apple presented its own preliminary specification, which takes a very different approach, at its 2019 Worldwide Developers Conference.</span><br />
<span style="font-weight: 400;">Despite LHLS being (and staying) a proprietary approach, some companies, like Twitch, are successfully using it in their production systems.</span></p>
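<p>To make the prefetch idea from the community proposal more concrete, here is a small Python sketch that separates completed segments from prefetch entries in a playlist. The sample playlist is hypothetical and only mirrors the 6.ts and 7.ts example described above; the function name is an assumption for illustration, and the tag syntax (the tag name followed by a colon and a URI) should be checked against the linked proposal.</p>

```python
from typing import List, Tuple

# Hypothetical LHLS playlist: 4.ts and 5.ts are complete, 6.ts is still being
# produced, and production of 7.ts has not started yet.
SAMPLE_LHLS_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:2
#EXT-X-MEDIA-SEQUENCE:4
#EXTINF:2.0,
4.ts
#EXTINF:2.0,
5.ts
#EXT-X-PREFETCH:6.ts
#EXT-X-PREFETCH:7.ts
"""


def split_lhls_playlist(playlist_text: str) -> Tuple[List[str], List[str]]:
    """Return (complete_segment_uris, prefetch_uris) for a media playlist."""
    complete = []
    prefetch = []
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXT-X-PREFETCH:"):
            # Everything after the first colon is the URI of the early segment.
            prefetch.append(line.split(":", 1)[1].strip())
        elif line and not line.startswith("#"):
            # Non-tag, non-empty lines are URIs of fully available segments.
            complete.append(line)
    return complete, prefetch
```

<p>A standard HLS client ignores the unknown tag and only sees the complete segments, which is what keeps such a playlist backward compatible.</p>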
<h2><span style="font-weight: 400;">Apple’s Low-Latency HLS</span></h2>
<p><span style="font-weight: 400;">In this section we’ll cover the principles of Apple’s preliminary specification for Low-Latency HLS.</span></p>
<h3><span style="font-weight: 400;">Generation of Partial Media Segments</span></h3>
<p><span style="font-weight: 400;">While HLS content is split into individual segments, in low-latency HLS each segment further consists of </span><i><span style="font-weight: 400;">parts </span></i><span style="font-weight: 400;">that are independently addressable by the client. For example, a segment of 6 seconds can consist of 30 parts of 200 ms duration each. Depending on the </span><a href="https://go.bitmovin.com/ultimate-guide-container-formats"><span style="font-weight: 400;">container format</span></a><span style="font-weight: 400;">, such parts can represent CMAF chunks or a sequence of TS packets. This partitioning of segments decouples the end-to-end latency from the long segment duration and allows the client to load parts of a segment as soon as they become available. In LL-DASH, by comparison, this is achieved using HTTP CTE; however, the MPD does not advertise individual parts/chunks of segments.</span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-122646 size-full" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.28.32.png" alt="Partial media segment generation in Low-Latency HLS _code screenshot" width="515" height="525" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.28.32-294x300.png?lossy=2&amp;strip=1&amp;webp=1 294w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.28.32.png?size=384x391&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.28.32.png?lossy=2&amp;strip=1&amp;webp=1 515w" sizes="(max-width: 515px) 100vw, 515px" /><br />
<span style="font-weight: 400;">Partial segments are advertised using a new</span><span style="font-weight: 400;"> EXT-X-PART</span><span style="font-weight: 400;"> tag. Note that partial segments are only advertised for the most recent segments in the playlist. Furthermore, both the partial segments (</span><span style="font-weight: 400;">filePart272.x.mp4</span><span style="font-weight: 400;">) and the respective full segment (</span><span style="font-weight: 400;">fileSequence272.mp4</span><span style="font-weight: 400;">) are offered.</span><br />
<span style="font-weight: 400;">Partial segments can also reference the same file but at different byte ranges. Clients can thereby load multiple partial segments with a single request and save round-trips compared to making separate requests for each part (as seen below).</span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-122647 size-full" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.30.38.png" alt="Low-Latency HLS_byterange variations for partial segment requests_code screenshot" width="571" height="60" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.30.38-300x32.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.30.38.png?size=384x40&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.30.38.png?lossy=2&amp;strip=1&amp;webp=1 571w" sizes="(max-width: 571px) 100vw, 571px" /></p>
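<p>As a rough sketch of how a client might read partial segments from a playlist, the Python snippet below parses EXT-X-PART attribute lists and merges contiguous byte ranges into a single HTTP Range header value. The sample playlist is hypothetical (it only reuses the file naming from the text above), the helper names are assumptions, and the byte-range helper assumes every range carries an explicit offset.</p>

```python
import re
from typing import Dict, List

# Matches KEY=value pairs where the value is either quoted or runs to the next comma.
ATTRIBUTE_PATTERN = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

# Hypothetical playlist tail for illustration only.
SAMPLE_PLAYLIST = """\
#EXTINF:6.0,
fileSequence271.mp4
#EXT-X-PART:DURATION=0.2,URI="filePart272.0.mp4",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.2,URI="filePart272.1.mp4"
#EXT-X-PART:DURATION=0.2,URI="fileSequence273.mp4",BYTERANGE="20000@0"
#EXT-X-PART:DURATION=0.2,URI="fileSequence273.mp4",BYTERANGE="23000@20000"
"""


def parse_attribute_list(text: str) -> Dict[str, str]:
    """Turn 'A=1,B="x"' into {'A': '1', 'B': 'x'}."""
    return {key: value.strip('"') for key, value in ATTRIBUTE_PATTERN.findall(text)}


def parse_parts(playlist_text: str) -> List[Dict[str, str]]:
    """Collect the attributes of every EXT-X-PART tag in playlist order."""
    prefix = "#EXT-X-PART:"
    parts = []
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if line.startswith(prefix):
            parts.append(parse_attribute_list(line[len(prefix):]))
    return parts


def combined_range_header(byteranges: List[str]) -> str:
    """Build one Range header value covering contiguous length@offset ranges."""
    first_byte = None
    next_byte = None
    for byterange in byteranges:
        length_text, offset_text = byterange.split("@")
        offset = int(offset_text)
        if first_byte is None:
            first_byte = offset
        elif offset != next_byte:
            raise ValueError("byte ranges are not contiguous")
        next_byte = offset + int(length_text)
    if first_byte is None:
        raise ValueError("no byte ranges given")
    return f"bytes={first_byte}-{next_byte - 1}"
```

<p>Requesting the combined range fetches both parts of the last segment in one round-trip instead of two.</p>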
<h3><span style="font-weight: 400;">Preload hints and blocking of Media downloads</span></h3>
<p><span style="font-weight: 400;">Soon to be available partial segments are advertised prior to their actual availability in the playlist by a new</span><span style="font-weight: 400;"> EXT-X-PRELOAD-HINT tag</span><span style="font-weight: 400;">. This enables clients to open a request early and the server will respond once the data becomes available. This way the client can “save” the round-trip time for the request.</span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-122648 size-full" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.32.28.png" alt="Low-Latency HLS _Preload hints for media segments_code screenshot" width="513" height="73" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.32.28-300x43.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.32.28.png?size=384x55&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.32.28.png?lossy=2&amp;strip=1&amp;webp=1 513w" sizes="(max-width: 513px) 100vw, 513px" /></p>
<h3><span style="font-weight: 400;">Playlist Delta Updates</span></h3>
<p><span style="font-weight: 400;">Clients have to refresh HLS playlists more frequently for low-latency HLS. Playlist Delta Updates can be used to reduce the amount of data transferred for each playlist request. A new</span><span style="font-weight: 400;"> EXT-X-SKIP</span><span style="font-weight: 400;"> tag replaces the content of the playlist that the client already received with a previous request.</span></p>
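<p>The following Python sketch shows one way a client could rebuild the full segment list from a delta update. It assumes the client has already parsed the media sequence number, the SKIPPED-SEGMENTS count of the EXT-X-SKIP tag, and the listed URIs; the function name and the dictionary cache are assumptions made for this example.</p>

```python
from typing import Dict, List, Tuple


def apply_delta_update(
    previous: Dict[int, str],
    media_sequence: int,
    skipped_segments: int,
    listed_uris: List[str],
) -> List[Tuple[int, str]]:
    """Rebuild (media sequence number, URI) pairs from a delta update.

    previous maps media sequence numbers to URIs from the last full playlist.
    """
    merged = []
    # The skipped entries must be filled in from what the client already knows.
    for offset in range(skipped_segments):
        number = media_sequence + offset
        if number not in previous:
            # The cache is too old; a real client would reload the full playlist.
            raise KeyError(f"segment {number} is missing from the cached playlist")
        merged.append((number, previous[number]))
    # Segments listed after the skip tag continue the numbering.
    first_listed = media_sequence + skipped_segments
    for offset, uri in enumerate(listed_uris):
        merged.append((first_listed + offset, uri))
    return merged
```

<p>If any skipped segment is missing from the cache, the sketch raises an error rather than guessing, since the client cannot reconstruct the playlist from the delta alone.</p>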
<h3><span style="font-weight: 400;">Blocking of Playlist reload</span></h3>
<p><span style="font-weight: 400;">The discovery of new segments becoming available for an HLS live stream is usually achieved by the client reloading the playlist file at regular intervals and checking for newly appended segments. In the case of low-latency streaming, it is desirable to avoid any delay between a (partial) segment becoming available in the playlist and the client discovering its availability. With the playlist reloading approach, such discovery delay can be as high as the reload interval in the worst case.</span><br />
<span style="font-weight: 400;">With the new feature of blocking playlist reloads, clients can specify which future segment’s availability they are awaiting, and the server holds on to that playlist request until that specific segment becomes available in the playlist. The awaited segment is specified using a query parameter on the playlist request.</span></p>
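<p>A blocking playlist request can be sketched as follows in Python. The query parameter names _HLS_msn and _HLS_part are the ones defined by Apple’s low-latency HLS specification; the function name and the example URL in the usage note are assumptions for illustration.</p>

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blocking_reload_url(playlist_url: str, msn: int, part: Optional[int] = None) -> str:
    """Return the playlist URL that asks the server to block until the given
    media sequence number (and optionally part index) is in the playlist."""
    scheme, netloc, path, query, fragment = urlsplit(playlist_url)
    # Keep unrelated query parameters but drop stale delivery directives.
    params = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if not key.startswith("_HLS_")
    ]
    params.append(("_HLS_msn", str(msn)))
    if part is not None:
        params.append(("_HLS_part", str(part)))
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))
```

<p>For example, calling the function with https://example.com/live/video.m3u8, media sequence number 273, and part 2 yields a URL that the server would answer only once part 2 of segment 273 has been published.</p>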
<h3><span style="font-weight: 400;">Rendition Reports</span></h3>
<p><span style="font-weight: 400;">When playing at low latencies, fast bitrate adaptation is crucial to avoid playback interruptions due to buffer underruns. To save round-trips during playlist switching, playlists must contain rendition reports via a new</span><span style="font-weight: 400;"> EXT-X-RENDITION-REPORT</span><span style="font-weight: 400;"> tag that informs about the most recent segment and part in the respective rendition.</span><br />
<img loading="lazy" decoding="async" class="aligncenter size-full wp-image-122649" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.34.11.png" alt="- Bitmovin" width="559" height="40" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.34.11-300x21.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.34.11.png?size=384x27&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-10-at-11.34.11.png?lossy=2&amp;strip=1&amp;webp=1 559w" sizes="(max-width: 559px) 100vw, 559px" /></p>
<h2><span style="font-weight: 400;">Conclusion</span></h2>
<p><span style="font-weight: 400;">For more detailed information on Apple’s low-latency HLS take a look at the </span><a href="https://developer.apple.com/documentation/http_live_streaming/protocol_extension_for_low-latency_hls_preliminary_specification" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Preliminary Specification</span></a><span style="font-weight: 400;"> and the </span><a href="https://datatracker.ietf.org/doc/draft-pantos-hls-rfc8216bis/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">latest IETF draft</span></a><span style="font-weight: 400;"> containing low-latency extensions for HLS.</span><br />
<span style="font-weight: 400;">In conclusion, low-latency HLS increases complexity quite significantly compared to standard HLS. The server’s responsibilities expand from simply serving segments to supporting several additional mechanisms that clients use to save network round-trips and speed up segment delivery, which ultimately enables lower end-to-end latency. Considering that the specification remains subject to change and is yet to be finalized, it might still take a while until streaming vendors pick it up and we finally see low-latency HLS in the wild. In short, live low-latency streaming using HLS is possible, but at a large cost in server complexity. Measures are being developed to reduce that complexity and the server load, but it will take wider adoption by major stream providers for this to happen.</span></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Video Tech Deep-Dive: Live Low Latency Streaming Part 2</title>
		<link>https://bitmovin.com/live-low-latency-streaming-p2</link>
		
		<dc:creator><![CDATA[Jameson Steiner]]></dc:creator>
		<pubDate>Thu, 25 Jun 2020 12:42:01 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[live encoding]]></category>
		<category><![CDATA[low-latency]]></category>
		<category><![CDATA[video player]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=118091</guid>

					<description><![CDATA[<p>This blog post is a continuation of an ongoing technical deep-dive series of blog posts and webinars. You can find the first blog post here. The first post covered the fundamentals of live low latency and defined chunked delivery methods with CMAF. This blog post expands on chunked CMAF delivery by explaining its application with MPEG-DASH to achieve low...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/live-low-latency-streaming-p2">Video Tech Deep-Dive: Live Low Latency Streaming Part 2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="aligncenter wp-image-118093 size-large" src="https://bitmovin.com/wp-content/uploads/2020/06/Blog-live-low-latency-streaming-p.2-1-1024x512.png" alt="- Bitmovin" width="1024" height="512"><br />
<span style="font-weight: 400;">This blog post is a continuation of an ongoing technical deep-dive series of blog posts and webinars. You can find the </span><a href="https://bitmovin.com/live-low-latency-streaming-p1/"><span style="font-weight: 400;">first blog post here</span></a>.<span style="font-weight: 400;"> The first post covered the fundamentals of live low latency and defined chunked delivery methods with CMAF.</span><br />
<span style="font-weight: 400;">This blog post expands on chunked CMAF delivery by explaining its application with MPEG-DASH to achieve low latency. We’ll lay some foundations and cover the basic approaches behind low-latency DASH, then look into what future developments are expected, as low-latency streaming is a heavily researched subject and is quickly becoming a media industry standard.</span></p>
<h2><span style="font-weight: 400;">Basics of MPEG-DASH Live Streaming</span></h2>
<p><span style="font-weight: 400;">Before diving into how Low Latency Streaming works in MPEG-DASH we first need to understand some basic stream mechanics of DASH live streams, most importantly, the concept of segment availability.</span><br />
<span style="font-weight: 400;">The </span><a href="https://bitmovin.com/dynamic-adaptive-streaming-http-mpeg-dash/#MPD"><span style="font-weight: 400;">DASH Media Presentation Description</span></a><span style="font-weight: 400;"> (MPD) is an XML document containing essential metadata of a DASH stream. Among many other things, it describes which segments a stream consists of and how a playback client can obtain them. The main difference between on-demand and live streams in DASH is that all segments of an on-demand stream are available at all times, whereas the segments of a live stream are produced continuously, one after another, as time progresses. Every time a new segment is produced, its availability is signaled to playback clients through the MPD. It is important to note that a segment is only made available once it is fully encoded and written to the origin.</span></p>
<figure id="attachment_118092" aria-describedby="caption-attachment-118092" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-large wp-image-118092" src="https://bitmovin.com/wp-content/uploads/2020/06/Live-Low-Latency-Segment-AvailabilityTime-1-1024x493.png" alt="Live Low Latency-Segment Availability:Time" width="1024" height="493" /><figcaption id="caption-attachment-118092" class="wp-caption-text">Fig. 1 Live stream with template-based addressing scheme (simplified)</figcaption></figure>
<p><span style="font-weight: 400;">The MPD would specify the start of the stream availability (i.e. the Availability Start Time) and a constant segment duration, e.g. 2 seconds. Using these values the player can calculate how many segments are currently in the availability window and also their individual availability start time. For example, the segment availability start time for the second segment would be </span><span style="font-weight: 400;">AST + segment_duration * 2</span><span style="font-weight: 400;">.</span></p>
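<p>The availability calculation described above can be sketched in Python as follows. This is a simplified illustration that assumes template-based addressing with a constant segment duration and a start number of 1; the function names are made up for this example.</p>

```python
from datetime import datetime, timedelta
from typing import Optional


def segment_availability_start(
    ast: datetime, segment_duration: float, segment_number: int, start_number: int = 1
) -> datetime:
    """A segment becomes available once it is fully produced, i.e. at its end time."""
    completed_segments = segment_number - start_number + 1
    return ast + timedelta(seconds=segment_duration * completed_segments)


def newest_available_segment(
    ast: datetime, segment_duration: float, now: datetime, start_number: int = 1
) -> Optional[int]:
    """Return the number of the newest fully available segment, if any."""
    elapsed = (now - ast).total_seconds()
    completed_segments = int(elapsed // segment_duration)
    if completed_segments >= 1:
        return start_number + completed_segments - 1
    return None  # the first segment has not been completed yet
```

<p>With 2-second segments, the second segment becomes available 4 seconds after the Availability Start Time, which matches the formula above.</p>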
<h2><span style="font-weight: 400;">Low Latency Streaming with MPEG-DASH</span></h2>
<p><span style="font-weight: 400;">In the first part of this blog post series, we described how chunked encoding and transfer enables partial loads and consumption of segments that are still in the process of being encoded. To make a player aware of this action, the segment availability in the MPD is adjusted to signal an earlier availability, i.e. when the first chunk is complete. This is done using the </span><span style="font-weight: 400;">availabilityTimeOffset</span><span style="font-weight: 400;"> in the MPD. As a result, the player will not wait for a segment to be fully available and will load and consume it earlier.</span><br />
<span style="font-weight: 400;">Consider the example of Fig. 1 with a segment duration of 2 seconds and a chunk duration of </span><span style="font-weight: 400;">0.033 seconds (i.e. the duration of one video frame at 29.97 fps). To signal the segment availability once the first chunk is completed, we would set the </span><span style="font-weight: 400;">availabilityTimeOffset</span><span style="font-weight: 400;"> to 1.967 seconds </span><span style="font-weight: 400;">(segment_duration - chunk_duration)</span><span style="font-weight: 400;">. This signals that the greyed-out segment in Fig. 1 is partially available.</span><br />
<span style="font-weight: 400;">The below MPD represents this example:</span></p>
<pre><span style="font-weight: 400;">&lt;?xml version="1.0" encoding="utf-8"?&gt;</span>
<span style="font-weight: 400;">&lt;MPD</span>
<span style="font-weight: 400;">  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"</span>
<span style="font-weight: 400;">  xmlns="urn:mpeg:dash:schema:mpd:2011"</span>
<span style="font-weight: 400;">  xmlns:xlink="http://www.w3.org/1999/xlink"</span>
<span style="font-weight: 400;">  xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"</span>
<span style="font-weight: 400;">  profiles="urn:mpeg:dash:profile:isoff-live:2011"</span>
<span style="font-weight: 400;">  type="dynamic"</span>
<span style="font-weight: 400;">  minimumUpdatePeriod="PT500S"</span>
<span style="font-weight: 400;">  suggestedPresentationDelay="PT2S"</span>
<span style="font-weight: 400;">  availabilityStartTime="2019-08-20T05:00:03Z"</span>
<span style="font-weight: 400;">  publishTime="2019-08-20T12:42:07Z"</span>
<span style="font-weight: 400;">  minBufferTime="PT2.0S"&gt;</span>
<span style="font-weight: 400;">  &lt;Period start="PT0.0S"&gt;</span>
<span style="font-weight: 400;">    &lt;AdaptationSet</span>
<span style="font-weight: 400;">      contentType="video"</span>
<span style="font-weight: 400;">      segmentAlignment="true"</span>
<span style="font-weight: 400;">      bitstreamSwitching="true"</span>
<span style="font-weight: 400;">      frameRate="30000/1001"&gt;</span>
<span style="font-weight: 400;">      &lt;Representation</span>
<span style="font-weight: 400;">        id="0"</span>
<span style="font-weight: 400;">        mimeType="video/mp4"</span>
<span style="font-weight: 400;">        codecs="avc1.64001f"</span>
<span style="font-weight: 400;">        bandwidth="2000000"</span>
<span style="font-weight: 400;">        width="1280"</span>
<span style="font-weight: 400;">        height="720"&gt;</span>
<span style="font-weight: 400;">        &lt;SegmentTemplate</span>
<span style="font-weight: 400;">          timescale="1000000"</span>
<span style="font-weight: 400;">          duration="2000000"</span>
<span style="font-weight: 400;">          <strong>availabilityTimeOffset="1.967"</strong></span>
<span style="font-weight: 400;">          initialization="1566277203/init-stream$RepresentationID$.m4s"</span>
<span style="font-weight: 400;">          media="1566277203/chunk-stream_t_$RepresentationID$-$Number%05d$.m4s"</span>
<span style="font-weight: 400;">          startNumber="1"&gt;</span>
<span style="font-weight: 400;">        &lt;/SegmentTemplate&gt;</span>
<span style="font-weight: 400;">      &lt;/Representation&gt;</span>
<span style="font-weight: 400;">    &lt;/AdaptationSet&gt;</span>
<span style="font-weight: 400;">  &lt;/Period&gt;</span>
<span style="font-weight: 400;">&lt;/MPD&gt;</span></pre>
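<p>To illustrate how the availabilityTimeOffset shifts segment availability, here is a minimal Python sketch using the values from the example above. The function names are assumptions for this illustration, and the calculation assumes a constant segment duration and a start number of 1.</p>

```python
from datetime import datetime, timedelta


def availability_time_offset(segment_duration: float, chunk_duration: float) -> float:
    """Offset that makes a segment available as soon as its first chunk is done."""
    return segment_duration - chunk_duration


def adjusted_availability_start(
    ast: datetime,
    segment_duration: float,
    segment_number: int,
    offset: float,
    start_number: int = 1,
) -> datetime:
    """Nominal availability (end of the segment) moved earlier by the offset."""
    completed_segments = segment_number - start_number + 1
    nominal = ast + timedelta(seconds=segment_duration * completed_segments)
    return nominal - timedelta(seconds=offset)
```

<p>With a 2-second segment and a 0.033-second chunk, the second segment is signaled as available roughly 2.033 seconds after the Availability Start Time instead of 4 seconds after it.</p>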
<p><span style="font-weight: 400;">To recap, for low-latency DASH we are mainly doing two things:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Chunked encoding and transfer (i.e. chunked CMAF)</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Signaling early availability of in-progress segments</span></li>
</ul>
<p><span style="font-weight: 400;">While the previous approach enables a basic low-latency DASH setup, there are additional considerations to be made to further optimize and stabilize the streaming experience. </span><a href="https://dashif.org/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">The DASH Industry Forum</span></a><span style="font-weight: 400;"> is working on guidelines for low-latency DASH to be released in the next version of the </span><a href="https://dashif.org/guidelines/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">DASH-IF Interoperability Points (DASH-IF IOP)</span></a><span style="font-weight: 400;">, expected in early July 2020. The corresponding change request can be found </span><a href="https://dashif.org/docs/CR-Low-Latency-Live-r8.pdf" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">here</span></a><span style="font-weight: 400;">. The following sections explain key parts of these guidelines. Please note that some features were not officially finalized and standardized at the time of this post’s publication (June 2020).</span></p>
<h2><span style="font-weight: 400;">Wallclock Time Mapping</span></h2>
<p><span style="font-weight: 400;">To measure latency, a mapping between the media’s presentation time and the wall-clock time is needed, so that the corresponding wall-clock time is known for any given presentation time of the stream. The latency for a given playback position can then be calculated by determining the corresponding wall-clock time and subtracting it from the current wall-clock time.</span><br />
<span style="font-weight: 400;">This mapping can be achieved by specifying a so-called </span><i><span style="font-weight: 400;">Producer Reference Time </span></i><span style="font-weight: 400;">either in the segments (i.e. inband as a </span><span style="font-weight: 400;">prft</span><span style="font-weight: 400;"> box) or in the MPD. It essentially specifies the wall-clock time at which the respective segment/chunk was produced, as in the following MPD excerpt:</span></p>
<pre><span style="font-weight: 400;">&lt;ProducerReferenceTime</span>
<span style="font-weight: 400;">  id="0"</span>
<span style="font-weight: 400;">  type="encoder"</span>
<span style="font-weight: 400;">  presentationTime="538590000000"</span>
<span style="font-weight: 400;">  wallclockTime="2020-05-19T14:57:45Z"&gt;</span>
<span style="font-weight: 400;">&lt;/ProducerReferenceTime&gt;</span></pre>
<p><span style="font-weight: 400;">The</span><span style="font-weight: 400;"> type </span><span style="font-weight: 400;">attribute</span> <span style="font-weight: 400;">specifies whether the reference time was set by the capturing device or the encoder, which allows for calculation of the </span><i><span style="font-weight: 400;">End-to-End Latency</span></i><span style="font-weight: 400;"> (EEL) or </span><i><span style="font-weight: 400;">Encoder-Display Latency</span></i><span style="font-weight: 400;"> (EDL), respectively.</span></p>
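<p><span style="font-weight: 400;">The latency calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the timescale of 90000 ticks per second is an assumption, since the excerpt above does not show the timescale that applies to presentationTime.</span></p>

```python
from datetime import datetime, timedelta, timezone


def wallclock_for(presentation_time, prft_presentation_time, prft_wallclock, timescale):
    """Map a presentation time (in timescale ticks) to the wall-clock time it was produced."""
    elapsed = timedelta(seconds=(presentation_time - prft_presentation_time) / timescale)
    return prft_wallclock + elapsed


def latency_seconds(now, presentation_time, prft_presentation_time, prft_wallclock, timescale):
    """Latency = current wall-clock time minus the wall-clock time of the played position."""
    produced_at = wallclock_for(
        presentation_time, prft_presentation_time, prft_wallclock, timescale
    )
    return (now - produced_at).total_seconds()


# Values taken from the ProducerReferenceTime example; the timescale is an assumption.
prft_pt = 538590000000
prft_wc = datetime(2020, 5, 19, 14, 57, 45, tzinfo=timezone.utc)
timescale = 90000

# Hypothetical playback state: the player shows media produced 2 s after the
# reference point, while the (synchronized) client clock is 5 s past it.
playing_pt = prft_pt + 2 * timescale
now = prft_wc + timedelta(seconds=5)

print(latency_seconds(now, playing_pt, prft_pt, prft_wc, timescale))  # 3.0
```

<p><span style="font-weight: 400;">Whether the result is an end-to-end or an encoder-display latency depends on the type of the reference time used.</span></p>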
<h2><span style="font-weight: 400;">Client Time Synchronization</span></h2>
<p><span style="font-weight: 400;">A precise time/clock at the playback client is necessary for calculations that involve the client’s wall-clock time, such as segment availability calculations and latency calculations. It is recommended for the MPD to include a </span><span style="font-weight: 400;">UTCTiming</span><span style="font-weight: 400;"> element which specifies a time source that can be used to adjust for any drift of the client clock, as in the following example:</span></p>
<pre><span style="font-weight: 400;">&lt;UTCTiming</span>
<span style="font-weight: 400;">  schemeIdUri="urn:mpeg:dash:utc:http-iso:2014"</span>
<span style="font-weight: 400;">  value="https://time.akamai.com/?iso"</span>
<span style="font-weight: 400;">/&gt;</span></pre>
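<p><span style="font-weight: 400;">As a rough sketch of how a client might use such a time source, the following Python snippet estimates the offset between the server clock and the local clock from a single response. The function names are hypothetical, the HTTP request itself is left out, and the network delay is assumed to be symmetric; real players may use more elaborate strategies.</span></p>

```python
from datetime import datetime, timedelta, timezone


def parse_iso_utc(text):
    """Parse an ISO 8601 UTC timestamp such as '2020-05-19T14:57:45Z'."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def clock_offset(server_time_text, local_send, local_receive):
    """Estimate (server clock - local clock), assuming symmetric network delay."""
    server_time = parse_iso_utc(server_time_text)
    # The server most likely sampled its clock halfway through the round trip.
    local_midpoint = local_send + (local_receive - local_send) / 2
    return server_time - local_midpoint


# Hypothetical measurement: request sent and response received on the local clock.
sent = datetime(2020, 5, 19, 14, 57, 44, 0, tzinfo=timezone.utc)
received = sent + timedelta(milliseconds=200)
offset = clock_offset("2020-05-19T14:57:45Z", sent, received)

print(offset.total_seconds())  # 0.9

# Any later local timestamp can be corrected by adding the offset.
synchronized_now = received + offset
```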
<h2><span style="font-weight: 400;">Low Latency Service Description</span></h2>
<p><span style="font-weight: 400;">A </span><span style="font-weight: 400;">ServiceDescription</span><span style="font-weight: 400;"> element should be used to specify the service provider’s desired target latency and minimum/maximum latency boundaries in milliseconds. Furthermore, playback rate boundaries may be specified that define the allowed range for playback acceleration/deceleration by the playout client to fulfill the latency requirements.</span></p>
<pre><span style="font-weight: 400;">&lt;ServiceDescription id="0"&gt;</span>
<span style="font-weight: 400;">  &lt;Latency target="3500" min="2000" max="10000" referenceId="0"/&gt;</span>
<span style="font-weight: 400;">  &lt;PlaybackRate min="0.9" max="1.1"/&gt;</span>
<span style="font-weight: 400;">&lt;/ServiceDescription&gt;</span></pre>
<p><span style="font-weight: 400;">In most player implementations such parameters are provided externally using configurations and APIs.</span></p>
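<p><span style="font-weight: 400;">To illustrate how a player could act on these values, here is a minimal sketch of a catch-up controller in Python. It is a simple proportional rule chosen for illustration, not the algorithm of any particular player; the function name and the gain value are assumptions.</span></p>

```python
def catchup_playback_rate(latency_ms, target_ms, min_rate, max_rate, gain=0.0002):
    """Speed up when behind the target latency, slow down when ahead of it.

    The rate changes proportionally to the latency error (gain is an assumed
    tuning value) and is clamped to the allowed PlaybackRate range.
    """
    rate = 1.0 + gain * (latency_ms - target_ms)
    return max(min_rate, min(max_rate, rate))


# Values from the ServiceDescription example: target 3500 ms, rates 0.9 to 1.1.
print(catchup_playback_rate(3500, 3500, 0.9, 1.1))  # 1.0 (on target)
print(catchup_playback_rate(6000, 3500, 0.9, 1.1))  # 1.1 (too far behind, clamped)
print(catchup_playback_rate(2000, 3500, 0.9, 1.1))  # 0.9 (too close to live, clamped)
```

<p><span style="font-weight: 400;">Keeping the rate within a narrow range such as 0.9 to 1.1 means the adjustment is hardly noticeable to viewers, at the cost of catching up more slowly.</span></p>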
<h2><span style="font-weight: 400;">Resynchronization Points</span></h2>
<p><span style="font-weight: 400;">The previous post pointed out that chunked delivery decouples the achievable latency from the segment durations and enables us to choose relatively long segment durations to maintain good video encoding efficiency. In turn, this prevents fast quality adaptation of the player as quality switching can only be done on segment boundaries. In a low-latency scenario with low buffer levels, fast adaptation &#8212; especially down-switching &#8212; would be desirable to avoid buffer underruns and consequently playback interruptions.</span><br />
<span style="font-weight: 400;">To that end, </span><span style="font-weight: 400;">Resync</span><span style="font-weight: 400;"> elements may be used that specify segment properties like chunk duration and chunk size. Playback clients can utilize them to locate resync points and:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Join streams mid-segment, based on latency requirements</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Switch representations mid-segment</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Resynchronize at mid-segment position after buffer underruns</span></li>
</ul>
<p><span style="font-weight: 400;">The above was a glimpse of what to expect in the near future. It shows the great effort the media industry has put into kick-starting low-latency streaming with MPEG-DASH and getting it ready for production services. </span><br />
<span style="font-weight: 400;">Want to learn more? Check out Part 3: <a href="https://bitmovin.com/live-low-latency-hls/">Video Tech Deep-Dive: Live Low Latency Streaming Part 3 – Low-Latency HLS</a> </span><br />
<span style="font-weight: 400;">&#8230; or take a look at some of the supporting documentation below:</span><br />
<span style="font-weight: 400;">[Tool] </span><a href="https://conformance.dashif.org/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">DASH-IF Conformance Tool</span></a><span style="font-weight: 400;"> </span><br />
<span style="font-weight: 400;">[Blog Post] </span><a href="https://bitmovin.com/live-low-latency-streaming-p1/"><span style="font-weight: 400;">Video Tech Deep-Dive: Live Low Latency Streaming Part 1</span></a><span style="font-weight: 400;"> </span><br />
<span style="font-weight: 400;">[Demo] </span><a href="https://bitmovin.com/demos/low-latency-streaming"><span style="font-weight: 400;">Low Latency Streaming with Bitmovin’s Player</span></a><span style="font-weight: 400;"> </span></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/live-low-latency-streaming-p2">Video Tech Deep-Dive: Live Low Latency Streaming Part 2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Video Tech Deep Dive: Super-Resolution with Machine Learning P1</title>
		<link>https://bitmovin.com/super-resolution-machine-learning-p1</link>
		
		<dc:creator><![CDATA[Jameson Steiner]]></dc:creator>
		<pubDate>Wed, 20 May 2020 07:16:20 +0000</pubDate>
				<category><![CDATA[VidTech]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=114684</guid>

					<description><![CDATA[<p>Super-Resolution: What&#8217;s the buzz and why does it matter? Super-resolution has been gaining steam recently. Many companies, secretly and not-so-secretly, have been incorporating this technology into their workflow. Most notably:  Samsung has started advertising this feature in their latest flagship phone camera’s &#8211; boasting 64MP cameras that use super-resolution for zooming in during the photo...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-machine-learning-p1">Video Tech Deep Dive: Super-Resolution with Machine Learning P1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2 style="text-align: center;">Super-Resolution: What&#8217;s the buzz and why does it matter?</h2>
<p><span style="font-weight: 400;">Super-resolution has been gaining steam recently. Many companies, secretly and not-so-secretly, have been </span><a href="https://www.androidauthority.com/super-resolution-smartphones-camera-995903/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">incorporating this technology</span></a><span style="font-weight: 400;"> into their workflow. Most notably: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Samsung has started advertising this feature in their </span><a href="https://news.samsung.com/global/galaxy-s20-lights-a-new-path-for-photography-with-high-resolution-sensors-and-space-zoom-technology" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">latest flagship phone cameras</span></a><span style="font-weight: 400;"> &#8211; boasting 64MP cameras that use super-resolution for zooming in during the photo capture process </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Other </span><a href="https://thenextweb.com/artificial-intelligence/2020/02/05/watch-ai-developer-upscales-famous-1895-train-scene-to-4k-at-60-fps/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">upstart companies</span></a><span style="font-weight: 400;"> are exploiting super-resolution to “upsample” videos and bring footage that was captured more than a century ago back to life.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Image editing applications like Pixelmator Pro are </span><a href="https://www.theverge.com/2019/12/17/21025811/ai-super-resolution-zoom-enhance-pixelmator-pro" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">using this feature</span></a><span style="font-weight: 400;"> to provide an enhanced end-user experience.</span></li>
</ul>
<figure id="attachment_115047" aria-describedby="caption-attachment-115047" style="width: 1003px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-115047 size-full" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.002-e1590142948659.jpeg" alt="- Bitmovin" width="1003" height="701" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.002-e1590142948659-300x210.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.002-e1590142948659.jpeg?size=384x268&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.002-e1590142948659-768x537.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.002-e1590142948659.jpeg?lossy=2&amp;strip=1&amp;webp=1 1003w" sizes="(max-width: 1003px) 100vw, 1003px" /><figcaption id="caption-attachment-115047" class="wp-caption-text">Companies, large and small, have been incorporating super-resolution into their products.</figcaption></figure>
<p><span style="font-weight: 400;">Although the idea of super-resolution has been around for quite some time, its recent resurgence in media applications has been driven primarily by advances in Machine Learning (ML). In the age of 4K and 8K quality content, super-resolution is an increasingly relevant topic in the field of video and will only continue to grow.</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">So in this series, I will try to shed some light on: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">What is ML-based super-resolution?</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Why is super-resolution so enticing for video companies? And what are its advantages?</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Why does super-resolution matter for you, and how can you incorporate it within your own video workflows?  </span></li>
</ul>
<hr />
<h2><span style="font-weight: 400;">What is Video Sampling?</span></h2>
<p><span style="font-weight: 400;">Before jumping directly into the deep end, let&#8217;s get some basics in order, starting from simple digital signals and building all the way up to ML-based super-resolution. </span></p>
<h3><span style="font-weight: 400;">Digital Signals</span></h3>
<p><span style="font-weight: 400;">Videos are sequences of images, and an image is essentially a two-dimensional digital signal. Processing this digital signal is of utmost importance for most electronic devices, especially if you are concerned with video transmission and picture quality.</span></p>
<h3><span style="font-weight: 400;">Resampling digital signals</span></h3>
<p><span style="font-weight: 400;">Digital signals almost always require some post-processing. One commonly applied post-processing step is known as “resampling”. Resampling means changing the sampling rate of a digital signal or, in simple words, changing the number of samples in the signal for a given time duration. </span><br />
<span style="font-weight: 400;">Within resampling, you can do one of two things:</span><br />
<strong>Upsample</strong></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">To predict new information from the pre-existing information. In other words, you are increasing the number of samples in a given time. This is sometimes also known as interpolation.</span></li>
</ul>
<p><strong>Downsample</strong></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">To throw away existing information. In other words, you are decreasing the number of samples in a given time. </span></li>
</ul>
<p><span style="font-weight: 400;">This idea is depicted in the following figure.</span></p>
<figure id="attachment_115048" aria-describedby="caption-attachment-115048" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-115048 size-full" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.005.jpeg" alt="- Bitmovin" width="1024" height="768" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.005-300x225.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.005.jpeg?size=384x288&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.005-768x576.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.005.jpeg?lossy=2&amp;strip=1&amp;webp=1 1024w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-115048" class="wp-caption-text">A digital source signal can be resampled in two ways: upsampled or downsampled.</figcaption></figure>
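<p><span style="font-weight: 400;">For a one-dimensional signal, both operations can be sketched in a few lines of Python. The function names are made up for this example, and linear interpolation is only one of many possible ways to predict the new samples. A production downsampler would usually also low-pass filter the signal first; that step is left out here for brevity.</span></p>

```python
def downsample(samples, factor):
    """Keep every `factor`-th sample and throw the rest away."""
    return samples[::factor]


def upsample_linear(samples, factor):
    """Predict factor-1 new samples between each pair of neighbours (non-empty input)."""
    out = []
    for left, right in zip(samples, samples[1:]):
        for step in range(factor):
            # step 0 reproduces the existing sample, later steps are predictions
            out.append(left + (right - left) * step / factor)
    out.append(samples[-1])
    return out


signal = [0.0, 2.0, 4.0, 2.0]
print(downsample(signal, 2))       # [0.0, 4.0]
print(upsample_linear(signal, 2))  # [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
```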
<h3><span style="font-weight: 400;">Resampling video</span></h3>
<p><span style="font-weight: 400;">As explained earlier, video is nothing but a digital signal, and there is usually a need to resample this digital video signal (we will look at some practical examples below). Since super-resolution concerns itself only with video upsampling, the rest of the series will focus on video upsampling alone.</span><br />
<span style="font-weight: 400;">To reiterate, video upsampling is the process of predicting new video samples from pre-existing video samples. </span></p>
<hr />
<h2><span style="font-weight: 400;">Why Upsample?</span></h2>
<p><b>Is there a need to upsample videos? And more importantly, is there a business opportunity behind it? </b><br />
<span style="font-weight: 400;">Let&#8217;s look at some real-world use cases and the types of upsampling to explain its relevance towards modern-day media. There are two primary types of upsampling: Temporal and Spatial.</span></p>
<figure id="attachment_115049" aria-describedby="caption-attachment-115049" style="width: 988px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-115049 size-full" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.008-e1590143082449.jpeg" alt="- Bitmovin" width="988" height="358" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.008-e1590143082449-300x109.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.008-e1590143082449.jpeg?size=384x139&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.008-e1590143082449-768x278.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.008-e1590143082449.jpeg?lossy=2&amp;strip=1&amp;webp=1 988w" sizes="(max-width: 988px) 100vw, 988px" /><figcaption id="caption-attachment-115049" class="wp-caption-text">Temporal vs Spatial Upsampling</figcaption></figure>
<h3><span style="font-weight: 400;">Temporal Upsampling </span></h3>
<p><span style="font-weight: 400;">Temporal upsampling is the process of predicting video information across the time dimension using pre-existing information. This is best illustrated by the iconic film series </span><i><span style="font-weight: 400;">The Matrix</span></i><span style="font-weight: 400;">. If you are old enough to remember the famous “Neo vs Agent Smith Fight” scene from the movie </span><i><span style="font-weight: 400;">Matrix Reloaded</span></i><span style="font-weight: 400;">, you’ll know that this movie was released in 2003. One of the fascinating aspects of the scene is that it alternates between 12000 frames per second (fps), which is super-slow-motion, and 24 fps, which is normal speed.</span></p>
<p style="text-align: center;"><iframe loading="lazy" src="https://gfycat.com/ifr/SeveralImpassionedCarp" width="640" height="404" frameborder="0" scrolling="no" allowfullscreen="allowfullscreen"></iframe></p>
<p><span style="font-weight: 400;">In 2003, filmmakers certainly did not have a camera that could shoot at 12000 fps; cinema footage was typically shot at 24 fps. So, the filmmakers had to do </span><a href="https://en.wikipedia.org/wiki/Bullet_time" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">sophisticated interpolation</span></a><span style="font-weight: 400;"> to obtain the 12000 frames (per second) from the pre-existing 24 frames (per second). In other words, they predicted digital samples across the temporal dimension. </span></p>
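<p><span style="font-weight: 400;">To make the idea of predicting frames concrete, here is a deliberately naive Python sketch that creates in-between frames by blending two neighboring frames. This is not the technique used in the film, where far more sophisticated interpolation was applied; the function name and the tiny three-pixel frames are purely illustrative.</span></p>

```python
def interpolate_frames(frame_a, frame_b, count):
    """Return `count` in-between frames, each a weighted blend of frame_a and frame_b."""
    frames = []
    for i in range(1, count + 1):
        t = i / (count + 1)  # position of the new frame between a (0.0) and b (1.0)
        frames.append([(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)])
    return frames


# Two tiny greyscale "frames" with three pixels each.
first = [0.0, 100.0, 200.0]
second = [100.0, 100.0, 0.0]

print(interpolate_frames(first, second, 1))  # [[50.0, 100.0, 100.0]]

# Going from 24 fps to 12000 fps means 500 output frames per source frame,
# i.e. 499 predicted frames between every pair of captured frames.
print(len(interpolate_frames(first, second, 499)))  # 499
```

<p><span style="font-weight: 400;">A plain blend like this produces ghosting on moving objects, which is why practical temporal upsampling estimates motion between frames instead.</span></p>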
<h3><span style="font-weight: 400;">Spatial Upsampling</span></h3>
<p><span style="font-weight: 400;">On the other hand, spatial upsampling is the process of predicting information across </span><i><span style="font-weight: 400;">the spatial </span></i><span style="font-weight: 400;">dimension.</span></p>
<figure style="width: 904px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" src="https://media.giphy.com/media/XHGg6dgOvOQqKYeL0k/giphy.gif" alt="- Bitmovin" width="904" height="542"><figcaption class="wp-caption-text">Classic movies shot in low resolution have to be resampled to a higher resolution.</figcaption></figure>
<p><span style="font-weight: 400;">Now imagine that you have an old catalog of classic movies that you want to enjoy on your crisp new 4K TV. The classic movies were (expectedly) not shot in 4K resolution, so converting low-resolution movies, say 360p, to a higher 4K resolution requires spatial upsampling. You need spatial upsampling to go from 172800 pixels (480×360, i.e. 360p at a 4:3 aspect ratio) to 8294400 pixels (3840×2160, i.e. 4K UHD). In other words, you predict digital information across the spatial dimensions.</span><br />
<span style="font-weight: 400;">So, to answer the original questions from the beginning: </span></p>
<ul>
<li><span style="font-weight: 400;"><strong>YES!</strong> there is a <strong>need to upsample videos</strong>.</span></li>
<li><span style="font-weight: 400;">And, an even more emphatic, <strong>YES!</strong> there is <strong>a huge business opportunity</strong> behind it.</span></li>
</ul>
<p><span style="font-weight: 400;">Super-resolution primarily deals with spatial upsampling. Hence, we will restrict our focus for the remainder of the series to spatial upsampling.</span></p>
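<p><span style="font-weight: 400;">The pixel counts in the 360p to 4K example can be checked with a short calculation. The frame sizes are assumptions: 480×360 is the size that matches the 172800 figure, and 3840×2160 is the usual 4K UHD frame size.</span></p>

```python
def pixel_count(width, height):
    """Number of pixels in a single frame."""
    return width * height


source = pixel_count(480, 360)    # assumed 360p frame at 4:3
target = pixel_count(3840, 2160)  # assumed 4K UHD frame

print(source)                      # 172800
print(target)                      # 8294400
print(target / source)             # 48.0
print(round(1 - source / target, 3))  # 0.979
```

<p><span style="font-weight: 400;">In other words, under these assumptions roughly 98% of the pixels in every upsampled frame have to be predicted.</span></p>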
<hr />
<h2><span style="font-weight: 400;">Spatial Upsampling in Video</span></h2>
<p><span style="font-weight: 400;">You might already be familiar with some well-known methods to spatially upsample videos; the most common ones are </span><a href="https://en.wikipedia.org/wiki/Bilinear_interpolation" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">bilinear</span></a><span style="font-weight: 400;">, </span><a href="https://en.wikipedia.org/wiki/Bicubic_interpolation" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">bicubic</span></a><span style="font-weight: 400;">, and </span><a href="https://en.wikipedia.org/wiki/Lanczos_resampling" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Lanczos</span></a><span style="font-weight: 400;"> interpolation. Essentially, the idea behind all of these methods is that they use a </span><b>single predefined mathematical function</b><span style="font-weight: 400;"> to predict new digital video samples from the pre-existing ones.</span></p>
<figure id="attachment_114700" aria-describedby="caption-attachment-114700" style="width: 1049px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-114700 " src="https://bitmovin.com/wp-content/uploads/2020/05/Super-resolution-mathematical-function-spatial-upsampling-e1590143165338-1.jpg" alt="Super resolution-mathematical function spatial upsampling-illustrated" width="1049" height="516" /><figcaption id="caption-attachment-114700" class="wp-caption-text">Most commonly used upsampling methods work on the same idea. They use a single predefined mathematical function to interpolate.</figcaption></figure>
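<p><span style="font-weight: 400;">As an example of such a predefined function, here is a minimal pure-Python sketch of bilinear upsampling for a small greyscale image stored as a list of rows. The function name and the corner-aligned mapping between output and input coordinates are choices made for this illustration; image libraries offer several variants.</span></p>

```python
def bilinear_upsample(image, out_h, out_w):
    """Resize a 2D grid of grey values using bilinear interpolation."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for y in range(out_h):
        # Map the output row to a fractional source row (corners aligned).
        src_y = y * (in_h - 1) / max(out_h - 1, 1)
        y0 = int(src_y)
        y1 = min(y0 + 1, in_h - 1)
        wy = src_y - y0
        row = []
        for x in range(out_w):
            src_x = x * (in_w - 1) / max(out_w - 1, 1)
            x0 = int(src_x)
            x1 = min(x0 + 1, in_w - 1)
            wx = src_x - x0
            # Interpolate horizontally on both rows, then vertically between them.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        result.append(row)
    return result


small = [[0.0, 10.0],
         [20.0, 30.0]]
big = bilinear_upsample(small, 3, 3)
print(big)  # [[0.0, 5.0, 10.0], [10.0, 15.0, 20.0], [20.0, 25.0, 30.0]]
```

<p><span style="font-weight: 400;">Note that every new pixel is computed by the same fixed formula from its four nearest neighbors, regardless of what the image actually shows.</span></p>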
<p><span style="font-weight: 400;">The emphasis on “single predefined mathematical function” is important. This is a key point that we will revisit later. </span><br />
<span style="font-weight: 400;">Now, in a similar vein, super-resolution is a class of techniques to perform upsampling of videos. There are several flavors of super-resolution. But they are based on the same core principle: they use information from several images (typically neighboring images in video) to spatially upsample a single low-resolution image to a high-resolution image.</span></p>
<figure id="attachment_115052" aria-describedby="caption-attachment-115052" style="width: 956px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-115052" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660.jpeg" alt="Super-Resolution_Spatial Upsampling with Mathematical functions_Illustrations" width="956" height="521" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660.jpeg?size=191x104&amp;lossy=2&amp;strip=1&amp;webp=1 191w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660-300x164.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660.jpeg?size=382x208&amp;lossy=2&amp;strip=1&amp;webp=1 382w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660.jpeg?size=573x312&amp;lossy=2&amp;strip=1&amp;webp=1 573w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.006-1-e1590143447660.jpeg?lossy=2&amp;strip=1&amp;webp=1 765w" sizes="(max-width: 956px) 100vw, 956px" /><figcaption id="caption-attachment-115052" class="wp-caption-text">Super-resolution uses information from several low-resolution images to interpolate a single low-resolution image to a high-resolution image.</figcaption></figure>
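<p><span style="font-weight: 400;">The benefit of combining several images can be shown with an idealized one-dimensional toy example in Python. It assumes that each low-resolution frame samples the same scene with a known shift of exactly one sample, so the frames can simply be interleaved. Real super-resolution has to estimate motion and cope with blur and noise, so this is only a sketch of the principle, with hypothetical function names.</span></p>

```python
def capture_low_res(scene, offset, factor):
    """Simulate one low-resolution frame: every `factor`-th sample, starting at `offset`."""
    return scene[offset::factor]


def fuse_frames(frames, factor):
    """Interleave `factor` low-res frames (frame k is shifted by k samples) into one signal."""
    length = sum(len(frame) for frame in frames)
    fused = [None] * length
    for offset, frame in enumerate(frames):
        fused[offset::factor] = frame
    return fused


scene = [1, 9, 3, 7, 5, 5, 7, 3]          # the "true" high-resolution signal
frame_0 = capture_low_res(scene, 0, 2)    # [1, 3, 5, 7]
frame_1 = capture_low_res(scene, 1, 2)    # [9, 7, 5, 3]

print(fuse_frames([frame_0, frame_1], 2))  # [1, 9, 3, 7, 5, 5, 7, 3]
```

<p><span style="font-weight: 400;">Interpolating frame_0 alone could never recover the values 9 and 7, because that detail is simply not present in a single frame. It only becomes available once the second frame is taken into account.</span></p>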
<p><span style="font-weight: 400;">Note that in contrast to the spatial upsampling methods mentioned before, super-resolution uses information from several neighboring images to interpolate a single low-resolution image. Because it combines multiple information sources, it is able to interpolate the image better than the methods mentioned above. This is one of the reasons why super-resolution is already widely applied by companies such as </span><a href="https://www.amd.com/en/technologies/vsr" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">AMD</span></a><span style="font-weight: 400;"> and </span><a href="https://www.nvidia.com/en-us/geforce/news/dynamic-super-resolution-instantly-improves-your-games-with-4k-quality-graphics/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">NVIDIA</span></a><span style="font-weight: 400;"> to render video games at high resolutions (4K). </span><br />
<span style="font-weight: 400;">Going forward, we will focus on super-resolution applications in a typical video workflow, especially ML-based super-resolution, and discover the benefits it offers over conventional methods in video workflows. </span><br />
<strong>Why ML-based super-resolution and why is it so much better? We will answer this in the next posts of this series. So stay tuned!</strong></p>
<hr />
<h2><span style="font-weight: 400;">Conclusion</span></h2>
<p><span style="font-weight: 400;">The following figure summarizes everything we learned in this blog post. </span></p>
<figure id="attachment_115031" aria-describedby="caption-attachment-115031" style="width: 801px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-full wp-image-115031" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.001.jpeg" alt="- Bitmovin" width="801" height="691" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001-300x259.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001.jpeg?size=384x331&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001-768x663.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001.jpeg?lossy=2&amp;strip=1&amp;webp=1 801w" sizes="(max-width: 801px) 100vw, 801px" /><figcaption id="caption-attachment-115031" class="wp-caption-text">The focus of this series of blog posts will be on machine learning-based super-resolution.</figcaption></figure>
<p><span style="font-weight: 400;">There is a big business opportunity behind upsampling videos, especially spatial upsampling of videos. Super-resolution is a class of techniques to spatially upsample video. Broadly, super-resolution can be divided into two categories: machine learning-based and non-machine learning-based. This blog series will focus on machine learning-based super-resolution and the benefits it offers in video workflows.</span><br />
Did you enjoy this post? Want to learn more?<br />
Check out part two of the Super-Resolution series: <a href="https://bitmovin.com/super-resolution-machine-learning-p2/"><span style="font-weight: 400;">Super-Resolution with Machine Learning P2</span></a><span style="font-weight: 400;"> </span><br />
Check out part three: <a href="https://bitmovin.com/super-resolution-deployments-machine-learning-p3/">Practical Super-Resolution Deployments and Ensuing Results</a></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-machine-learning-p1">Video Tech Deep Dive: Super-Resolution with Machine Learning P1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Video Tech Deep-Dive: Live Low Latency Streaming Part 1</title>
		<link>https://bitmovin.com/live-low-latency-streaming-p1</link>
		
		<dc:creator><![CDATA[Jameson Steiner]]></dc:creator>
		<pubDate>Wed, 22 Apr 2020 13:49:39 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[live encoding]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=112080</guid>

					<description><![CDATA[<p>What is Live Low Latency? Low Latency in live streaming is the time delay between an event’s content being captured at one end of the media delivery chain and played out to a user at the other end. Consider a goal scored at a football game: Live latency is the delay in time between the...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/live-low-latency-streaming-p1">Video Tech Deep-Dive: Live Low Latency Streaming Part 1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2><span style="font-weight: 400;">What is Live Low Latency?</span></h2>
<p><span style="font-weight: 400;">Latency in live streaming is the time delay between an event’s content being captured at one end of the media delivery chain and being played out to a user at the other end. Consider a goal scored at a football game: live latency is the delay between the moment a goal is scored and captured by a camera and the moment that a viewer sees the goal on their own device. There are a few different terms that effectively describe the same thing: end-to-end latency, hand-waving latency, or glass-to-glass latency.</span><br />
<figure id="attachment_112086" aria-describedby="caption-attachment-112086" style="width: 800px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-full wp-image-112086" src="https://bitmovin.com/wp-content/uploads/2020/04/End-to-end-video-encoding.jpg" alt="End-to-end video encoding workflow illustrated" width="800" height="220" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/End-to-end-video-encoding-300x83.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/End-to-end-video-encoding.jpg?size=384x106&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/End-to-end-video-encoding-768x211.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/End-to-end-video-encoding.jpg?lossy=2&amp;strip=1&amp;webp=1 800w" sizes="(max-width: 800px) 100vw, 800px" /><figcaption id="caption-attachment-112086" class="wp-caption-text">End-to-end video encoding workflow (where latency matters)</figcaption></figure><br />
<span style="font-weight: 400;">In our </span><a href="https://bitmovin.com/bitmovin-2019-video-developer-report-av1-codec-ai-machine-learning-low-latency/"><span style="font-weight: 400;">most recent developer report</span></a><span style="font-weight: 400;">, low latency was identified as one of the biggest challenges for the media industry. This blog series will take an in-depth look into why that’s the case. Welcome to our Live Latency Deep Dive series!</span><br />
<img loading="lazy" decoding="async" class="size-large wp-image-112087 aligncenter" src="https://bitmovin.com/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph--1010x1024.png" alt="Low-Latency-Dev-Report-Graph" width="1010" height="1024" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph--296x300.png?lossy=2&amp;strip=1&amp;webp=1 296w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph-.png?size=384x389&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph--768x779.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph--1010x1024.png?lossy=2&amp;strip=1&amp;webp=1 1010w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph--1514x1536.png?lossy=2&amp;strip=1&amp;webp=1 1514w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Low-Latency-Dev-Report-Graph-.jpg?lossy=2&amp;strip=1&amp;webp=1 1546w" sizes="(max-width: 1010px) 100vw, 1010px" /></p>
<h2><span style="font-weight: 400;">Why care about Low Latency? </span></h2>
<p><span style="font-weight: 400;">Most use cases where live latency is crucial can be categorized into the following:</span></p>
<h3><span style="font-weight: 400;">Live content delivered across multiple distribution channels</span></h3>
<p><span style="font-weight: 400;">Over-the-top (OTT) delivery methods like MPEG-DASH and Apple HLS have become the de facto standard for delivering video to audiences on devices such as smartphones, tablets, laptops, and Smart TVs. However, they come with high live latency in comparison to traditional linear broadcast delivery via satellite, terrestrial or cable services. Live network content, like sports or news, drives the need for low live latency as these networks attempt to deliver content simultaneously over various distribution means (e.g. OTT vs. cable). </span><br />
<span style="font-weight: 400;">Picture a scenario where you are streaming your favorite football team playing in the global final, while your neighbor, an equally devoted fan behind incredibly thin walls, watches on traditional linear cable. It’s the final moments of the game, but you hear the neighbor cursing loudly even though your stream shows well over a minute left to play. The thrill is spoiled and you know your team has lost. The need for lower live latency becomes clear: the gap between broadcast and streaming is unacceptable in today’s digital world. But a lot of factors affect how quickly content appears on a viewer’s screen. Aside from infrastructural issues (like not being optimized for low latency), modern streaming experiences may also be affected by additional factors like social media feeds, push notifications, and second-screen experiences running in parallel to the live event.</span></p>
<h3><span style="font-weight: 400;">Interactive live content</span></h3>
<p><span style="font-weight: 400;">Whenever audience interaction is involved, live latency should be as low as possible to ensure a good </span><a href="https://bitmovin.com/qoe-why-quality-video-matters/"><span style="font-weight: 400;">quality of experience (QoE)</span></a><span style="font-weight: 400;">. Such use cases include webinars, auctions, user-generated content where the broadcaster interacts with the audience (e.g. Twitch, Periscope, Facebook Live, etc.) and more. Latency is often measured on a spectrum, where high latency is the least sought after delay, and </span><i><span style="font-weight: 400;">Real-Time</span></i><span style="font-weight: 400;"> is the most sought after. See the Latency Spectrum below (including the latency types, delay time, and streaming formats):</span><br />
<figure id="attachment_112099" aria-describedby="caption-attachment-112099" style="width: 800px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-full wp-image-112099" src="https://bitmovin.com/wp-content/uploads/2020/04/Live-Low-Latency-Deep-Dive_Latency-Spectrum-graph-1.jpg" alt="Live Low Latency Deep Dive_Latency Spectrum-graph" width="800" height="484" /><figcaption id="caption-attachment-112099" class="wp-caption-text">Latency Spectrum in Video Streaming</figcaption></figure><br />
<span style="font-weight: 400;">The latency spectrum shows that unoptimized OTT delivery accounts for around 30+ seconds of delay while cable broadcast TV clocks in at around 5 seconds &#8211; give or take. Furthermore, sub-second latencies may not be achievable with OTT methods and require other protocols like WebRTC.</span></p>
<h2><span style="font-weight: 400;">Where does live latency come from?</span></h2>
<p><span style="font-weight: 400;">First, a slightly more technical definition of live latency: It’s the time difference between a video frame being captured and the moment it’s presented on the playback client. In other words, it’s the time that a video frame spends in the media processing and delivery chain. Every component in the chain introduces a certain amount of latency, and these contributions accumulate to what is considered live latency. </span><br />
<span style="font-weight: 400;">Let’s have a look at the main sources of live latency:</span></p>
<h3><span style="font-weight: 400;">Buffering ahead for playback stability at the player-level</span></h3>
<p><figure id="attachment_112100" aria-describedby="caption-attachment-112100" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-112100 size-large" src="https://bitmovin.com/wp-content/uploads/2020/04/Low-latency-Livestream-timeline-illustrated-1-1024x492.png" alt="Low-latency-Livestream-timeline-illustrated" width="1024" height="492" /><figcaption id="caption-attachment-112100" class="wp-caption-text">Live stream timeline</figcaption></figure><br />
<span style="font-weight: 400;">A video player will aim to maintain a pre-defined amount of buffered data ahead of its playback position. A common value is about 30 seconds of buffer loaded ahead at all times during playback. One reason for this is that if network bandwidth drops during playback, there are still 30 seconds of data to play out without interruption. This buys the player time to adapt to the new bandwidth conditions. The buffer level also typically influences bitrate adaptation decisions, as low buffer levels may call for more aggressive downward adaptations.</span><br />
<span style="font-weight: 400;">However, when aiming for 30 seconds of buffer with a live stream, the player must keep its playback position at least 30 seconds behind the live edge (the most recent point) of the stream, which results in a live latency of at least 30 seconds. Conversely, aiming for low latency requires playing closer to the live edge and implies a small buffer. If we aim for 5 seconds of latency, the player has 5 seconds of buffer at most. Thus, a difficult trade-off between latency and playback stability must be made.</span></p>
<h3><span style="font-weight: 400;">Segments are produced, transferred and consumed in their entirety</span></h3>
<p><span style="font-weight: 400;">Live streams are encoded in real-time. This means that if the segment duration is 6 seconds, it takes the encoder 6 seconds to produce one full segment. Additionally, if fragmented MP4 is used as the container format, encoders can only write a segment to the desired storage once it&#8217;s encoded completely, i.e. 6 seconds after starting the encode of the segment. So by the time a segment is transferred to the storage, its oldest frame is already 6 seconds old. On the other side of the delivery chain, the player can only decode an fMP4 segment in its entirety and therefore needs to download a segment fully before it can process it. Network transfers, such as uploading to a CDN origin server, transferring the content within the CDN, and downloading from the CDN edge server to the client, add to the overall latency to a lesser degree.</span><br />
<span style="font-weight: 400;">In summary, the fact that segments are only processed and transferred in their entirety results in latency being correlated directly to segment duration.</span><br />
<figure id="attachment_112109" aria-describedby="caption-attachment-112109" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-112109 size-large" src="https://bitmovin.com/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-1024x478.png" alt="Low Latency Data Segments in the Encoding Workflow Illustrated" width="1024" height="478" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-300x140.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09.png?size=384x179&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-768x358.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-1024x478.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-1536x716.png?lossy=2&amp;strip=1&amp;webp=1 1536w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-14-at-09.14.09-2048x955.png?lossy=2&amp;strip=1&amp;webp=1 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-112109" class="wp-caption-text">Data Segments in the Encoding Workflow</figcaption></figure></p>
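<p><span style="font-weight: 400;">The relationship between segment duration and latency can be put into rough numbers. The following is a minimal sketch, not a measurement model: the function name, the fixed network delay and the idea of counting whole buffered segments are assumptions made for illustration.</span></p>

```python
def segment_latency_floor(segment_duration_s: float,
                          network_transfer_s: float = 0.0,
                          buffered_segments: int = 0) -> float:
    """Rough lower bound on live latency when segments are handled whole.

    Assumptions (illustrative only): the encoder needs one segment duration
    to finish a segment, network transfers add a fixed delay, and the player
    holds `buffered_segments` additional full segments before playing.
    """
    encode_delay = segment_duration_s  # the oldest frame waits for the segment to complete
    buffer_delay = buffered_segments * segment_duration_s
    return encode_delay + network_transfer_s + buffer_delay


if __name__ == "__main__":
    for duration in (6.0, 2.0, 1.0):
        floor = segment_latency_floor(duration, network_transfer_s=0.5, buffered_segments=2)
        print(f"{duration:.0f}s segments: at least {floor:.1f}s behind live")
```

<p><span style="font-weight: 400;">Under these assumptions, every term except the network delay scales with the segment duration, which is why shortening segments is the first idea that comes to mind.</span></p>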
<h2>What can we do?</h2>
<h3>Naive approach: Short segments</h3>
<p><span style="font-weight: 400;">As latency is correlated to segment duration, a simple way to decrease latency would be to use short segments, e.g. 1-second duration. However, this comes with negative side effects such as:</span></p>
<ul>
<li style="font-weight: 400;"><span style="text-decoration: underline;"><span style="font-weight: 400;"><strong>Video coding efficiency suffers</strong></span></span><span style="font-weight: 400;">: </span><span style="font-weight: 400;">The requirement of each video segment starting with a key frame implies having small groups of pictures (GOPs). This in turn, causes the efficiency of differential/predictive coding to suffer. With short segments, you’d have to spend more bits if you’re aiming for the same perceptual quality as longer segments with the same content.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;"><strong><span style="text-decoration: underline;">More network requests</span></strong> and everything negative associated with them, e.g. time to first byte (TTFB) wasted on every request.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;"><strong><span style="text-decoration: underline;">Increased number of segments</span></strong> may decrease CDN caching efficiency.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;"><strong><span style="text-decoration: underline;">Buffer at the player grows</span></strong> in a jumpy fashion which increases the risk of playback stalls due to rebuffering.</span></li>
</ul>
<h3>Chunked encoding and transfer</h3>
<p><span style="font-weight: 400;">To solve the problem of segments being produced and consumed only in their entirety, we can make use of the chunked encoding scheme specified in the </span><a href="https://bitmovin.com/what-is-cmaf-threat-opportunity/"><span style="font-weight: 400;">MPEG-CMAF (Common Media Application Format) standard</span></a><span style="font-weight: 400;">. CMAF defines a container format based on the ISO Base Media File Format (ISO BMFF), similar to the MP4 container format, which is already widely supported by browsers and end devices. Within its chunked encoding feature, CMAF introduces the notion of CMAF chunks. Compared to an “ordinary” fMP4 segment that has its media payload in a single big mdat box, chunked CMAF allows segments to consist of a sequence of CMAF chunks (moof+mdat tuples). In extreme cases, every frame can be put into its own CMAF chunk. This enables the encoder to produce, and the player’s decoder to consume, segments chunk by chunk instead of only as complete segments. Admittedly, the MPEG-TS container format offers similar properties to chunked CMAF, but it’s fading as a format for OTT because it lacks the native device and platform support that fMP4 and CMAF enjoy.</span><br />
<figure id="attachment_112110" aria-describedby="caption-attachment-112110" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-112110 size-large" src="https://bitmovin.com/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-1024x371.png" alt="Low Latency data segments illustrated" width="1024" height="371" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-300x109.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10.png?size=384x139&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-768x278.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-1024x371.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-1536x556.png?lossy=2&amp;strip=1&amp;webp=1 1536w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/04/Screenshot-2020-04-22-at-16.09.10-2048x741.png?lossy=2&amp;strip=1&amp;webp=1 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-112110" class="wp-caption-text">6s fMP4 segment compared to chunked CMAF</figcaption></figure><br />
<span style="font-weight: 400;">Chunked encoding on its own does not decrease latency, but it is a key ingredient. To capitalize on chunked encodes, we need to combine them with HTTP/1.1 chunked transfer encoding (CTE). CTE is a feature of HTTP that allows transferring resources whose size is unknown at the start of the transfer. It does so by transferring resources chunk-wise and signaling the end of a resource with a chunk of length 0. We can utilize CTE at the encoder to write CMAF chunks to the storage as soon as they are produced, without waiting for the encode of the full segment to finish. This enables the player to request (also using CTE) the available CMAF chunks of a segment that is still being encoded and forward them as fast as possible to the decoder for playout, so playback can start as soon as the first CMAF chunk is received.</span><br />
<img loading="lazy" decoding="async" class="size-large wp-image-112103 aligncenter" src="https://bitmovin.com/wp-content/uploads/2020/04/Chunked-CMAF-1-1024x465.png" alt="Chunked CMAF Data Segment in storage illustrated" width="1024" height="465" /></p>
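<p><span style="font-weight: 400;">To make the wire format of CTE more concrete, here is a minimal sketch of how payloads are framed and unframed: each chunk is sent as its length in hexadecimal, CRLF, the data, CRLF, and a chunk of length 0 ends the resource. The function names are hypothetical, and chunk extensions and trailers are deliberately left out.</span></p>

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each payload as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    for part in parts:
        if part:  # a zero-length chunk would signal the end of the resource
            yield f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    yield b"0\r\n\r\n"  # terminating chunk of length 0, no trailers


def decode_chunked(body: bytes) -> list[bytes]:
    """Split a complete chunked body back into its payloads (no extensions, no trailers)."""
    parts, pos = [], 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)
        if size == 0:
            return parts
        start = line_end + 2
        parts.append(body[start:start + size])
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

<p><span style="font-weight: 400;">In this sketch each payload would stand in for one CMAF chunk. In practice, HTTP chunk boundaries are a transport detail and are not guaranteed to line up with CMAF chunk boundaries, so a player should still parse the box structure of the bytes it receives.</span></p>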
<h2><span style="font-weight: 400;">Implications of low latency chunked delivery</span></h2>
<p><span style="font-weight: 400;">… besides enabling low latency:</span></p>
<ul>
<li style="font-weight: 400;"><span style="text-decoration: underline;"><b>Smoother and less jumpy client buffer levels</b></span><span style="font-weight: 400;"> from the constant flow of CMAF chunks received, which lowers the risk of buffer underruns and improves playback stability.</span></li>
<li style="font-weight: 400;"><a href="https://bitmovin.com/importance-video-startup-time/"><b>Faster stream startup</b></a><span style="font-weight: 400;"> (</span><i><span style="font-weight: 400;">time to first frame</span></i><span style="font-weight: 400;">) and seeking at the client, because segments can be partially decoded and played out while they are still downloading.</span></li>
<li style="font-weight: 400;"><span style="text-decoration: underline;"><b>Higher overhead in segment file size</b></span><span style="font-weight: 400;"> compared to non-chunked segments as a result of the additional metadata (moof boxes, mdat headers) introduced with chunked encodes.</span></li>
<li style="font-weight: 400;"><span style="text-decoration: underline;"><b>Low buffer levels</b></span><span style="font-weight: 400;"> at the client impact playback stability. A low live latency implies the client is playing close to the live edge and has a low buffer level. Therefore the longest achievable buffer level is limited by the current live latency. It’s a QoE tradeoff: low latency vs. playback stability.</span></li>
<li style="font-weight: 400;"><b><span style="text-decoration: underline;">Bandwidth estimation for adaptive streaming at the client is hard.</span> <span style="font-weight: 400;">When loading a segment at the bleeding live edge, the download rate will be limited by the source/encoder. As content is produced in real-time it takes, for example, 6 seconds to encode a 6-second long segment. So the download rate/time for segments is no longer limited by networks but by encoders. This causes a problem in bandwidth estimation methods that are currently commonplace in the industry and based on the download duration. The standard formula to calculate bandwidth estimation is:</span></b></li>
</ul>
<blockquote><p><span style="font-weight: 400;">estimatedBW = segmentSize / downloadDuration</span><br />
<span style="font-weight: 400;">E.g.: estimatedBW = 1 MB / 2 s = 0.5 MB/s = 4 Mbit/s</span></p></blockquote>
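<p><span style="font-weight: 400;">As a small worked sketch of this formula, the snippet below computes the estimate once for a regular download and once for a segment loaded at the live edge. The function name and the 3 Mbit/s encoding bitrate in the second case are assumptions chosen purely for illustration.</span></p>

```python
def estimate_bandwidth_mbit(segment_size_bytes: int, download_duration_s: float) -> float:
    """estimatedBW = segmentSize / downloadDuration, in Mbit/s (1 Mbit = 1,000,000 bits)."""
    return segment_size_bytes * 8 / download_duration_s / 1_000_000


# Regular delivery: a finished 1 MB segment arrives in 2 s.
print(estimate_bandwidth_mbit(1_000_000, 2.0))  # 4.0

# Live edge with CTE: a 6 s segment encoded at an assumed 3 Mbit/s (2.25 MB)
# trickles in over roughly 6 s, however fast the network actually is.
print(estimate_bandwidth_mbit(2_250_000, 6.0))  # 3.0
```

<p><span style="font-weight: 400;">In the second case the result simply mirrors the encoding bitrate of the segment, so it says nothing about how much bandwidth the client really has available.</span></p>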
<p><span style="font-weight: 400;">As download duration roughly equals the segment duration when loading at the bleeding live edge using CTE, it can no longer be used to estimate client bandwidth. Bandwidth estimation is a crucial part of any adaptive streaming player and the lack of estimated bandwidth must be addressed. Research for better ways to estimate bandwidth in chunked low-latency delivery scenarios is ongoing in academia and throughout the streaming industry, e.g. </span><a href="https://dl.acm.org/doi/10.1145/3304112.3325611" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">ACTE</span></a><span style="font-weight: 400;">.</span><br />
Did you enjoy this post? Want to learn more? Check out Part two of the Low Latency series: <a href="https://bitmovin.com/live-low-latency-streaming-p2/"><span style="font-weight: 400;">Video Tech Deep-Dive: Live Low Latency Streaming Part 2</span></a><br />
<span style="font-weight: 400;">&#8230;or if you want to jump ahead, take a look at Part three: </span><a href="https://bitmovin.com/live-low-latency-hls/">Video Tech Deep-Dive: Live Low Latency Streaming Part 3 – Low-Latency HLS</a></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/live-low-latency-streaming-p1">Video Tech Deep-Dive: Live Low Latency Streaming Part 1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Fun with Container Formats &#8211; Part 2</title>
		<link>https://bitmovin.com/fun-with-container-formats-2</link>
		
		<dc:creator><![CDATA[Jameson Steiner]]></dc:creator>
		<pubDate>Mon, 01 Jul 2019 21:54:35 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=48439</guid>

					<description><![CDATA[<p>In continuation of our Fun with Container Formats series, this week we’ll be diving into the MP4 and CMAF container formats. If you need a refresher on the terminology or the handling of these formats in your player, please check back to Post 1: Fun with Container Formats MP4 Overview of Standards MPEG-4 Part 14...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/fun-with-container-formats-2">Fun with Container Formats &#8211; Part 2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-48447" src="https://bitmovin.com/wp-content/uploads/2019/07/image1-1.png" alt="- Bitmovin" width="1999" height="1000"><br />
<span style="font-weight: 400;">In continuation of our Fun with Container Formats series, this week we’ll be diving into the <strong>MP4 and CMAF</strong> container formats. If you need a refresher on the terminology or the handling of these formats in your player, please check back to </span><a href="https://bitmovin.com/fun-with-container-formats-1/"><span style="font-weight: 400;">Post 1: Fun with Container Formats</span></a></p>
<h2><span style="font-weight: 400;">MP4</span></h2>
<p><b>Overview of Standards</b><br />
<span style="font-weight: 400;">MPEG-4 Part 14 (MP4) is one of the most commonly used container formats and usually has the .mp4 file extension. It is used for Dynamic Adaptive Streaming over HTTP (DASH) and can also be used for Apple’s HLS streaming. </span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">MP4 is based on the ISO Base Media File Format (</span><span style="font-weight: 400;">MPEG-4 Part 12), which is based on the QuickTime File Format. MPEG stands for Moving Picture Experts Group and is a joint working group of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). MPEG was formed to set standards for audio and video compression and transmission. MPEG-4 specifies the coding of audio-visual objects. </span><br />
<span style="font-weight: 400;">MP4 supports a wide range of codecs. The most commonly used video codecs are H.264 and HEVC. AAC, the most commonly used audio codec, was designed as the successor of the famous MP3 audio codec.</span><br />
<img loading="lazy" decoding="async" class="alignnone size-full wp-image-48443" src="https://bitmovin.com/wp-content/uploads/2019/07/image5.png" alt="- Bitmovin" width="854" height="429" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image5-300x151.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image5.png?size=384x193&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image5-768x386.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image5.png?lossy=2&amp;strip=1&amp;webp=1 854w" sizes="(max-width: 854px) 100vw, 854px" /><br />
<b>ISO Base Media File Format</b><br />
<span style="font-weight: 400;">ISO Base Media File Format (ISOBMFF, MPEG-4 Part 12) is the base of the MP4 container format. ISOBMFF is a standard that defines time-based multimedia files. Time-based multimedia usually refers to audio and video, often delivered as a steady stream. The format is designed to be flexible and easy to extend. It enables interchange, management, editing and presentation of multimedia data.  </span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">The basic building blocks of ISOBMFF are boxes, which are also called atoms. The standard defines boxes using classes and an object-oriented approach. Through inheritance, all boxes extend the base class Box and are made specific to their purpose by adding new class properties. </span><br />
<span style="font-weight: 400;">The base class:</span><br />
<span style="font-weight: 400;">  Example FileTypeBox:</span><span style="font-weight: 400;"><br />
</span><br />
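<span style="font-weight: 400;">As a hedged illustration of these two classes, the following Python sketch models the fields that ISOBMFF defines for Box (size, type, optional 64-bit largesize) and FileTypeBox (major_brand, minor_version, compatible_brands). The Python class and function names are assumptions made for this example and are not part of the standard; the uuid extended type is left out.</span><br />

```python
import struct
from dataclasses import dataclass


@dataclass
class Box:
    size: int       # total box size in bytes, header included
    type: str       # four-character code, e.g. "ftyp"
    payload: bytes  # everything after the header


@dataclass
class FileTypeBox:
    major_brand: str
    minor_version: int
    compatible_brands: list[str]


def parse_box(data: bytes, offset: int = 0) -> Box:
    """Read one box header: 32-bit size, 4-byte type, optional 64-bit largesize."""
    size, box_type = struct.unpack_from(">I4s", data, offset)
    header = 8
    if size == 1:    # a 64-bit largesize follows the type field
        (size,) = struct.unpack_from(">Q", data, offset + 8)
        header = 16
    elif size == 0:  # the box extends to the end of the data
        size = len(data) - offset
    return Box(size, box_type.decode("latin-1"), data[offset + header:offset + size])


def parse_ftyp(box: Box) -> FileTypeBox:
    """Interpret a box payload as major_brand, minor_version and compatible_brands."""
    major, minor = struct.unpack_from(">4sI", box.payload, 0)
    rest = box.payload[8:]
    brands = [rest[i:i + 4].decode("latin-1") for i in range(0, len(rest), 4)]
    return FileTypeBox(major.decode("latin-1"), minor, brands)
```

<span style="font-weight: 400;">Calling parse_ftyp(parse_box(data)) on the first bytes of a typical MP4 file would return the brands the file claims to be compatible with.</span><br />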
<span style="font-weight: 400;">The FileTypeBox is used to identify the purpose and usage of an ISOBMFF file. It is often at the beginning of a file. </span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">A box can also have children and form a tree of boxes. For example, the MovieBox (moov) can have multiple TrackBoxes (trak). A track in the context of ISOBMFF is a single media stream. E.g. a MovieBox contains one trak box for video and one trak box for audio. </span><br />
<span style="font-weight: 400;">The binary codec data can be stored in a Media Data Box (mdat). A track usually references its binary codec data.</span><br />
<img loading="lazy" decoding="async" class="alignnone size-full wp-image-48442" src="https://bitmovin.com/wp-content/uploads/2019/07/image6-1.jpg" alt="- Bitmovin" width="1253" height="553"><br />
<b>Fragmented MP4 (fMP4)</b><br />
<img loading="lazy" decoding="async" class="alignnone size-full wp-image-48441" src="https://bitmovin.com/wp-content/uploads/2019/07/image7-1.png" alt="- Bitmovin" width="1293" height="458"><br />
<span style="font-weight: 400;">Using MP4, it is also possible to split a movie into multiple fragments. This has the advantage that, when using DASH or HLS, <a href="https://bitmovin.com/video-player-datasheet/">player software</a> only needs to download the fragments the viewer wants to watch. A fragmented MP4 file consists of the usual MovieBox with the TrackBoxes to signal which media streams are available. A Movie Extends Box (mvex) is used to signal that the movie continues in the fragments. Another advantage is that fragments can be stored in different files. A fragment consists of a Movie Fragment Box (moof), which is very similar to a Movie Box (moov). It contains the information about the media streams contained in that single fragment, e.g. the timestamp information for the 10 seconds of video stored in the fragment. Each fragment has its own Media Data (mdat) box. </span><br />
<b>Debugging (f)MP4 files</b><br />
<span style="font-weight: 400;">Viewing the boxes (atoms) of an (f)MP4 file is often necessary to discover bugs and other unwanted configurations of specific boxes. To get a summary of what a media file contains the best tools are:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">MediaInfo (</span><a href="https://mediaarea.net/en/MediaInfo/Download" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">https://mediaarea.net/en/MediaInfo/Download</span></a><span style="font-weight: 400;">)</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">ffprobe, which is part of the ffmpeg binaries (</span><a href="https://ffbinaries.com/downloads" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">https://ffbinaries.com/downloads</span></a><span style="font-weight: 400;">)</span></li>
</ul>
<p><span style="font-weight: 400;">These tools will however not show you the box structure of an (f)MP4 file. For this you could use the following tools:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Boxdumper (https://github.com/l-smash/l-smash)</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">IsoViewer (https://github.com/sannies/isoviewer)</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">MP4Box.js (</span><a href="http://download.tsi.telecom-paristech.fr/gpac/mp4box.js/filereader.html" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">http://download.tsi.telecom-paristech.fr/gpac/mp4box.js/filereader.html</span></a><span style="font-weight: 400;">)</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Mp4dump (</span><a href="https://www.bento4.com/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">https://www.bento4.com/</span></a><span style="font-weight: 400;">) </span></li>
</ul>
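<p><span style="font-weight: 400;">To give an idea of what such tools do under the hood, here is a minimal sketch of a box tree walker. It is not how any of the tools above are implemented; the function names and the short list of container boxes are assumptions for illustration, and boxes that mix their own fields with child boxes are treated as leaves.</span></p>

```python
import struct
from typing import Iterator, Optional, Tuple

# Boxes that only contain other boxes (a small subset chosen for this sketch).
CONTAINERS = {"moov", "trak", "mdia", "minf", "stbl", "mvex", "moof", "traf"}


def walk_boxes(data: bytes, start: int = 0, end: Optional[int] = None,
               depth: int = 0) -> Iterator[Tuple[int, str, int]]:
    """Yield (depth, type, size) for every box found between start and end."""
    end = len(data) if end is None else end
    pos = start
    while end - pos >= 8:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit largesize follows the type field
            if 16 > end - pos:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # the box runs to the end of the enclosing range
            size = end - pos
        if header > size or size > end - pos:
            break  # truncated or corrupt box: stop instead of guessing
        box_type = raw_type.decode("latin-1")
        yield depth, box_type, size
        if box_type in CONTAINERS:
            yield from walk_boxes(data, pos + header, pos + size, depth + 1)
        pos += size


def format_tree(data: bytes) -> str:
    """Render the box tree as indented text, one box per line."""
    return "\n".join(f"{'  ' * depth}{box_type} ({size} bytes)"
                     for depth, box_type, size in walk_boxes(data))
```

<p><span style="font-weight: 400;">Reading a file in binary mode and printing format_tree of its contents gives a quick outline of its box structure; for anything beyond that, the dedicated tools listed above are the better choice.</span></p>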
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-48446" src="https://bitmovin.com/wp-content/uploads/2019/07/image2.png" alt="- Bitmovin" width="832" height="567" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image2-300x204.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image2.png?size=384x262&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image2-768x523.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image2.png?lossy=2&amp;strip=1&amp;webp=1 832w" sizes="(max-width: 832px) 100vw, 832px" /><br />
<span style="font-weight: 400;">(Screenshot of isoviewer)</span></p>
<h2><span style="font-weight: 400;">CMAF</span></h2>
<p><b>MPEG-CMAF (Common Media Application Format)</b><br />
<span style="font-weight: 400;">Serving every platform as a content distributor can prove challenging, as some platforms only support certain container formats. To distribute a certain piece of content, it can be necessary to produce and serve copies of the content in different container formats, e.g. MPEG-TS and fMP4. Clearly, this causes additional infrastructure costs for content creation as well as storage costs for hosting multiple copies of the same content. On top of that, it also makes CDN caching less efficient. MPEG-CMAF aims to solve these problems, not by creating yet another container format, but by converging on a single, already existing container format for OTT media delivery. CMAF is closely related to fMP4, which should make the transition from fMP4 to CMAF a low-effort task. Further, with Apple being involved in CMAF, the necessity of having content muxed in MPEG-TS to serve Apple devices should hopefully be a thing of the past, and CMAF can be used everywhere.</span><br />
<span style="font-weight: 400;">With MPEG-CMAF, there are also improvements in the interoperability of DRM (Digital Rights Management) solutions through the use of MPEG-CENC (Common Encryption). It is theoretically possible to encrypt the content once and still use it with all the different state-of-the-art DRM systems. However, Common Encryption does not mandate a single encryption scheme: it defines several (for instance the AES-CTR based cenc and the AES-CBC based cbcs), and DRM systems such as Widevine and PlayReady differ in which schemes they support, depending on platform and version. The schemes are not compatible with each other, but the DRM industry is slowly converging on a single one.</span><br />
<b>Chunked CMAF</b><br />
<span style="font-weight: 400;">One interesting feature of MPEG-CMAF is the possibility to encode segments in so-called CMAF chunks. Such chunked <a href="https://bitmovin.com/video-encoding-data-sheet/">encoding</a> of the content, in combination with delivering the media files using HTTP chunked transfer encoding, enables lower latencies in live streaming use cases than before.</span><br />
<img loading="lazy" decoding="async" class="alignnone size-full wp-image-48444" src="https://bitmovin.com/wp-content/uploads/2019/07/image4.png" alt="- Bitmovin" width="1300" height="540" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4-300x125.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4.png?size=384x160&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4-768x319.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4-1024x425.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4.png?size=1152x479&amp;lossy=2&amp;strip=1&amp;webp=1 1152w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2019/07/image4.png?lossy=2&amp;strip=1&amp;webp=1 1300w" sizes="(max-width: 1300px) 100vw, 1300px" /><br />
<span style="font-weight: 400;">In traditional fMP4, the whole segment has to be fully downloaded before it can be played out. With chunked encoding, any completely loaded chunks of the segment can already be decoded and played out while the rest of the segment is still loading. As a result, the achievable live latency no longer depends on the segment duration, as the muxed chunks of an incomplete segment can already be loaded and played out at the client.</span><br />
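<span style="font-weight: 400;">A client-side view of this can be sketched as follows: given the bytes of a segment received so far, split off every complete moof+mdat chunk and keep the rest until more data arrives. The function name is hypothetical, and the sketch assumes 32-bit box sizes only; any other top-level box (such as styp) is simply passed along with the chunk that follows it.</span><br />

```python
import struct


def complete_chunks(received: bytes) -> tuple[list[bytes], bytes]:
    """Split bytes received so far into complete moof+mdat chunks and a remainder."""
    chunks, chunk_start, pos = [], 0, 0
    while len(received) - pos >= 8:
        size, box_type = struct.unpack_from(">I4s", received, pos)
        if 8 > size or size > len(received) - pos:
            break  # box not fully downloaded yet (or a size field this sketch skips)
        pos += size
        if box_type == b"mdat":  # an mdat closes the current chunk
            chunks.append(received[chunk_start:pos])
            chunk_start = pos
    return chunks, received[chunk_start:]
```

<span style="font-weight: 400;">A player loop would call this each time new data arrives, hand the returned chunks to the decoder right away and prepend the remainder to the next batch of bytes.</span><br />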
<span style="font-weight: 400;">Make sure to check back with us next week as we dive into MPEG-TS and Matroska.</span><br />
Jump to <a href="https://bitmovin.com/fun-with-container-formats-3/">Part 3 of Fun with Container Formats </a></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/fun-with-container-formats-2">Fun with Container Formats &#8211; Part 2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
