<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	 xmlns:media="http://search.yahoo.com/mrss/" >

<channel>
	<title>Adithyan Ilangovan &#8211; Bitmovin</title>
	<atom:link href="https://bitmovin.com/author/adi-i/feed" rel="self" type="application/rss+xml" />
	<link>https://bitmovin.com</link>
	<description>Bitmovin provides adaptive streaming infrastructure for video publishers and integrators. Fastest cloud encoding and HTML5 Player. Play Video Anywhere.</description>
	<lastBuildDate>Mon, 09 Jan 2023 11:42:21 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://bitmovin.com/wp-content/uploads/2023/11/bitmovin_favicon.svg</url>
	<title>Adithyan Ilangovan &#8211; Bitmovin</title>
	<link>https://bitmovin.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Video Tech Deep Dive: Super-Resolution with Machine Learning Part 3</title>
		<link>https://bitmovin.com/super-resolution-deployments-machine-learning-p3</link>
		
		<dc:creator><![CDATA[Adithyan Ilangovan]]></dc:creator>
		<pubDate>Tue, 24 Nov 2020 09:00:05 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[super-resolution]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=139507</guid>

					<description><![CDATA[<p>Practical Super-Resolution Deployments and Ensuing Results Introduction Welcome to Part 3 of Bitmovin’s Video Tech Deep Dive series:  “Super-Resolution with Machine learning”.  Before continuing with this post, I would highly recommend that you view the first two installments: Part 1 : Super-Resolution: What’s the buzz and why does it matter?  Part 2 : Super-Resolution:...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-deployments-machine-learning-p3">Video Tech Deep Dive: Super-Resolution with Machine Learning Part 3</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2 style="text-align: center;"><span style="font-weight: 400;">Practical Super-Resolution Deployments and Ensuing Results</span></h2>
<p><img fetchpriority="high" decoding="async" class="aligncenter size-large wp-image-139520" src="https://bitmovin.com/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3-1024x512.jpg" alt="- Bitmovin" width="1024" height="512" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3-300x150.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3.jpg?size=384x192&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3-768x384.jpg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3-1024x512.jpg?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Blog-Post-Super-Resolution-3.jpg?lossy=2&amp;strip=1&amp;webp=1 1080w" sizes="(max-width: 1024px) 100vw, 1024px" /></p>
<h3>Introduction</h3>
<p><span style="font-weight: 400;">Welcome to Part 3 of Bitmovin’s Video Tech Deep Dive series:  “Super-Resolution with Machine learning”. </span><br />
<span style="font-weight: 400;">Before continuing with this post, I would highly recommend that you view the first two installments:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Part 1 : </span><a href="https://bitmovin.com/super-resolution-machine-learning-p1/"><span style="font-weight: 400;">Super-Resolution: What’s the buzz and why does it matter?</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Part 2 : </span><a href="https://bitmovin.com/super-resolution-machine-learning-p2/"><span style="font-weight: 400;">Super-Resolution: Why is it good and how can you incorporate it?</span></a></li>
</ul>
<p><span style="font-weight: 400;">However, if you would rather jump right into it, here is a quick summary: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Spatially upsampling videos is a huge business opportunity.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Super-Resolution (SR) is a class of techniques to spatially upsample videos. </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Machine Learning (ML) based SR methods are superior to the conventional SR methods.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">SR can be incorporated into your video workflow in several ways, and consequently, help you improve the end-user experience. </span></li>
</ul>
<p><span style="font-weight: 400;">In this closing post, we explore:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">How to deploy super-resolution in practice in your video workflows.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Which tools you should be using.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Some real-life results from applying these practical deployments.</span></li>
</ul>
<hr />
<h2><span style="font-weight: 400;">Practical Super-Resolution Deployments</span></h2>
<p><span style="font-weight: 400;">So, we understand how ML-based super-resolution works in theory. But how is it actually deployed in practice? </span></p>
<h3><span style="font-weight: 400;">Classic 3-Step Playbook</span></h3>
<p><span style="font-weight: 400;">Like any other ML-based deployment, it follows the classic 3-step playbook. (This is by no means a comprehensive explanation of how ML models work, but rather a simplified representation that gives a basic understanding.) You need to: </span></p>
<ol>
<li style="font-weight: 400;"><span style="font-weight: 400;">Choose the right ML model</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Train the chosen ML model</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Use the trained ML model</span></li>
</ol>
<figure id="attachment_139510" aria-describedby="caption-attachment-139510" style="width: 961px" class="wp-caption aligncenter"><img decoding="async" class="wp-image-139510 size-full" src="https://bitmovin.com/wp-content/uploads/2020/11/Class-3-Step-Machine-Learning-Model_illustrated.jpg" alt="Class 3 Step Machine-Learning Model_illustrated" width="961" height="712" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Class-3-Step-Machine-Learning-Model_illustrated-300x222.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Class-3-Step-Machine-Learning-Model_illustrated.jpg?size=384x285&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Class-3-Step-Machine-Learning-Model_illustrated-768x569.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Class-3-Step-Machine-Learning-Model_illustrated.jpg?lossy=2&amp;strip=1&amp;webp=1 961w" sizes="(max-width: 961px) 100vw, 961px" /><figcaption id="caption-attachment-139510" class="wp-caption-text">The classic 3-steps used in Machine Learning deployment</figcaption></figure>
<h4><strong>Choosing a model</strong></h4>
<p><span style="font-weight: 400;">The first step is choosing the right model structure to train and deploy. The model you select will determine the level of tradeoff between computation complexity vs performance. For example, you can select a “complex” model that is hard to train in Step 2 but will give you great results in Step 3. Conversely, you could choose a “simpler” model that is easy to train but will give you comparatively worse performance. </span><br />
<span style="font-weight: 400;">This is similar to the tradeoff that you make when choosing different codecs. The tradeoff with codecs is defined by </span><a href="https://bitmovin.com/demos/multi-codec-streaming"><span style="font-weight: 400;">compression efficiency vs encoding complexity</span></a><span style="font-weight: 400;">. There is no one “correct” model as it’s often determined by the particular use-case requirement.</span></p>
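<p><span style="font-weight: 400;">To make the tradeoff concrete, here is a minimal sketch of picking the best model that still fits a compute budget. The candidate names, quality scores, and per-frame costs are hypothetical placeholders, not measurements; you would substitute numbers from your own benchmarks.</span></p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical label, not a real benchmark entry
    quality: float            # higher is better (e.g. a mean quality score)
    cost_ms_per_frame: float  # assumed inference cost per frame


def pick_model(candidates, budget_ms_per_frame):
    """Return the highest-quality candidate whose cost fits the budget, or None."""
    affordable = [c for c in candidates if budget_ms_per_frame >= c.cost_ms_per_frame]
    if not affordable:
        return None
    return max(affordable, key=lambda c: c.quality)


# Illustrative numbers only: a "simple", a "medium" and a "complex" model.
CANDIDATES = [
    Candidate("simple", quality=80.0, cost_ms_per_frame=5.0),
    Candidate("medium", quality=86.0, cost_ms_per_frame=20.0),
    Candidate("complex", quality=90.0, cost_ms_per_frame=120.0),
]
```

<p><span style="font-weight: 400;">With these made-up numbers, a budget of 25 ms per frame selects the “medium” candidate, while a budget too small for any candidate returns nothing, which is a signal to revisit the requirements.</span></p>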
<h4><strong>Training the model</strong></h4>
<p><span style="font-weight: 400;">Once you’ve selected the right model for the specific project, you need to feed it with training data. In the case of super-resolution, the training data is a compilation of high-resolution videos and their corresponding low-resolution videos. The model is like an empty brain and the training data is its sensory inputs of the real world. Based on the training data, the model will learn how to upsample videos.  </span><br />
<span style="font-weight: 400;">As you might have guessed, the choice of training data strongly influences what a model learns. If you only feed it a particular type of content, say cartoons, it will perform exceptionally well on that type of content, but not as well on other types. So, the training dataset has to be chosen carefully. Once the model is trained, you can deploy it to upsample low-resolution videos. </span></p>
<h4><strong>Using the model</strong></h4>
<p><span style="font-weight: 400;">The last step is to use the model trained in Step 2 to upsample videos: you feed in a low-resolution video, and the model returns an upsampled high-resolution video.  </span></p>
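<p><span style="font-weight: 400;">The three steps can be sketched end to end with a deliberately tiny, one-dimensional stand-in for a real model. This is an illustration under strong assumptions, not how production super-resolution networks work: the “model” has a single weight, the “video” is a list of numbers, and all function names are made up for this example. The chosen model predicts each missing sample as w × (left neighbour + right neighbour); training fits w by least squares on high-resolution/low-resolution pairs; using the model interleaves the predictions with the original samples.</span></p>

```python
def make_pairs(high_res):
    """Training data: keep every other sample as low-res, the dropped ones are targets."""
    low_res = high_res[::2]
    targets = high_res[1::2][: len(low_res) - 1]
    return low_res, targets


def train(pairs):
    """Step 2: fit the single weight w by least squares over all training pairs."""
    num = 0.0
    den = 0.0
    for low_res, targets in pairs:
        for i, y in enumerate(targets):
            s = low_res[i] + low_res[i + 1]
            num += y * s
            den += s * s
    # Fall back to plain linear interpolation if there is nothing to learn from.
    return num / den if den else 0.5


def upsample(low_res, w):
    """Step 3: interleave the original samples with predicted in-between samples."""
    out = []
    for i in range(len(low_res) - 1):
        out.append(low_res[i])
        out.append(w * (low_res[i] + low_res[i + 1]))
    out.append(low_res[-1])
    return out


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

<p><span style="font-weight: 400;">Setting w to 0.5 reproduces plain linear interpolation, so on its own training data the learned weight can never do worse than that conventional baseline. As with real models, what it learns depends entirely on the signals it was trained on.</span></p>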
<hr />
<h3><span style="font-weight: 400;">Implementation of the model</span></h3>
<p><span style="font-weight: 400;">Once you’ve selected an ML model, the next critical step is deciding how to actually implement the chosen model. This is similar to choosing a </span><a href="https://bitmovin.com/multi-codec-world-2020/"><span style="font-weight: 400;">codec</span></a> and its corresponding implementation in an encoding workflow :</p>
<ol>
<li style="font-weight: 400;"><span style="font-weight: 400;">Selecting the codec (e.g. HEVC, VP9, AV1) &#8211; this is the standard that specifies how the encoding should be applied</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The implementation of the codec &#8211; There are several implementation options for the same codec. In the case of HEVC (H.265), there is x265, </span><a href="http://x265.org/beamr-hevc-encoder-comparison/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Beamr-HEVC</span></a><span style="font-weight: 400;">, </span><a href="https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Nvidia-Hardware-HEVC</span></a><span style="font-weight: 400;">, among others. Implementations can be hardware-based or software-based. Furthermore, a software implementation can be based on an open-source or closed-source build. </span></li>
</ol>
<p><span style="font-weight: 400;">Similarly, when it comes to super-resolution ML models, the same model can have different implementations: open-source or closed-source, and hardware-accelerated or software-only. For example, </span><a href="https://arxiv.org/pdf/1501.00092.pdf" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">SRCNN</span></a><span style="font-weight: 400;"> is a popular Super-Resolution ML model from the academic literature, and </span><a href="https://github.com/nagadomi/waifu2x" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Waifu-2x</span></a><span style="font-weight: 400;"> is an open-source implementation of that model.</span><br />
<img decoding="async" class="wp-image-139512 aligncenter" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments-implementation-model-illustrated.jpg" alt="- Bitmovin" width="703" height="422" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-implementation-model-illustrated.jpg?size=140x84&amp;lossy=2&amp;strip=1&amp;webp=1 140w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-implementation-model-illustrated-300x180.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-implementation-model-illustrated.jpg?size=421x253&amp;lossy=2&amp;strip=1&amp;webp=1 421w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-implementation-model-illustrated.jpg?lossy=2&amp;strip=1&amp;webp=1 512w" sizes="(max-width: 703px) 100vw, 703px" /><img loading="lazy" decoding="async" class="wp-image-139511 aligncenter" src="https://bitmovin.com/wp-content/uploads/2020/11/Codec-Implementation-Model_Illustrated.png" alt="- Bitmovin" width="701" height="414" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Codec-Implementation-Model_Illustrated.png?size=140x83&amp;lossy=2&amp;strip=1&amp;webp=1 140w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Codec-Implementation-Model_Illustrated-300x177.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Codec-Implementation-Model_Illustrated.png?size=420x248&amp;lossy=2&amp;strip=1&amp;webp=1 420w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/Codec-Implementation-Model_Illustrated.png?lossy=2&amp;strip=1&amp;webp=1 512w" sizes="(max-width: 701px) 100vw, 701px" /></p>
<p style="text-align: center;"><em>Codec and codec implementation (HEVC and x265) is analogous to the Super-Resolution ML model and its corresponding implementation (SRCNN and waifu-2x).</em></p>
<p><span style="font-weight: 400;">Some popular open-source implementations of the Super-Resolution ML models are </span><a href="https://github.com/nagadomi/waifu2x" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Waifu-2x</span></a><span style="font-weight: 400;">, </span><a href="https://github.com/bloc97/Anime4K" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Anime-4k-CPP</span></a><span style="font-weight: 400;">, and </span><a href="https://github.com/bloc97/Anime4K" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">ACNet</span></a><span style="font-weight: 400;">.</span></p>
<hr />
<h2><span style="font-weight: 400;">Super-Resolution Practical Results </span></h2>
<p>In this section, we look at some of the results obtained from using an implementation of the Super-Resolution model. First, we look at an implementation where we had to manually train the model. Then, we look at an implementation that uses a pre-trained model.</p>
<h3><span style="font-weight: 400;">Manually Trained model</span></h3>
<h4><span style="font-weight: 400;">Methodology</span></h4>
<p><span style="font-weight: 400;">In this section, we follow all three steps laid out earlier in the section &#8220;Classic 3-Step Playbook&#8221;. We use <a href="https://www.ffmpeg.org/index.html#news" rel="nofollow noopener" target="_blank">FFmpeg</a> for the testing. The following settings were used to evaluate super-resolution in </span><a href="https://www.ffmpeg.org/index.html#news" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">FFmpeg</span></a><span style="font-weight: 400;">:</span></p>
<figure id="attachment_139513" aria-describedby="caption-attachment-139513" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-full wp-image-139513" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-model_illustrated.jpg" alt="super-resolution deployments-3 step ML model_illustrated" width="1024" height="768" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-model_illustrated-300x225.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-model_illustrated.jpg?size=384x288&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-model_illustrated-768x576.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-model_illustrated.jpg?lossy=2&amp;strip=1&amp;webp=1 1024w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-139513" class="wp-caption-text">Evaluation of the <a href="https://ffmpeg.org/ffmpeg-filters.html#sr" rel="nofollow noopener" target="_blank">inbuilt super-resolution video filter</a> in <a href="https://www.ffmpeg.org/index.html#news" rel="nofollow noopener" target="_blank">FFmpeg</a>.</figcaption></figure>
<ol>
<li style="font-weight: 400;"><b>Choose the model:</b><span style="font-weight: 400;"> We chose the Super-Resolution Convolutional Neural Network model (</span><a href="https://arxiv.org/abs/1501.00092" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">SRCNN</span></a><span style="font-weight: 400;">). </span></li>
<li style="font-weight: 400;"><b>Train the model:</b><span style="font-weight: 400;"> We used two 1080p sample videos to train the model that was then fed into FFmpeg.</span></li>
<li style="font-weight: 400;"><b>Use the model:</b><span style="font-weight: 400;"> Once the model is ready, we can use it to upsample any video.</span></li>
</ol>
<p><span style="font-weight: 400;">To test the performance of the super-resolution upsampling method in a real video workflow, we selected a 720p video as input. The input video was upscaled and transcoded using H.264 with high-quality encoding settings at several different bitrates. To determine the effectiveness of the super-resolution method, we first ran the same test with the traditional bicubic method as a control.</span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-139514" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated-1024x316.png" alt="super-resolution-deployments_process workflow_illustrated" width="1205" height="372" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated-300x93.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated.png?size=384x119&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated-768x237.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated-1024x316.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated.png?size=1152x356&amp;lossy=2&amp;strip=1&amp;webp=1 1152w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated-1536x474.png?lossy=2&amp;strip=1&amp;webp=1 1536w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments_process-workflow_illustrated.jpg?lossy=2&amp;strip=1&amp;webp=1 1600w" sizes="(max-width: 1205px) 100vw, 1205px" /><br />
<span style="font-weight: 400;">Once we had the results of both upsampling methods, we compared the final output quality against the input quality using the </span><a href="https://github.com/Netflix/vmaf" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">VMAF quality metric</span></a><span style="font-weight: 400;">. The result is shown in the following figure. </span></p>
<figure id="attachment_139515" aria-describedby="caption-attachment-139515" style="width: 670px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-139515 " src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments-visual-quality-results-VMAF_graph-e1606135951717.jpg" alt="super-resolution-deployments-visual-quality results-VMAF_graph" width="670" height="631" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-visual-quality-results-VMAF_graph-e1606135951717.jpg?size=134x126&amp;lossy=2&amp;strip=1&amp;webp=1 134w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-visual-quality-results-VMAF_graph-e1606135951717-300x282.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-visual-quality-results-VMAF_graph-e1606135951717.jpg?lossy=2&amp;strip=1&amp;webp=1 375w" sizes="(max-width: 670px) 100vw, 670px" /><figcaption id="caption-attachment-139515" class="wp-caption-text">VMAF vs Bitrate (in Kbps) for the different upsampling methods.</figcaption></figure>
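<p><span style="font-weight: 400;">For readers who want to script a comparable test, the workflow above could be driven roughly as follows. This is a sketch under assumptions rather than the exact commands we ran: the file names, output size, bitrates, and the “slow” x264 preset are placeholders, and the backends accepted by the sr filter’s dnn_backend option depend on your FFmpeg version and build, so check the filter documentation for your installation.</span></p>

```python
def bicubic_command(src, dst, width, height, bitrate_kbps):
    """Control path: bicubic upscale with the scale filter, then H.264 encode."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale={width}:{height}:flags=bicubic",
        # "slow" is an assumed stand-in for "high-quality encoding settings".
        "-c:v", "libx264", "-preset", "slow",
        "-b:v", f"{bitrate_kbps}k",
        dst,
    ]


def super_resolution_command(src, dst, model_path, bitrate_kbps, backend="tensorflow"):
    """SR path: FFmpeg's sr filter with a trained model file, then the same encode."""
    return [
        "ffmpeg", "-i", src,
        # backend and model_path are assumptions; they must match your FFmpeg build.
        "-vf", f"sr=dnn_backend={backend}:model={model_path}",
        "-c:v", "libx264", "-preset", "slow",
        "-b:v", f"{bitrate_kbps}k",
        dst,
    ]


def bitrate_ladder(make_command, bitrates_kbps):
    """Build one command per bitrate so both methods are encoded at identical rates."""
    return [make_command(kbps) for kbps in bitrates_kbps]
```

<p><span style="font-weight: 400;">Each returned list can be passed to Python’s subprocess.run. Keeping the encoder settings identical in both paths means that any quality difference comes from the upsampling method alone.</span></p>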
<h4><span style="font-weight: 400;">Results</span></h4>
<p><span style="font-weight: 400;">We observe that, on average, there is a 6-point difference in VMAF. If you work in video, you will recognize that as a pretty </span><a href="https://streaminglearningcenter.com/blogs/finding-the-just-noticeable-difference-with-netflix-vmaf.html" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">significant gain in video quality.</span></a><span style="font-weight: 400;"> </span><br />
<span style="font-weight: 400;">Admittedly, this is neither a scientific evaluation of the upsampling methods nor a fair comparison: we used only two videos to train the model, and we fed in a similar type of low-resolution video for super-resolution upsampling. So the experiment is heavily biased towards the super-resolution method. </span><br />
<span style="font-weight: 400;">Nevertheless, it illustrates the potential of super-resolution methods compared to the traditional upsampling method. This was more of a test-grade result. In the next section, we will look at a production-grade result.</span><br />
<span style="font-weight: 400;">If you are an engineer/developer and want to try your hand at the aforementioned steps, then you are in luck. The popular multimedia tool </span><a href="https://www.ffmpeg.org/index.html#news" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">FFmpeg</span></a><span style="font-weight: 400;"> has supported super-resolution as an </span><a href="https://ffmpeg.org/ffmpeg-filters.html#sr" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">inbuilt video filter</span></a><span style="font-weight: 400;">, since version 4.2. </span></p>
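<p><span style="font-weight: 400;">As a small worked example of how an “average VMAF difference” is obtained, the sketch below averages the per-bitrate gap between two score lists. The scores are made up purely for illustration and are not the measurements behind the figure above; the function name is likewise an assumption.</span></p>

```python
def mean_vmaf_gain(bicubic_scores, sr_scores):
    """Average per-bitrate VMAF difference (super-resolution minus bicubic)."""
    if len(bicubic_scores) != len(sr_scores) or not bicubic_scores:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [s - b for b, s in zip(bicubic_scores, sr_scores)]
    return sum(diffs) / len(diffs)


# Made-up scores for illustration only; NOT the data from our test.
BITRATES_KBPS = [1000, 2000, 3000, 4000]
BICUBIC = [60.0, 70.0, 76.0, 80.0]
SUPER_RES = [67.0, 76.0, 82.0, 85.0]
```

<p><span style="font-weight: 400;">With these placeholder values, the per-bitrate gaps are 7, 6, 6, and 5 points, which average to 6. Both lists must be measured at the same bitrates for the average to be meaningful.</span></p>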
<hr />
<h3><span style="font-weight: 400;">Pre-Trained Model</span></h3>
<p><span style="font-weight: 400;">It’s important to note that steps one and two from the “Classic 3-step playbook” are not always mandatory when using machine learning. In some cases, the best-performing and most appropriate models have already been chosen and trained. In the case of super-resolution, there are plenty of publicly available models for </span><a href="https://github.com/nagadomi/waifu2x" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">upsampling anime content</span></a><span style="font-weight: 400;"> that have been trained on larger data sets and have proven to </span><a href="https://medium.com/crunchyroll/scaling-up-anime-with-machine-learning-and-smart-real-time-algorithms-2fb706ec56c0" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">work well</span></a><span style="font-weight: 400;">. Therefore, you don’t need to take on the burden of manually training your own models. In this scenario, you could simply plug the existing model into your production workflow and “turn on” super-resolution.</span></p>
<figure id="attachment_139516" aria-describedby="caption-attachment-139516" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-large wp-image-139516" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated-1024x582.jpg" alt="super-resolution-deployments-3 step ML-anime content_illustrated" width="1024" height="582" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated-300x171.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated.jpg?size=384x218&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated-768x437.jpg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated-1024x582.jpg?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated-1536x874.jpg?lossy=2&amp;strip=1&amp;webp=1 1536w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-3-step-ML-anime-content_illustrated.jpg?lossy=2&amp;strip=1&amp;webp=1 1600w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-139516" class="wp-caption-text">For certain use cases, such as anime, you could skip step 1 (Choosing) and step 2 (Training), and directly use a pre-trained model.</figcaption></figure>
<p><span style="font-weight: 400;">For our second super-resolution deployment test, we chose the popular </span><a href="https://github.com/nagadomi/waifu2x" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Waifu2x</span></a><span style="font-weight: 400;"> implementation with a pre-trained model to do some production-grade testing for a very popular (but old) anime series. Given that this model is a perfect fit for upsampling anime-style art, we selected production assets that are old, low-resolution, and noisy. </span></p>
<figure id="attachment_139517" aria-describedby="caption-attachment-139517" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-large wp-image-139517" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image-1024x529.png" alt="super-resolution-deployment-anime content selection_image" width="1024" height="529" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image-300x155.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image.png?size=384x198&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image-768x397.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image-1024x529.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployment-anime-content-selection_image.png?lossy=2&amp;strip=1&amp;webp=1 1080w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-139517" class="wp-caption-text">Production assets used for testing the pre-trained super-resolution model.</figcaption></figure>
<p><span style="font-weight: 400;">We fed them through the pre-trained model, and the results of our test can be viewed below:</span><br />
[Rich_Web_Slider id=&#8221;1&#8243;]<br />
<span style="font-weight: 400;">As you can see from the results above, the video upsampled using super-resolution looks significantly better than the video upsampled using the conventional method. Among other things, the details at the boundaries are crisper and the blurring artifacts are reduced. </span><br />
<span style="font-weight: 400;">Since we did not have a high-resolution video for reference, we could not compute an objective quality metric (VMAF). Nevertheless, we applied subjective quality testing by playing the two videos, conventionally upsampled vs super-resolution upsampled, and asking the viewers to vote on which video looked better. 83% of the viewers voted that the super-resolution upsampled video looked better. </span></p>
<figure id="attachment_139518" aria-describedby="caption-attachment-139518" style="width: 1024px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-large wp-image-139518" src="https://bitmovin.com/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart-1024x955.png" alt="super-resolution-deployments-subjective visual quality-results_bar chart" width="1024" height="955" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart-300x280.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart.png?size=384x358&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart-768x716.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart-1024x955.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/11/super-resolution-deployments-subjective-visual-quality-results_bar-chart.png?lossy=2&amp;strip=1&amp;webp=1 1080w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption id="caption-attachment-139518" class="wp-caption-text">Subjective quality testing. Video-2 is the upsampled video obtained from the Super-Resolution methods. Video-1 is the upsampled video obtained from the conventional bicubic method.</figcaption></figure>
<p><span style="font-weight: 400;">These results are “super” encouraging for the future of upsampling content with machine learning: new models don’t always need to be picked and trained. Even for certain specific production use-cases, one could use pre-trained models and obtain superior results compared to the traditional methods. One could reasonably expect that super-resolution models can be applied to many more use cases.</span></p>
<hr />
<h2><span style="font-weight: 400;">Conclusion </span></h2>
<p><span style="font-weight: 400;">As we’ve learned throughout this series, super-resolution with machine learning is a huge business opportunity, especially with the exponential rise of 4K (and higher) resolution consumer devices and the growing number of streaming services. Given that many content owners have massive backlog libraries of standard-quality content, upsampling can be a great method of re-engaging an older audience and introducing new viewers to older content. Super-resolution, one of a few types of upsampling methods, is best applied using machine learning, which will further improve quality over time. However, much like a classic encoding workflow, there are countless ways to implement ML-based super-resolution upsampling &#8211; and finding the balance of existing models versus new models will ultimately help improve the end-user experience.</span><br />
Did you enjoy this post? Check out the following content:<br />
Part one of the Super-Resolution series: <a href="https://bitmovin.com/super-resolution-machine-learning-p1/">What&#8217;s the buzz and why does it matter?</a><br />
Part two: <a href="https://bitmovin.com/super-resolution-machine-learning-p2/">Why is it good and how can you incorporate it?</a><br />
View my comparison test of <a href="https://cdn.bitmovin.com/content/demos/superresolution-ibc/superresolution/public.html">Bicubic vs Super-Resolution content here</a></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-deployments-machine-learning-p3">Video Tech Deep Dive: Super-Resolution with Machine Learning Part 3</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Video Tech Deep Dive: Super-Resolution with Machine Learning P2</title>
		<link>https://bitmovin.com/super-resolution-machine-learning-p2</link>
		
		<dc:creator><![CDATA[Adithyan Ilangovan]]></dc:creator>
		<pubDate>Mon, 06 Jul 2020 13:22:49 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[super-resolution]]></category>
		<category><![CDATA[video encoding]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=119162</guid>

					<description><![CDATA[<p>Super-Resolution: Why is it good and how can you incorporate it? Introduction  Welcome to Part 2 of Bitmovin’s Video Tech Deep Dive series: Super-Resolution with Machine learning. Before you get started, I highly recommend that you read Part 1. But if you would rather prefer to directly jump into it, here is a quick summary: ...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-machine-learning-p2">Video Tech Deep Dive: Super-Resolution with Machine Learning P2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2 style="text-align: center;"><span style="font-weight: 400;">Super-Resolution: Why is it good and how can you incorporate it?</span></h2>
<p><img loading="lazy" decoding="async" class="aligncenter size-large wp-image-119199" src="https://bitmovin.com/wp-content/uploads/2020/07/Blog-Post-Super-Resolution-2-1-1024x512.jpg" alt="- Bitmovin" width="1024" height="512"></p>
<h3><span style="font-weight: 400;">Introduction </span></h3>
<p><span style="font-weight: 400;">Welcome to Part 2 of <em>Bitmovin’s Video Tech Deep Dive series: Super-Resolution with Machine learning.</em> Before you get started, I highly recommend that you read </span><a href="https://bitmovin.com/super-resolution-machine-learning-p1/"><span style="font-weight: 400;">Part 1</span></a><span style="font-weight: 400;">. But if you would prefer to jump right in, here is a quick summary: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Spatially upsampling videos is a huge business opportunity.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Super-Resolution is a class of techniques to spatially upsample videos. </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Super-Resolution techniques fall into two categories: machine-learning based and non-machine-learning based. </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">This blog series will focus on machine learning-based super-resolution.</span></li>
</ul>
<figure id="attachment_115031" aria-describedby="caption-attachment-115031" style="width: 737px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-115031" src="https://bitmovin.com/wp-content/uploads/2020/05/Super-Resolution.001.jpeg" alt="- Bitmovin" width="737" height="635" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001-300x259.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001.jpeg?size=384x331&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001-768x663.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/05/Super-Resolution.001.jpeg?lossy=2&amp;strip=1&amp;webp=1 801w" sizes="(max-width: 737px) 100vw, 737px" /><figcaption id="caption-attachment-115031" class="wp-caption-text">The focus of this series of blog posts will be on machine learning-based super-resolution.</figcaption></figure>
<p><span style="font-weight: 400;">In this post, we will examine:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">What factors led to the current popularity of machine learning-based super-resolution?</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Why is it better than the other conventional methods? </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">And, finally, how can you incorporate it into your video workflow, and what benefits will super-resolution yield?</span></li>
</ul>
<hr />
<h2><span style="font-weight: 400;">The Holy Trinity: Super-Resolution, Machine learning, and Video upscaling</span></h2>
<p><span style="font-weight: 400;">Super-resolution, Machine learning (ML), and Video Upscaling are a match made in heaven. The coming together of these three factors is the reason behind the current popularity of <em><strong>Machine-learning based super-resolution</strong> </em>applications. In this section, we will see why.</span></p>
<h3><span style="font-weight: 400;">Super-Resolution and its beginnings</span></h3>
<p><span style="font-weight: 400;">The concept of super-resolution has existed since the 1980s. The basic idea behind super-resolution was (and continues to be) to </span><b>intelligently</b><span style="font-weight: 400;"> combine </span><b>non-redundant information</b><span style="font-weight: 400;"> from multiple related low-resolution images to generate a single high-resolution image. </span></p>
<figure id="attachment_119260" aria-describedby="caption-attachment-119260" style="width: 912px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119260 size-full" src="https://bitmovin.com/wp-content/uploads/2020/07/3D1C7B00-86DE-47E7-AFA3-7ADE95695290.jpeg" alt="- Bitmovin" width="912" height="373" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/3D1C7B00-86DE-47E7-AFA3-7ADE95695290-300x123.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/3D1C7B00-86DE-47E7-AFA3-7ADE95695290.jpeg?size=384x157&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/3D1C7B00-86DE-47E7-AFA3-7ADE95695290-768x314.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/3D1C7B00-86DE-47E7-AFA3-7ADE95695290.jpeg?lossy=2&amp;strip=1&amp;webp=1 912w" sizes="(max-width: 912px) 100vw, 912px" /><figcaption id="caption-attachment-119260" class="wp-caption-text">Super-Resolution uses non-redundant information from several related images to produce a single image.</figcaption></figure>
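<p><span style="font-weight: 400;">To make the idea of combining non-redundant information concrete, here is a minimal toy sketch in Python. It assumes the simplest possible case: four low-resolution snapshots whose offsets are known exactly and differ by one high-resolution pixel, so they can simply be interleaved (a naive form of the classic “shift-and-add” idea). The function names <code>downsample</code> and <code>shift_and_add</code> are made up for this illustration and do not come from any particular library.</span></p>
<pre><code class="language-python">def downsample(high_res, offset_y, offset_x, factor=2):
    """Sample every `factor`-th pixel, starting at the given offset."""
    return [row[offset_x::factor] for row in high_res[offset_y::factor]]


def shift_and_add(low_res_frames, factor=2):
    """Interleave low-res frames with known offsets onto one high-res grid.

    `low_res_frames` maps (offset_y, offset_x) to a low-res image (list of rows).
    """
    first = next(iter(low_res_frames.values()))
    height, width = len(first) * factor, len(first[0]) * factor
    high_res = [[None] * width for _ in range(height)]
    for (offset_y, offset_x), frame in low_res_frames.items():
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                high_res[y * factor + offset_y][x * factor + offset_x] = value
    return high_res


# A 4x4 "scene" and four 2x2 snapshots, each shifted by one high-res pixel.
scene = [[10 * y + x for x in range(4)] for y in range(4)]
frames = {(dy, dx): downsample(scene, dy, dx) for dy in range(2) for dx in range(2)}
reconstructed = shift_and_add(frames)
print(reconstructed == scene)  # True: together the snapshots cover every pixel
</code></pre>
<p><span style="font-weight: 400;">Each snapshot on its own is missing three quarters of the pixels, but because every snapshot sees a different part of the scene, nothing is lost overall. Real footage has unknown motion, blur, and noise, so the combination step has to be far smarter than plain interleaving.</span></p>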
<p><span style="font-weight: 400;">Some classic early applications were finding license plate information from several low-resolution images.</span></p>
<p><figure id="attachment_119164" aria-describedby="caption-attachment-119164" style="width: 812px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119164" src="https://bitmovin.com/wp-content/uploads/2020/07/Super-Resolution-upsampled-image-example-1.jpg" alt="Super-Resolution - upsampled image example" width="812" height="465" /><figcaption id="caption-attachment-119164" class="wp-caption-text">Several low-resolution snapshots of a moving car provide non-redundant but related information. Super-Resolution uses this related non-redundancy to create higher-resolution images, which can be useful in recovering details such as license plate numbers or driver identification [<em><a href="https://www.researchgate.net/profile/Carlos_Miravet" rel="nofollow noopener" target="_blank">Source</a></em>].</figcaption></figure><br />
<span style="font-weight: 400;">When super-resolution started out, the </span><b>“intelligence”</b><span style="font-weight: 400;"> was, roughly speaking, a set of predefined and complex mathematical formulas (</span><a href="https://www.researchgate.net/publication/3321472_Super-Resolution_Image_Reconstruction_A_Technical_Overview" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Image observation model</span></a><span style="font-weight: 400;">, </span><a href="http://elynxsdk.free.fr/ext-docs/Supersampling/Advances%20and%20Challenges%20in%20Super-Resolution.pdf" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Interpolation-restoration</span></a><span style="font-weight: 400;">, among others). The “</span><b>intelligence</b><span style="font-weight: 400;">” at the beginning had nothing to do with ML.</span><br />
<span style="font-weight: 400;">But the recent wave of interest in super-resolution has been primarily driven by ML.</span></p>
<hr />
<h3><span style="font-weight: 400;">Machine learning’s resurgence</span></h3>
<p><span style="font-weight: 400;">So, why ML and what changed now? </span><br />
<span style="font-weight: 400;">ML, in essence, is about learning the </span><b>“intelligence”</b><span style="font-weight: 400;"> for a <strong>well-defined problem</strong>. With the right architecture and enough data, ML can be significantly more </span><b>“intelligent”</b><span style="font-weight: 400;"> than a human-defined solution (at least in that narrow domain). We saw this demonstrated stunningly in the case of </span><a href="https://en.wikipedia.org/wiki/AlphaZero" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">AlphaZero</span></a><span style="font-weight: 400;"> (for chess) and </span><a href="https://en.wikipedia.org/wiki/AlphaGo" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">AlphaGo</span></a><span style="font-weight: 400;"> (for the board game </span><a href="https://en.wikipedia.org/wiki/Go_(game)" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Go</span></a><span style="font-weight: 400;">).</span><br />
<span style="font-weight: 400;">Super-resolution is a </span><b>well-defined problem</b><span style="font-weight: 400;">, and one could reasonably argue that ML would be a natural fit to solve this problem. With that motivation, early theoretical solutions were already proposed in the literature.</span><br />
<span style="font-weight: 400;">But the exorbitant computational power required and fundamental unresolved complexities kept the practical applications of ML-based super-resolution at bay. </span><br />
<span style="font-weight: 400;">However, in the last few years, there were two major developments:</span></p>
<ol>
<li style="font-weight: 400;"><span style="font-weight: 400;">The enormous increase in computational power density, especially in purpose-built </span><a href="https://phoenixnap.com/blog/future-gpu-machine-learning-ai" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Graphics Processing Units (GPUs)</span></a><span style="font-weight: 400;">, and also their affordability.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Fundamental advances in ML, especially </span><a href="https://en.wikipedia.org/wiki/Convolutional_neural_network" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Convolutional Neural Networks</span></a><span style="font-weight: 400;"> (CNNs), and their ease of use. </span></li>
</ol>
<p><span style="font-weight: 400;">These developments have led to a resurgence of ML-based super-resolution methods.</span><br />
<span style="font-weight: 400;">It should be mentioned that ML-based super-resolution is a versatile hammer that can be used to drive many </span><span style="font-weight: 400;">nails. It has wide applications, including </span><a href="http://www.lmars.whu.edu.cn/prof_web/zhanghongyan/papers/Image%20super-resolution%20-%20The%20techniques,%20applications%20and%20future.pdf" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">medical imaging, remote sensing, and astronomical observations, among others.</span></a><span style="font-weight: 400;"> But as mentioned in </span><a href="https://bitmovin.com/super-resolution-machine-learning-p1/"><span style="font-weight: 400;">Part 1</span></a><span style="font-weight: 400;"> of this series, we will focus on how</span><b> the ML super-resolution</b><span style="font-weight: 400;"> hammer can nail the problem of </span><b>video upscaling</b><span style="font-weight: 400;">. </span></p>
<hr />
<h3><span style="font-weight: 400;">The convergence of the three factors</span></h3>
<p><span style="font-weight: 400;">The last missing puzzle piece in this arc of the story is </span><b>Video upscaling</b><span style="font-weight: 400;">.  </span></p>
<figure id="attachment_119165" aria-describedby="caption-attachment-119165" style="width: 500px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="size-full wp-image-119165" src="https://bitmovin.com/wp-content/uploads/2020/07/Super-resolution-the-holy-trinity-1.png" alt="Super-resolution- the holy trinity of upsampling-comic image" width="500" height="637" /><figcaption id="caption-attachment-119165" class="wp-caption-text">Video upscaling, Machine learning, and Super-Resolution: a match made in heaven.</figcaption></figure>
<p><span style="font-weight: 400;">When you think about it, video upscaling is almost a perfect “nail” for the ML-based super-resolution “hammer”. </span><br />
<span style="font-weight: 400;">Video provides the core features needed for the ML-based Super-Resolution. Namely: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">It has </span><b><i>related non-redundancy</i></b><span style="font-weight: 400;"> built-in: </span><span style="font-weight: 400;">Every single frame in a video almost always has a set of closely </span><b>“related”</b><span style="font-weight: 400;"> frames. And if there is enough motion of an object in the frame, all those </span><b>related</b><span style="font-weight: 400;"> frames should provide </span><b>non-redundant</b><span style="font-weight: 400;"> information about objects in the frame. </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The vast amount of available </span><b><i>data</i></b><span style="font-weight: 400;">: </span><span style="font-weight: 400;">We have no shortage of </span><a href="https://truefilmproduction.com/study-predicts-84-internet-traffic-will-video-2020/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">video</span></a><span style="font-weight: 400;">. This vast amount of data could be used to train the ML network and let it learn the best upscaling “intelligence”.</span></li>
</ul>
<p><span style="font-weight: 400;">The convergence of these three factors is why we are witnessing a </span><span style="font-weight: 400;">huge uptick in the </span><a href="https://paperswithcode.com/task/video-super-resolution" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">research</span></a><span style="font-weight: 400;"> in this area, and also the </span><a href="https://www.nvidia.com/en-us/shield/support/shield-tv/ai-upscaling/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">first practical applications</span></a><span style="font-weight: 400;"> in the field of ML Super-Resolution powered Video upscaling.</span></p>
<hr />
<h2><span style="font-weight: 400;">Why is it better than traditional methods?</span></h2>
<p><span style="font-weight: 400;">I provided a historical timeline and the factors that led to ML Super-Resolution powered Video upscaling. But it might still not be clear why it is superior to traditional methods (</span><a href="https://en.wikipedia.org/wiki/Bilinear_interpolation" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">bilinear</span></a><span style="font-weight: 400;">, </span><a href="https://en.wikipedia.org/wiki/Bicubic_interpolation" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">bicubic</span></a><span style="font-weight: 400;">, </span><a href="https://en.wikipedia.org/wiki/Lanczos_resampling" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Lanczos</span></a><span style="font-weight: 400;">, among others). In this section, I will give a simplified explanation to build an intuitive understanding.</span><br />
<span style="font-weight: 400;">The superior performance simply boils down to the fact that the algorithm understands the nature of the content it is upsampling and tunes itself to upsample that content in the best way possible. This is in contrast to the traditional methods, where there is no “tuning”: the same formula is applied without any consideration of the nature of the content.</span><br />
<span style="font-weight: 400;">One could say that: </span></p>
<blockquote>
<p style="text-align: center;"><span style="font-weight: 400;">ML-based super-resolution is to upsampling, what </span><a href="https://bitmovin.com/per-title-encoding/"><span style="font-weight: 400;">Per-Title</span></a><span style="font-weight: 400;"> is to encoding.</span></p>
</blockquote>
<p><span style="font-weight: 400;">In </span><a href="https://bitmovin.com/per-title-encoding/"><span style="font-weight: 400;">Per-Title</span></a><span style="font-weight: 400;">, we use different encoding recipes for the different pieces of content. In a similar way, ML-based super-resolution uses different upsampling recipes for different pieces of content. </span><br />
<span style="font-weight: 400;">The recipes can adapt at both the:</span></p>
<ul>
<li><span style="font-weight: 400;"><strong>Macro-level</strong>: </span><span style="font-weight: 400;">Use different upsampling recipes for different types of content (anime, movie, sports, among others)</span></li>
<li><span style="font-weight: 400;"><strong>Micro-level</strong>: </span><span style="font-weight: 400;">Use different upsampling recipes for different types of frames within the same content (high complexity frame, low complexity frame).</span></li>
</ul>
<figure id="attachment_119264" aria-describedby="caption-attachment-119264" style="width: 947px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119264 size-full" src="https://bitmovin.com/wp-content/uploads/2020/07/9AC9EEE2-5551-4E3A-81DC-159EB204BA1A.jpeg" alt="- Bitmovin" width="947" height="601"><figcaption id="caption-attachment-119264" class="wp-caption-text">The superior performance of ML-based super-resolution comes from the fact that it understands the content at both the macro level and the micro level.</figcaption></figure>
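<p><span style="font-weight: 400;">To see what “the same formula for all content” means in practice, here is a small Python sketch of bilinear upsampling. It is a simplified version (it ignores the half-pixel alignment that production scalers use), and the function names are my own, chosen for this illustration. The point is that the interpolation weights depend only on pixel positions, never on what the picture shows.</span></p>
<pre><code class="language-python">def bilinear_sample(image, y, x):
    """Interpolate `image` (a list of rows) at the fractional position (y, x)."""
    height, width = len(image), len(image[0])
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0
    top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
    bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def upsample_bilinear(image, factor=2):
    """Upsample with the same fixed weights everywhere, whatever the content."""
    height, width = len(image), len(image[0])
    return [
        [bilinear_sample(image, row / factor, col / factor) for col in range(width * factor)]
        for row in range(height * factor)
    ]


edge = [[0, 100], [0, 100]]  # a sharp vertical edge
for row in upsample_bilinear(edge):
    print(row)  # every row is [0.0, 50.0, 100.0, 100.0]
</code></pre>
<p><span style="font-weight: 400;">The sharp edge is smeared into an in-between value of 50.0, because the formula has no notion that an edge is present. A learned upsampler can, in principle, recognise such structures from its training data and keep them crisp.</span></p>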
<hr />
<h2><span style="font-weight: 400;">Why do you need Super-Resolution with Machine-Learning? </span></h2>
<p><span style="font-weight: 400;">Hopefully, by now, you are already excited about the possibilities of this idea. In this section, I would like to provide some suggestions on how you can incorporate this idea into your own video workflows and the potential benefits you might expect from it.  </span></p>
<h3><span style="font-weight: 400;">Quality</span></h3>
<p><span style="font-weight: 400;">Broadly speaking, a video processing workflow typically has three steps involved:</span></p>
<ol>
<li style="font-weight: 400;"><span style="font-weight: 400;">Pre-processing (decoding, upsampling, filtering, among others) </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Encoding</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Post-processing (filtering, muxing, among others) </span></li>
</ol>
<p><span style="font-weight: 400;">Typically, there is a heavy emphasis on the encoding block for visual quality optimizations (</span><a href="https://bitmovin.com/per-title-encoding/"><span style="font-weight: 400;">Per-Title</span></a><span style="font-weight: 400;">, </span><a href="https://bitmovin.com/chunk-based-3-pass-video-encoding-uses-machine-learning-deliver-unrivalled-quality/"><span style="font-weight: 400;">3-Pass</span></a><span style="font-weight: 400;">, </span><a href="https://bitmovin.com/docs/encoding/tutorials/how-to-optimize-your-h264-codec-configuration-for-different-use-cases"><span style="font-weight: 400;">Codec-Configuration</span></a><span style="font-weight: 400;">, among others).</span></p>
<figure id="attachment_119262" aria-describedby="caption-attachment-119262" style="width: 730px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119262" src="https://bitmovin.com/wp-content/uploads/2020/07/199F113B-42B1-46A3-8944-C457CCBE6618_4_5005_c.jpeg" alt="- Bitmovin" width="730" height="221" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/199F113B-42B1-46A3-8944-C457CCBE6618_4_5005_c-300x91.jpeg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/199F113B-42B1-46A3-8944-C457CCBE6618_4_5005_c.jpeg?size=384x116&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/199F113B-42B1-46A3-8944-C457CCBE6618_4_5005_c-768x233.jpeg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/07/199F113B-42B1-46A3-8944-C457CCBE6618_4_5005_c.jpeg?lossy=2&amp;strip=1&amp;webp=1 950w" sizes="(max-width: 730px) 100vw, 730px" /><figcaption id="caption-attachment-119262" class="wp-caption-text">The three major blocks in a video processing workflow. The emphasis is typically on the encoding block, whereas the pre- and post-processing blocks are ignored.</figcaption></figure>
<p><span style="font-weight: 400;">But the other two (often overlooked) blocks are just as important when it comes to visual quality optimization. In this instance, upsampling is a pre-processing step. By choosing the right upsampling method, such as super-resolution, one can improve the visual quality of the entire workflow, sometimes by significantly more than the other blocks could provide.</span><br />
<span style="font-weight: 400;">In Part 3 of this series, we will delve more deeply into this. We will quantify how much quality improvement one could expect from tuning the pre-processing block with super-resolution, using some real-life examples.</span></p>
<hr />
<h3><span style="font-weight: 400;">Synergies with other blocks</span></h3>
<p><span style="font-weight: 400;">(</span><i><span style="font-weight: 400;">This specific section is primarily meant for advanced readers who understand what </span></i><a href="https://bitmovin.com/per-title-encoding/"><i><span style="font-weight: 400;">Per-Title</span></i></a><i><span style="font-weight: 400;">, </span></i><a href="https://github.com/Netflix/vmaf" rel="nofollow noopener" target="_blank"><i><span style="font-weight: 400;">VMAF</span></i></a><i><span style="font-weight: 400;">, and </span></i><a href="https://netflixtechblog.com/per-title-encode-optimization-7e99442b62a2" rel="nofollow noopener" target="_blank"><i><span style="font-weight: 400;">convex-hull </span></i></a><i><span style="font-weight: 400;">mean. Please feel free to skip this section</span></i><span style="font-weight: 400;">).</span><br />
<span style="font-weight: 400;">As explained earlier, there are broadly three blocks in a video workflow. Roughly speaking, they work independently. But if we are smart about the design, we can extract synergies that would otherwise not exist and use them to improve the overall video pipeline.</span><br />
<span style="font-weight: 400;">One illustrative example is how Per-Title can work in conjunction with the Super-Resolution. This idea is depicted in the following figure. </span></p>
<figure id="attachment_119168" aria-describedby="caption-attachment-119168" style="width: 802px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119168" src="https://bitmovin.com/wp-content/uploads/2020/07/Super-Resolution-as-applied-to-per-title-encoding-graphically-illustrated-1-1024x547.png" alt="Super-Resolution as applied to per-title encoding-graphically illustrated" width="802" height="428" /><figcaption id="caption-attachment-119168" class="wp-caption-text"><a href="https://github.com/Netflix/vmaf" rel="nofollow noopener" target="_blank">VMAF</a> vs Bitrate Convex hulls of video content. <br />Green =&gt; 360p, Red =&gt; 720p, Blue =&gt; 1080p.<br />BC : Bicubic, SR : SuperResolution.</figcaption></figure>
<p><span style="font-weight: 400;">The solid line represents the </span><a href="https://netflixtechblog.com/per-title-encode-optimization-7e99442b62a2" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">convex hull</span></a><span style="font-weight: 400;"> when the traditional bicubic upsampling method is used, whereas the dotted line represents the </span><a href="https://netflixtechblog.com/per-title-encode-optimization-7e99442b62a2" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">convex hull</span></a><span style="font-weight: 400;"> with the super-resolution method (as explained in the last section visual quality is improved by using Super-Resolution). </span><br />
<span style="font-weight: 400;">In the above figure, at the illustrated bitrate, the choice is clear when using the traditional method: we pick the 720p rendition. But when using Super-Resolution, the choice is less clear. We could pick one of the following:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">720p Super-Resolution rendition, or</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">360p Super-Resolution rendition, or</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">720p Bicubic rendition.</span></li>
</ul>
<p><span style="font-weight: 400;">The choice is determined by the complexity versus quality tradeoff that we are willing to make.</span><br />
<span style="font-weight: 400;">The takeaway is that two blocks working together synergistically give the Per-Title algorithm more options and flexibility to work with. Overall, a higher number of options translates to better overall results.</span><br />
<span style="font-weight: 400;">This is just one illustrative example, but within your own video workflows, you could identify areas where super-resolution can work synergistically and improve the overall performance. </span></p>
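<p><span style="font-weight: 400;">The complexity versus quality tradeoff can be sketched in a few lines of Python. Everything here is hypothetical: the VMAF scores and complexity values are invented to mirror the shape of the figure, and <code>pick_rendition</code> with its <code>complexity_penalty</code> parameter is an illustrative stand-in, not how Bitmovin's Per-Title algorithm actually works.</span></p>
<pre><code class="language-python">from collections import namedtuple

# Hypothetical candidates at one target bitrate; all numbers are invented.
Rendition = namedtuple("Rendition", "resolution upsampler vmaf complexity")

AT_TARGET_BITRATE = [
    Rendition("360p", "bicubic", 70.0, 1.0),
    Rendition("720p", "bicubic", 78.0, 2.0),
    Rendition("360p", "super-resolution", 80.0, 3.0),
    Rendition("720p", "super-resolution", 84.0, 6.0),
]


def pick_rendition(candidates, complexity_penalty):
    """Return the candidate with the best quality after penalising complexity."""
    return max(candidates, key=lambda r: r.vmaf - complexity_penalty * r.complexity)


bicubic_only = [r for r in AT_TARGET_BITRATE if r.upsampler == "bicubic"]
print(pick_rendition(bicubic_only, 0.0).resolution)  # 720p, the only sensible choice

# With super-resolution in the mix, the answer depends on the penalty we accept.
for penalty in (0.0, 1.5, 3.0):
    best = pick_rendition(AT_TARGET_BITRATE, penalty)
    print(penalty, best.resolution, best.upsampler)
</code></pre>
<p><span style="font-weight: 400;">With these made-up numbers, a penalty of 0.0 selects the 720p Super-Resolution rendition, 1.5 selects the 360p Super-Resolution rendition, and 3.0 falls back to the 720p Bicubic rendition, which are exactly the three options listed above.</span></p>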
<hr />
<h3><span style="font-weight: 400;">Targeted Upsampling </span></h3>
<p><span style="font-weight: 400;">If your entire video catalog is a specific kind of content (anime, for example), and you want to do a targeted upsample of this content, then without a doubt </span><b>ML Super-Resolution is the way to go! </b><br />
<span style="font-weight: 400;">In fact, that is what many companies already</span><a href="https://medium.com/crunchyroll/scaling-up-anime-with-machine-learning-and-smart-real-time-algorithms-2fb706ec56c0" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;"> do.</span></a><span style="font-weight: 400;"> This specific trend will only accelerate in the future, especially considering the popularity of consumer 4K TVs.</span><br />
<i><span style="font-weight: 400;">Visual quality enhancements</span></i><span style="font-weight: 400;">, </span><i><span style="font-weight: 400;">Synergies</span></i><span style="font-weight: 400;">, and </span><i><span style="font-weight: 400;">Targeted upsampling</span></i><span style="font-weight: 400;"> are some ideas on how you can incorporate Super-Resolution into your video workflows. </span></p>
<figure id="attachment_119169" aria-describedby="caption-attachment-119169" style="width: 812px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-119169" src="https://bitmovin.com/wp-content/uploads/2020/07/Super-Resolution-Appied-in-Anime-image-1-1024x600.jpg" alt="Super-Resolution - Applied in Anime - image" width="812" height="476" /><figcaption id="caption-attachment-119169" class="wp-caption-text">Super-Resolution applied to targeted content such as Anime [<a href="https://medium.com/crunchyroll/scaling-up-anime-with-machine-learning-and-smart-real-time-algorithms-2fb706ec56c0" rel="nofollow noopener" target="_blank">Source</a>]</figcaption></figure>
<hr />
<h2><span style="font-weight: 400;">Summary</span></h2>
<p><span style="font-weight: 400;">We continued the story from </span><a href="https://bitmovin.com/super-resolution-machine-learning-p1/"><span style="font-weight: 400;">Part 1</span></a><span style="font-weight: 400;">. We learned that: </span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">The convergence of three factors has led to a resurgence in Machine learning-based Super-Resolution.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">The superior performance of Super-Resolution boils down to the fact that it understands the content it is upsampling.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Super-Resolution can be incorporated into your video workflow in several ways and, consequently, help you improve the end-user experience. </span></li>
</ul>
<p>Did you enjoy this post?<br />
Check out part one of the super-resolution series: <a href="https://bitmovin.com/super-resolution-machine-learning-p1/"><span style="font-weight: 400;">Super-Resolution with Machine Learning P1</span></a><span style="font-weight: 400;"> </span><br />
Check out part three: <a href="https://bitmovin.com/super-resolution-deployments-machine-learning-p3/">Practical Super-Resolution Deployments and Ensuing Results</a></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/super-resolution-machine-learning-p2">Video Tech Deep Dive: Super-Resolution with Machine Learning P2</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
