<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	 xmlns:media="http://search.yahoo.com/mrss/" >

<channel>
	<title>Christian Feldmann &#8211; Bitmovin</title>
	<atom:link href="https://bitmovin.com/author/cfeldmann/feed" rel="self" type="application/rss+xml" />
	<link>https://bitmovin.com</link>
	<description>Bitmovin provides adaptive streaming infrastructure for video publishers and integrators. Fastest cloud encoding and HTML5 Player. Play Video Anywhere.</description>
	<lastBuildDate>Wed, 20 Sep 2023 14:48:33 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://bitmovin.com/wp-content/uploads/2023/11/bitmovin_favicon.svg</url>
	<title>Christian Feldmann &#8211; Bitmovin</title>
	<link>https://bitmovin.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>VVC: Open-GOP Resolution Switching</title>
		<link>https://bitmovin.com/vvc-open-gop-resolution-switching</link>
					<comments>https://bitmovin.com/vvc-open-gop-resolution-switching#respond</comments>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Sat, 09 Sep 2023 00:48:59 +0000</pubDate>
				<category><![CDATA[Innovation]]></category>
		<category><![CDATA[IBC]]></category>
		<category><![CDATA[video encoding]]></category>
		<category><![CDATA[vvc codec]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=267371</guid>

					<description><![CDATA[<p>At IBC 2023 the Fraunhofer HHI, Spin Digital and Bitmovin are presenting a paper on the practical application of a new feature that was introduced in VVC: Open-GOP resolution switching. In this blog post I want to explain what open-GOP (Group of Pictures) prediction is, what the benefits are and why with VVC open-GOP prediction...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/vvc-open-gop-resolution-switching">VVC: Open-GOP Resolution Switching</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>At IBC 2023, <a href="https://www.hhi.fraunhofer.de/" rel="nofollow noopener" target="_blank">Fraunhofer HHI</a>, <a href="https://spin-digital.com/" rel="nofollow noopener" target="_blank">Spin Digital</a> and <a href="https://bitmovin.com/">Bitmovin</a> are presenting a paper on the practical application of a new feature that was introduced in VVC: open-GOP resolution switching. In this blog post I want to explain what open-GOP (Group of Pictures) prediction is, what its benefits are, and why VVC finally makes open-GOP prediction usable in adaptive streaming.</p>



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<div class="wp-block-rank-math-toc-block" id="rank-math-toc"><h2>Table of Contents</h2><nav><ul><li><a href="#closed-gop-prediction-structure">Closed-GOP prediction structure</a></li><li><a href="#open-gop-prediction-structure">Open-GOP prediction structure</a><ul><li><a href="#coding-performance">Coding performance</a></li><li><a href="#coding-performance-across-segment-boundaries">Coding performance across segment boundaries</a></li></ul></li><li><a href="#open-gop-resolution-switching">Open-GOP resolution switching</a></li><li><a href="#ibc-2023">IBC 2023</a></li><li><a href="#related-links">Related Links</a></li></ul></nav></div>



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="closed-gop-prediction-structure">Closed-GOP prediction structure</h2>



<p>Let us first look at a conventional closed-GOP prediction structure. Since nothing has been decoded yet, the first frame in a bitstream is an <em>Instantaneous Decoding Refresh</em> (IDR) frame. When an IDR frame is received, the decoder is instantaneously reset (refreshed) and all frame buffers and other internal buffers are cleared. Since the frame has no dependencies on other frames, it can always be decoded. An IDR frame is also a Random Access (RA) point or keyframe, that is, a point at which decoding can be started. (RA points are marked in orange.)</p>



<p>The following frames are then encoded using predictive (P) coding. This means that they use data from the already decoded frames. This includes pixel data for motion compensation but also motion vectors or prediction modes. But let&#8217;s illustrate this with an example:</p>



<figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="1000" height="281" src="https://bitmovin.com/wp-content/uploads/2023/09/normalPrediction.gif" alt="Example closed-gop prediction structure of video frames with 1 IDR frame" class="wp-image-267373"/></figure>



<p>In this example we are encoding a total of 9 frames. The frames are marked from 0-9 in the order in which they are displayed to the viewer. (The vertical offset of the odd-numbered frames is just for illustration purposes.) However, they are not encoded in display order. Frame 2 uses only frame 0, which has already been decoded, for prediction. Next, frame 1 is decoded; it is displayed between frames 0 and 2 and uses both of them for prediction. This so-called bi-prediction is much more efficient than prediction only from frames in the temporal past and is a key feature that makes modern video codecs so efficient.&nbsp;</p>
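
<p>The difference between display order and decoding order can be sketched in a few lines of Python. This is only an illustration: the dependency table covers just the three frames described above, and the function name is made up for this post rather than taken from any codec API.</p>

```python
# Reference frames of each frame, keyed by display number.
# Only the three frames described in the text are modelled here.
REFERENCES = {
    0: [],      # IDR frame: no references
    2: [0],     # predicted from frame 0 only
    1: [0, 2],  # bi-predicted from the past (0) and the future (2)
}


def is_valid_decode_order(order, references):
    """Return True if every frame is decoded after all of its references."""
    decoded = set()
    for frame in order:
        if any(ref not in decoded for ref in references[frame]):
            return False
        decoded.add(frame)
    return True
```

<p>With this table, the order 0, 2, 1 passes the check, while the display order 0, 1, 2 fails because frame 1 would need frame 2 before it has been decoded.</p>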



<p>Of course, it is impractical to have only one keyframe at the very beginning of a video. We also want to be able to start decoding at frequent points within a bitstream. This allows us to seek in a video as well as to switch between different renditions, as is done in adaptive streaming. So, we can just insert multiple IDR frames in a video:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="1000" height="281" src="https://bitmovin.com/wp-content/uploads/2023/09/predictionWithIDR.gif" alt="Example closed-gop prediction structure of video frames with 2 IDR frames" class="wp-image-267374"/></figure>



<p>In this example, frame 4 is also an IDR frame. We can start decoding at frame 0 as well as at frame 4. Frames 0-3 form a Group of Pictures (GOP) which is completely self-contained and can be decoded completely independently of any other GOPs. The same is true for the following GOP of frames 4-9. As there are no dependencies between these GOPs, this is also referred to as a <em>closed-GOP</em> configuration.&nbsp;</p>



<p>The closed-GOP configuration is widely used in <a href="https://en.wikipedia.org/wiki/Adaptive_bitrate_streaming" rel="nofollow noopener" target="_blank">adaptive bitrate streaming</a> applications where the ubiquitous approach is to split the video into segments of a certain length. Each segment is then encoded using a predetermined set of different resolutions and bitrates called renditions. Since every segment starts with an IDR frame, it is possible to start decoding at each segment which therefore enables seeking. Furthermore, the video player can also freely switch to any of the other renditions at every segment boundary.</p>



<p>Another benefit emerges at the encoder side where each long video is split into small pieces (segments). If these segments can be independently decoded, then they can also be independently encoded. And if we mention “many segments” and “independently encodable” then the next thought is “scalability” and “cloud compute”. And this is exactly the principle that the Bitmovin cloud encoder is based on. We take all these individual encoding tasks and then scale horizontally in the cloud.&nbsp;</p>
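
<p>To make the idea of independently encodable segments a bit more concrete, here is a minimal sketch. It is not how the Bitmovin encoder is implemented: <em>encode_segment</em> is a placeholder for a real encoder call, and a local thread pool stands in for machines in the cloud.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(total_frames, frames_per_segment):
    """Return (first_frame, last_frame) pairs; each segment starts with an IDR frame."""
    return [
        (start, min(start + frames_per_segment, total_frames) - 1)
        for start in range(0, total_frames, frames_per_segment)
    ]


def encode_segment(segment):
    """Placeholder for a real encoder call; it needs no data from other segments."""
    first, last = segment
    return "segment[{}-{}]".format(first, last)


def encode_all(total_frames, frames_per_segment, workers=4):
    """Encode all segments concurrently and return the results in segment order."""
    segments = split_into_segments(total_frames, frames_per_segment)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_segment, segments))
```

<p>Because no segment needs data from its neighbours, the number of workers can be raised freely without changing the result.</p>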



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="open-gop-prediction-structure">Open-GOP prediction structure</h2>



<p>So the opposite of a closed-GOP is an open-GOP configuration. The key difference is that in an open-GOP prediction structure predictions between the GOPs are allowed. Let’s again look at an example:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="1000" height="281" src="https://bitmovin.com/wp-content/uploads/2023/09/predictionWithCRA.gif" alt="Example vvc open-gop prediction structure of video frames with clean random access (CRA) frame" class="wp-image-267375"/></figure>



<p>So, the frames 0, 1 and 2 are decoded in a hierarchical fashion as before. But then something different happens. The next frame in decoding order is frame 4 which is a random access point (RA). However, it is not an IDR but a <em>Clean Random Access </em>(CRA) point. While a CRA can also be decoded independently of any other frame, it does not reset the decoder as an IDR does and the reference picture buffer is not cleared. Next, we have frame 3 in coding order. As before, this frame uses frames 2 and 4 as reference. The rest of the frames are coded as before.</p>



<p>As in the closed-GOP example, we can start the decoding process at frame 4 because it does not use any other frame as a reference. But this time, the decoder cannot be reset when this frame is received: the following frame (frame 3) uses a previously decoded frame as a reference, which must therefore remain in the picture buffer. The process of starting decoding from the second GOP is therefore a bit more complex:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="1000" height="281" src="https://bitmovin.com/wp-content/uploads/2023/09/open_gop_rasl_CRA.gif" alt="Example vvc open-gop prediction structure of video frames with Random Access Skipped Leading (RASL) clean random access (CRA) frame" class="wp-image-267376"/></figure>



<p>Frame 4 is a <em>Clean Random Access</em> (CRA) point, so decoding can be started with this frame. For the next frame in coding order (frame 3) we now have an issue. Since one of its references (frame 2) has not been decoded, frame 3 cannot be decoded. If we start decoding at the Random Access (RA) point of frame 4, the decoding of the leading picture 3 must be skipped. Consequently, the frame type is <em>Random Access Skipped Leading</em> (RASL). The remaining frames can be decoded as before.&nbsp;</p>
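
<p>The skipping of leading pictures can be expressed as a small simulation. This is a sketch under assumptions: the references of frames 0 to 4 follow the description above, while frames 6 and 5 are an assumed continuation that simply mirrors the 0, 2, 1 pattern. The function name is hypothetical.</p>

```python
# Coding order and references for the open-GOP example, by display number.
# Frames 0-4 follow the text; frames 6 and 5 are an assumed continuation.
CODING_ORDER = [0, 2, 1, 4, 3, 6, 5]
REFERENCES = {0: [], 2: [0], 1: [0, 2], 4: [], 3: [2, 4], 6: [4], 5: [4, 6]}


def decode_from(start_frame, coding_order, references):
    """Start decoding at a random access point and report what happens.

    Returns (decoded, skipped): frames that could be decoded, and frames
    that had to be skipped because a reference was never decoded.
    """
    decoded, skipped = [], []
    start_index = coding_order.index(start_frame)
    for frame in coding_order[start_index:]:
        if all(ref in decoded for ref in references[frame]):
            decoded.append(frame)
        else:
            skipped.append(frame)
    return decoded, skipped
```

<p>Starting at frame 0 decodes everything. Starting at the CRA (frame 4) decodes frames 4, 6 and 5 and skips frame 3, which is exactly the RASL behaviour described above.</p>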



<p>So, what are the advantages of an open-GOP configuration? So far, we just observed that decoding is more complicated. Moreover, it is now also impossible to switch to a different rendition at the CRA because we have not decoded the reference frames that are needed for the leading frames of the open-GOP, and we will have to skip decoding them. But there are two substantial advantages:</p>



<div style="height:12px" aria-hidden="true" class="wp-block-spacer"></div>



<h3 class="wp-block-heading" id="coding-performance">Coding performance</h3>



<p>As I already mentioned, bidirectional prediction into the temporal past and future is one of the key features that make modern video codecs so efficient. Generally, the more past and future reference frames a frame can use for prediction, the higher the compression efficiency. Frames that do not use any other frames as reference (RA frames like IDR and CRA frames) typically have the worst compression efficiency.</p>



<p>While we cannot avoid having regular random access points in the bitstream for seeking, we can increase the coding efficiency of the leading pictures significantly in the open-GOP configuration. This leads to a significant reduction in overall bitrate at the same quality. In the experiments from the HHI, an overall BD-rate reduction of up to 9% could be observed. Of course, these results depend on many factors like the general coding structure, the resolution and bitrate as well as the content itself.</p>



<div style="height:12px" aria-hidden="true" class="wp-block-spacer"></div>



<h3 class="wp-block-heading" id="coding-performance-across-segment-boundaries">Coding performance across segment boundaries</h3>



<p>In a closed-GOP configuration, the decoder must be reset with every IDR frame. An unwanted side effect of this is that the quality as well as the visual representation of a scene changes very abruptly at this point. Especially at lower bitrates, this can be perceived as a sudden jump or pumping in the video. Things that are generally hard to encode like water, clouds and trees are particularly susceptible to this effect.&nbsp;</p>



<p>In this example, the difference between closed-GOP on the left and open-GOP on the right is nicely visible. The pumping is especially notable in the exhaust clouds of the rocket launch in the first scene and in the trees in the background in the second scene. In the open-GOP configuration this effect is hardly visible.&nbsp;</p>



<div class="bitmovin-stream-wrapper"><iframe src="https://streams.bitmovin.com/cjtpj60mi6k28e8kfk10/embed" title="bitmovin-streams" allow="fullscreen"></iframe></div>



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="open-gop-resolution-switching">Open-GOP resolution switching</h2>



<p>I mentioned before that switching to a different rendition is only possible at IDR frames in a closed-GOP configuration. We also saw that if we start decoding a CRA frame with RASL frames, we must skip decoding of the leading frames. Obviously, we don’t want to skip decoding frames whenever the player switches to a different resolution. This would be a horrible experience for the viewer.</p>



<p>Fortunately, VVC has a trick up its sleeve for exactly this scenario. In the example above we noted that decoding of the RASL frame (frame 3) is not possible because it uses frame 2 as a reference, which has not been decoded when switching renditions. But what has been decoded is a different version of frame 2 from a different rendition. While this frame may have been decoded at a different quality or even at a different spatial resolution, it is a representation of the exact same frame. So, with a bit of high-level syntax, the VVC decoder can use this frame from another rendition as a reference frame for decoding frame 3. Even if the frame uses a different resolution, the decoder has a standardized set of up/down scaling filters. Let&#8217;s look at this:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="1000" height="563" src="https://bitmovin.com/wp-content/uploads/2023/09/predictionWithCRADecodeFromCRA.gif" alt="Example of VVC open-gop switching and decoding from Clean Random Access frame" class="wp-image-267377"/></figure>



<p>In this example we are decoding frames 0 to 2 from a rendition at a lower resolution. Then the player decides to switch to a rendition with a higher resolution and bitrate. Decoding of the CRA (frame 4) is no problem since RA frames can be decoded independently of other frames. For frame 3, the decoder will now upscale frame 2 from the lower rendition and use this frame as a reference instead of the unavailable frame from the higher rendition. Decoding of the remaining frames is unchanged.</p>
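
<p>The reference handling during a switch can be sketched as follows. Note the assumptions: pictures are plain 2D lists of samples, the picture buffer is a dictionary keyed by frame number, and nearest-neighbour scaling is used purely as a stand-in. VVC specifies its own resampling filters, which are not reproduced here.</p>

```python
def resample_nearest(picture, width, height):
    """Scale a 2D list of samples to width x height by nearest-neighbour lookup.

    A stand-in only: VVC specifies its own resampling filters.
    """
    src_height, src_width = len(picture), len(picture[0])
    return [
        [picture[y * src_height // height][x * src_width // width] for x in range(width)]
        for y in range(height)
    ]


def fetch_reference(picture_buffer, frame, width, height):
    """Return the reference picture for `frame` at the current rendition's resolution."""
    picture = picture_buffer[frame]
    if len(picture) == height and len(picture[0]) == width:
        return picture  # same resolution: use the picture as it is
    return resample_nearest(picture, width, height)
```

<p>When frame 3 of the higher rendition asks for frame 2, the buffer only holds the low-resolution version, so <em>fetch_reference</em> returns an upscaled copy instead of failing.</p>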



<p>As mentioned before, the open-GOP prediction structure significantly reduces quality pumping effects. But there is another advantage. When switching to a higher or lower rendition in a closed-GOP configuration, there is a visible jump of the quality of the video. Of course, the bigger the jump is, the more pronounced the visible quality jump becomes. However, in open-GOP resolution switching, the intermediate leading frames that are using references from both renditions act as a sort of “quality interpolation” between the renditions which results in a much smoother transition between the renditions.&nbsp;</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="1600" height="565" src="https://bitmovin.com/wp-content/uploads/2023/09/open_gop_bitrates.jpg" alt="vvc - Bitmovin" class="wp-image-267605" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates-300x106.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates.jpg?size=384x136&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates-768x271.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates.jpg?size=1152x407&amp;lossy=2&amp;strip=1&amp;webp=1 1152w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates-1536x542.png?lossy=2&amp;strip=1&amp;webp=1 1536w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2023/09/open_gop_bitrates.jpg?lossy=2&amp;strip=1&amp;webp=1 1600w" sizes="(max-width: 1600px) 100vw, 1600px" /></figure>



<p>Here, we can see an example of the PSNR for the resolution switching behavior. We have 3 renditions of 1920&#215;800 (gray), 1280&#215;534 (red) and 640&#215;268 (blue). In conventional adaptive streaming implementations with closed-GOPs, a rendition switch would result in an abrupt jump in quality at the switching points. The yellow graph shows how the quality has a much smoother transition between the renditions when using open-GOP resolution switching in VVC.</p>
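
<p>For readers who have not worked with PSNR before, this is the usual definition as a short Python function. It is a generic sketch operating on flat lists of samples and assumes 8-bit video (peak value 255); it is not the measurement tool used for the graph.</p>

```python
import math


def psnr(reference, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

<p>A higher value means the decoded frame is closer to the original, so an abrupt step in the graph corresponds to a sudden change in distortion.</p>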



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="ibc-2023">IBC 2023</h2>



<p>At the IBC, we are presenting a technical paper about practical implementations and considerations when implementing open-GOP resolution switching with VVC in real world environments. This is a joint effort of <a href="https://www.hhi.fraunhofer.de/" rel="nofollow noopener" target="_blank">Fraunhofer HHI</a>, <a href="https://spin-digital.com/" rel="nofollow noopener" target="_blank">Spin Digital</a> and <a href="https://bitmovin.com/">Bitmovin</a>. Please join us at the IBC in the “Advances in video coding and processing” session on Sep 16th starting at 14:15 in room E102. Here we will present what technical challenges arise when deploying this feature for low latency live transcoding as well as in the highly scalable Bitmovin cloud encoder.</p>



<div style="height:20px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="related-links">Related Links</h2>



<p><a href="https://show.ibc.org/" rel="nofollow noopener" target="_blank">IBC 2023 Website</a></p>



<p><a href="https://bitmovin.com/vvc-video-codec/">VVC codec background</a></p>



<p><a href="https://bitmovin.com/vvc-benefit-supported-devices-bitmovin-implementation/">VVC benefits and supported devices</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://bitmovin.com/vvc-open-gop-resolution-switching/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>VVC Video Codec &#8211; The Next Generation Codec</title>
		<link>https://bitmovin.com/vvc-video-codec</link>
					<comments>https://bitmovin.com/vvc-video-codec#comments</comments>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Thu, 03 Feb 2022 17:00:00 +0000</pubDate>
				<category><![CDATA[VidTech]]></category>
		<category><![CDATA[vvc codec]]></category>
		<guid isPermaLink="false">http://bitmovin.com/?p=24855</guid>

					<description><![CDATA[<p>State of VVC Video Codec So it’s happening. After their previous work on h.264/AVC and h.265/HEVC, the ITU-T Video Coding Experts Group (VCEG) and ISO Moving Picture Experts Group (MPEG) have again joined forces to create another video codec named Versatile Video Coding (VVC). The first goal for the VVC Video Codec was to significantly...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/vvc-video-codec">VVC Video Codec &#8211; The Next Generation Codec</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>State of VVC Video Codec</h2>
<p><span style="font-weight: 400;">So it’s happening. After their previous work on H.264/AVC and H.265/HEVC, the ITU-T Video Coding Experts Group (VCEG) and ISO Moving Picture Experts Group (MPEG) have again joined forces to create another video codec named <a href="https://bitmovin.com/compression-standards-vvc-2020/" target="_blank" rel="noopener">Versatile Video Coding (VVC)</a>.</span><br />
<span style="font-weight: 400;">The first goal for the VVC Video Codec was to significantly reduce bitrate expenditure while maintaining the same visual quality compared to HEVC. When considering PSNR as a quality metric, the reference VVC encoder outperforms the reference HEVC encoder <a href="https://jvet-experts.org/doc_end_user/current_document.php?id=11360" target="_blank" rel="noopener nofollow">by about 40%</a> in BD-rate</span><span style="font-weight: 400;">. However, some subjective tests were also performed which demonstrated overall bit-savings of <a href="https://jvet-experts.org/doc_end_user/current_document.php?id=10550" target="_blank" rel="noopener nofollow">closer to 50%</a>. The second goal in the standardization was versatility. On this front, VVC facilitates coding and transport for a wide range of applications and content types such as conventional video streaming, optimizations for screen content, 360-degree video, as well as live and ultra-low delay applications.</span></p>
<h2><span style="font-weight: 400;">Evolution or Revolution?</span></h2>
<p><span style="font-weight: 400;">Since the development of VVC was started from the basis of HEVC, the first question to ask about VVC is: Is it an evolution based on the technologies that were used in the former coding standards, or is it a really new and revolutionary way of compressing video? </span><br />
<span style="font-weight: 400;">Answer: It’s more or less an evolution of the basic building blocks that were already used in HEVC and various other codecs before.</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">It is still a hybrid, block-based video coding standard</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">Most technologies are based on HEVC and are further refined and improved</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">But there are also a lot of new coding tools which have not been seen in the context of video coding</span></li>
</ul>
<p><span style="font-weight: 400;">But how were these bitrate savings achieved? Like other video coding standards (e.g. AVC, HEVC, or AV1), VVC is based on the hybrid block-based video coding approach with conventional intra and inter prediction. As with former evolutions of video coding standards, the gains cannot be attributed to one single technique that was added in VVC but to smaller improvements in all the building blocks of the coding scheme. So here is a (not even close to complete) list of advancements in VVC:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">The maximum block size of Coding Tree Units (CTU) that can be processed was increased. The maximum block size is now 128&#215;128 pixels. Also, the maximum block sizes for intra prediction and transformations were increased. This is particularly beneficial as the resolution of content that is encoded also is increasing. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">After the initial split, each CTU is further split into Coding Units. This splitting algorithm is now much more flexible and allows for more different block sizes both square and non-square.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">The number of directions that can be used in directional intra prediction was further increased. Intra prediction can now also be performed for non-square blocks. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Many aspects of motion compensated prediction (inter prediction) were improved as well like better motion vector prediction, decoder side motion vector refinement, and overlapped block motion compensation (OBMC).</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">More different types of transformation are available by using a separable transform combining Discrete Cosine and Sine Transform.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">The in-loop filters were improved as well with the addition of a new filter &#8211; Adaptive Loop Filter &#8211; for which the encoder can signal optimal parameters on a CTU basis.</span></li>
</ul>
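<p><span style="font-weight: 400;">To get a feeling for what the larger CTU size means at high resolutions, the number of CTUs per picture can be counted with a one-line helper. This is a small illustrative sketch; the function name is made up, and the comparison value of 64&#215;64 is the maximum CTU size of HEVC.</span></p>

```python
import math


def ctu_count(width, height, ctu_size):
    """Number of CTUs needed to cover a picture; partial CTUs at the border count fully."""
    return math.ceil(width / ctu_size) * math.ceil(height / ctu_size)
```

<p><span style="font-weight: 400;">For a 3840&#215;2160 picture this gives 510 CTUs at 128&#215;128, compared with 2040 CTUs at 64&#215;64.</span></p>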
<p><span style="font-weight: 400;">As I already mentioned this list is far from complete and there are many many more new and adapted technologies that make VVC so efficient. If you want to learn more then there is plenty of material out there. But a good place to start is <a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377" target="_blank" rel="noopener nofollow">this very detailed overview paper</a>.</span></p>
<h2><span style="font-weight: 400;">What’s new in VVC?</span></h2>
<p><span style="font-weight: 400;">So while the standard was finalized in late 2020, there is no widespread use of VVC in the market yet. But this was also not to be expected after such a short time. History has shown that adoption time for new video codecs is long and usually follows this pattern: First, devices must support playback of a new standard. While software decoding on devices with high compute capabilities and no restrictions on power usage can be implemented quickly, most consumption of media is performed on devices that rely on hardware decoders. And while it takes some time to develop and deploy new decoding hardware, it takes even longer until these new devices reach a critical mass of deployment “in the wild”. At the same time, encoding solutions must be developed, tuned, and deployed. And finally, there is the question of royalties that must be paid. And only when all of these issues are resolved will it make sense to deploy actual video streaming using the new video codec. </span><br />
<span style="font-weight: 400;">We are now seeing the first practical implementations of VVC. There are some software-based encoder and decoder solutions and the first vendors have released hardware decoders in their <a href="https://jvet-experts.org/doc_end_user/current_document.php?id=11244" target="_blank" rel="nofollow noopener">System on a Chip (SoC) devices</a>. Furthermore, some vendors (like Bitmovin) have deployed the VVC video codec as an <a href="https://bitmovin.com/press-room/bitmovin-enables-innovation-with-new-vvc-codec-feature">option in their transcoding-as-a-service product</a>. On the patent side, there are a few players moving the codec into production, with <a href="https://www.tvtechnology.com/news/mpeg-la-introduces-patent-pool-license-for-vvc-compression" target="_blank" rel="nofollow noopener">MPEG LA introducing the first license in early 2021</a>. While some patent pools are forming, it is still very much unknown how much the usage of VVC will cost. And in the face of the falling cost of bandwidth, this is the biggest problem of VVC. To put it simply: If the price is not worth the bitrate savings, it will not be a thing. </span><br />
<span style="font-weight: 400;">So as I mentioned, Bitmovin already deployed VVC video encoding in its cloud-based transcoding solution. For this, we teamed up with the Fraunhofer Heinrich-Hertz-Institut (HHI) to integrate their <a href="https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h266-vvc/fraunhofer-versatile-video-encoder-vvenc.htm" target="_blank" rel="nofollow noopener">software-based encoder VVenC</a>, which is an open source VVC encoder and is freely <a href="https://github.com/fraunhoferhhi/vvenc" target="_blank" rel="noopener nofollow">available on GitHub</a>. While this is working great, there was no easy way to create a VVC video and watch it. This is where the vvDecPlayer project comes in.</span></p>
<h3>Introducing the VVC Video Player: BitvvDecPlayer</h3>
<p><span style="font-weight: 400;">What I wanted to create was a simple demonstration player that was able to stream and decode a VVC video stream in real-time. All parts of the decoder are based on other open-source projects. Many of the rendering routines were copied from the <a href="https://github.com/IENT/YUView" target="_blank" rel="noopener nofollow">YUView player</a> (another project of mine) which in turn is using the Qt framework.</span><br />
<span style="font-weight: 400;">When opening a playlist in the vvDecPlayer, four threads are launched that build up the decoding pipeline:</span></p>
<ul>
<li aria-level="1"><b>Download: </b><span style="font-weight: 400;">This thread performs the download of the VVC video segments using HTTP. It has an internal buffer of 5 segments that it will try to keep full. The actual download was implemented using the Qt network module.</span></li>
</ul>
<ul>
<li aria-level="1"><b>Bitstream parsing: </b><span style="font-weight: 400;">After the download is done, we perform parsing of the high-level syntax of the bitstream. This gives us information about the segment like resolution and more importantly the number of frames in the segment.</span></li>
</ul>
<ul>
<li aria-level="1"><b>Decode: </b><span style="font-weight: 400;">The decode thread decodes the compressed bitstream into raw YUV frames. We are using the Fraunhofer <a href="https://github.com/fraunhoferhhi/vvdec" target="_blank" rel="noopener nofollow">VVdeC</a> software decoder here. The decoder is quite fast and is able to do real-time decoding of UHD content, provided that enough CPU power is available. The decoded frames are all stored in temporary buffers.</span></li>
</ul>
<ul>
<li aria-level="1"><b>Conversion: </b><span style="font-weight: 400;">While the decoded frames are in the YUV domain, we require RGB pixel data for display. This is done with a native C++ function that was copied from YUView.</span></li>
</ul>
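<p><span style="font-weight: 400;">As an illustration of the conversion step, here is a per-sample YUV to RGB conversion in Python. This is a sketch under assumptions and not the routine from YUView: it assumes 8-bit full-range samples and BT.709 coefficients, and a real player would process whole planes in native code.</span></p>

```python
def _clip(value):
    """Round and clamp a sample to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one full-range 8-bit YUV sample to RGB using BT.709 coefficients."""
    cb, cr = u - 128, v - 128
    r = y + 1.5748 * cr
    g = y - 0.1873 * cb - 0.4681 * cr
    b = y + 1.8556 * cb
    return _clip(r), _clip(g), _clip(b)
```

<p><span style="font-weight: 400;">Neutral chroma (U = V = 128) maps every luma value to the same grey level in all three channels, which is a quick sanity check for any implementation.</span></p>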
<p><span style="font-weight: 400;">Finally, there is a timer running in the main thread that is trying to update the screen ‘FPS’ times per second by drawing the next converted RGB pixel buffer to the screen.</span></p>
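<p><span style="font-weight: 400;">The player itself is written in C++ with Qt, but the shape of the pipeline described above can be sketched in Python. Everything here is a simplified stand-in: the four work functions are placeholders that only wrap a string, and bounding every queue at 5 entries is an assumption borrowed from the download buffer size.</span></p>

```python
import queue
import threading

SEGMENT_BUFFER_SIZE = 5  # the download stage keeps up to 5 segments buffered
_END = object()          # sentinel that tells the next stage to stop


def _stage(work, inbox, outbox):
    """Apply `work` to every item from `inbox` and pass the result to `outbox`."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)
            return
        outbox.put(work(item))


def run_pipeline(segment_urls):
    """Push segments through download, parse, decode and convert threads."""
    # Placeholder work functions; a real player would fetch bytes over HTTP,
    # parse the bitstream, call the decoder and convert YUV to RGB here.
    steps = [
        lambda url: "downloaded({})".format(url),
        lambda seg: "parsed({})".format(seg),
        lambda seg: "decoded({})".format(seg),
        lambda seg: "rgb({})".format(seg),
    ]
    queues = [queue.Queue(maxsize=SEGMENT_BUFFER_SIZE) for _ in range(len(steps) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(work, queues[i], queues[i + 1]))
        for i, work in enumerate(steps)
    ]
    for thread in threads:
        thread.start()

    def feed():
        # Blocks whenever the first queue is full, like a bounded download buffer.
        for url in segment_urls:
            queues[0].put(url)
        queues[0].put(_END)

    feeder = threading.Thread(target=feed)
    feeder.start()

    results = []
    while True:
        item = queues[-1].get()  # the display timer would pull frames from here
        if item is _END:
            break
        results.append(item)
    feeder.join()
    for thread in threads:
        thread.join()
    return results
```

<p><span style="font-weight: 400;">The bounded queues provide back-pressure: if one stage falls behind, the stages before it block instead of filling memory, and the segment order is preserved because each stage runs in a single thread.</span></p>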
<p><figure id="attachment_215769" aria-describedby="caption-attachment-215769" style="width: 512px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-215769 size-full" src="https://bitmovin.com/wp-content/uploads/2018/12/BitvvDecPlayer_VVC-Player-Performance_Screenshot-2.jpg" alt="BitvvDecPlayer_VVC Player Performance_Screenshot" width="512" height="347" /><figcaption id="caption-attachment-215769" class="wp-caption-text">VVC Video Player in Action</figcaption></figure></p>
<p><span style="font-weight: 400;">The screenshot shows the player in action. On the top left, the available renditions are shown. The currently selected rendition is marked with an arrow and the currently visible rendition is highlighted in green. Use the up/down arrows to switch renditions. On the top right are the fps counter and the status of the threads. The thread status display can be enabled in the menu or by pressing Ctrl+D. On the bottom is the progress graph (Ctrl+P). This is showing many values from the decoder pipeline. The dark cyan blocks indicate the compressed data of each segment where the height corresponds to the bitrate of each segment. Within each block, there is one bar that indicates the bitrate per frame. On the bottom, the status of each frame is shown: Downloaded (grey), Decoded (blue), converted to RGB, and ready for display (green). Playback can also be paused using the space bar and playback can be switched to full-screen view by double-clicking the video.</span><br />
<span style="font-weight: 400;">If you want to give it a try <a href="https://github.com/bitmovin/vvDecPlayer" target="_blank" rel="nofollow noopener">check out the project on Github</a></span><span style="font-weight: 400;">. The only prerequisites you need are a compiler, CMake, and the Qt libraries. How to build really depends on the platform that you are compiling for. But it goes something like this:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Get Qt, either from the <a href="https://www.qt.io/" target="_blank" rel="noopener nofollow">Qt page</a> or, on Linux, most likely through your distro’s package manager. On macOS, Homebrew is a good option.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Check out the source code of the player and create a new directory to build in (e.g. ‘build’). Go into the directory and call qmake: ‘qmake ../’. After that, start the compilation. On Linux/macOS this is probably ‘make’, while on Windows it is likely ‘nmake’.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Next, you also need the VVdeC decoder library. Compiling that is also easy. Get the sources from the <a href="https://github.com/fraunhoferhhi/vvdec" target="_blank" rel="noopener nofollow">GitHub repo</a> and create a build directory. In there, call ‘cmake -DBUILD_SHARED_LIBS=1 ..’ and then ‘cmake --build . --config Release’. This should build the shared decoder library in the ‘bin’ or the ‘lib’ folder.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Lastly, you can start the player. Go to ‘Settings-&gt;Select VVdeC library’ and browse to the shared VVdeC library you just built.</span></li>
</ul>
<p><span style="font-weight: 400;">Now everything should be ready to stream some videos. The player comes with some sample streams provided by us and our encoder. Just select a sample from ‘File-&gt;Bitmovin Streams’. If you encounter a bug, please feel free to open a bug report on GitHub</span><span style="font-weight: 400;">. To learn more about the BitvvDecPlayer, view <a href="https://webinars.bitmovin.com/bitmovin/State-of-VVC-H-266-ft-Fraunhofer-HHI-d7adbaedf42789cf74423455" target="_blank" rel="nofollow noopener">our webinar with Fraunhofer HHI.</a></span></p>
<h2><span style="font-weight: 400;">Conclusion</span></h2>
<p><span style="font-weight: 400;">There is no doubt that VVC has some exciting potential and is already showing interesting results. Two years is a long time, and there are many opportunities to further increase the coding performance and lower the encoding complexity, as well as to perfect some of the new tools that have already been adopted into the new standard. It will be exciting to check in again in 6 to 12 months to see how the implementations are developing.</span><br />
If you would like to learn more about how the VVC video codec works, check out our introductory blog: <a href="https://bitmovin.com/compression-standards-vvc-2020/">What is VVC and how does it work</a><br />
Open projects like these are one of the benefits of working as an engineer at Bitmovin. <a href="https://bitmovin.com/careers/?utm_source=bmblog&amp;utm_medium=web">Come join our team to work on your own exciting standard-setting projects</a>.</p>
<h2>Video technology guides and articles</h2>
<ul>
<li>Back to Basics: Guide to the <a href="https://bitmovin.com/html5-video-tag-guide/">HTML5 Video Tag </a></li>
<li><a href="https://bitmovin.com/vod-platforms/">What is a VoD Platform?</a> A comprehensive guide to Video on Demand (VOD)</li>
<li><a href="https://bitmovin.com/top-5-video-technology-trends/">Video Technology [2022]</a>: Top 5 video technology trends</li>
<li><a href="https://bitmovin.com/vp9-vs-hevc-h265/">HEVC vs VP9</a>: Modern codecs comparison</li>
<li>What is the <a href="https://bitmovin.com/av1/">AV1 Codec</a>?</li>
<li>Video Compression: <a href="https://bitmovin.com/encoding-definition-bitrates/">Encoding Definition and Adaptive Bitrate</a></li>
<li>What is <a href="https://bitmovin.com/adaptive-streaming/">adaptive bitrate streaming</a></li>
<li><a href="https://bitmovin.com/mkv-vs-mp4/">MP4 vs MKV</a>: Battle of the Video Formats</li>
<li><a href="https://bitmovin.com/video-streaming-models-svod-avod-tvod/">AVOD vs SVOD</a>; the “fall” of SVOD and Rise of AVOD &amp; TVOD (Video Tech Trends)</li>
<li><a href="https://bitmovin.com/dynamic-adaptive-streaming-http-mpeg-dash/">MPEG-DASH</a> (Dynamic Adaptive Streaming over HTTP)</li>
<li><a href="https://bitmovin.com/container-formats-fun-1/">Container Formats</a>: The 4 most common container formats and why they matter to you.</li>
<li><a href="https://bitmovin.com/qoe-why-quality-video-matters/">Quality of Experience</a> (QoE) in Video Technology [2022 Guide]</li>
</ul>
<p>The post <a rel="nofollow" href="https://bitmovin.com/vvc-video-codec">VVC Video Codec &#8211; The Next Generation Codec</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://bitmovin.com/vvc-video-codec/feed</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>HEVC vs VP9: The Battle of the Video Codecs</title>
		<link>https://bitmovin.com/vp9-vs-hevc-h265</link>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Wed, 05 Aug 2020 13:14:07 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[hevc]]></category>
		<category><![CDATA[video encoding]]></category>
		<category><![CDATA[VP9]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=122232</guid>

					<description><![CDATA[<p>For an APAC live event, our video coding engineer Christian Feldmann compared the HEVC (H.265) vs VP9. During the session, we discussed the fundamental differences between the two “modern codecs” and tied it off with an early analysis of each codec’s performance. These results were obtained using the open-source encoders libvpx-vp9, x264, and x265. This...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/vp9-vs-hevc-h265">HEVC vs VP9: The Battle of the Video Codecs</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="aligncenter size-large wp-image-122249" src="https://bitmovin.com/wp-content/uploads/2020/08/Blog-Post-VP9vsHEVC-social-1-1024x512.png" alt="vvc - Bitmovin" width="1024" height="512"><br />
<span style="font-weight: 400;">For an APAC live event, our video coding engineer Christian Feldmann compared HEVC (H.265) with </span><span style="font-weight: 400;">VP9</span><span style="font-weight: 400;">. </span><br />
<span style="font-weight: 400;">During the session, we discussed the fundamental differences between the two “modern codecs” and tied it off with an early analysis of each codec’s performance. </span><br />
<span style="font-weight: 400;">These results were obtained using the open-source encoders libvpx-vp9, x264, and x265. </span><br />
<span style="font-weight: 400;">This article delves into that experiment and shares the results of Christian&#8217;s research.</span></p>
<h2><span style="font-weight: 400;">VP9 vs HEVC: The encoding setup</span></h2>
<h3><span style="font-weight: 400;">Software</span></h3>
<p><span style="font-weight: 400;">For the test I used the following: </span></p>
<ul>
<li><span style="font-weight: 400;">libvpx-vp9 encoder (version 1.8.2) for VP9 encoding</span></li>
<li><span style="font-weight: 400;">x264 encoder (tag 235ce6130168f4deee55c88ecda5ab84d81d125b) for H.264/AVC encoding</span></li>
<li><span style="font-weight: 400;">x265 encoder (version 3.2) for H.265/HEVC encoding. </span></li>
</ul>
<p><span style="font-weight: 400;">I also compiled libvmaf (version 1.5.1) and ffmpeg (version 4.2.3) to run the encoders and perform PSNR, SSIM and VMAF measurements. </span><br />
<span style="font-weight: 400;">I built the execution environment with Docker, so you can recreate it exactly using my Dockerfile, which can be found </span><a href="https://drive.google.com/file/d/1wbUA56vB-LeH2H8nV-EGzhJPWW7ikkwx/view" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">here</span></a><span style="font-weight: 400;">.</span></p>
<h3><span style="font-weight: 400;">Test set</span></h3>
<p><span style="font-weight: 400;">For the test set I used Full HD and 4K sequences from the JVET SDR test set [1], which was also used in the standardization of </span><a href="https://en.wikipedia.org/wiki/Versatile_Video_Coding" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">VVC</span></a><span style="font-weight: 400;">. </span><br />
<span style="font-weight: 400;">Some of these sequences are well known and were already used in several prior standardization activities. All sequences are 10 seconds long and used in YUV 4:2:0 subsampling. </span><br />
<span style="font-weight: 400;">The sequences are as follows:</span><br />
&nbsp;<br />
<figure id="attachment_122243" aria-describedby="caption-attachment-122243" style="width: 360px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-122243" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-05-at-14.46.28-280x300.png" alt="VP9 vs HEVC Encoding test set sequence table" width="360" height="385" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-05-at-14.46.28-280x300.png?lossy=2&amp;strip=1&amp;webp=1 280w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/08/Screenshot-2020-08-05-at-14.46.28.png?lossy=2&amp;strip=1&amp;webp=1 744w" sizes="(max-width: 360px) 100vw, 360px" /><figcaption id="caption-attachment-122243" class="wp-caption-text">JVET SDR HD &amp; 4K Sequences</figcaption></figure></p>
<h3><span style="font-weight: 400;">Encoding</span></h3>
<p><span style="font-weight: 400;">For <a href="https://bitmovin.com/encoding-definition-bitrates/">encoding</a>, I used default settings with ffmpeg. All encodes used 2-pass encoding with a set target bitrate. The corresponding ffmpeg calls look like this, where ‘br’ stands for the target bitrate and ‘1/2’ for the first and second pass: </span></p>
<pre><span style="font-weight: 400;">ffmpeg -i input.yuv -c:v libx264 -preset veryslow -b:v br -pass 1/2 enc.mp4</span>
<span style="font-weight: 400;">ffmpeg -i input.yuv -c:v libx265 -preset slow -b:v br -pass 1/2 enc.mp4</span>
<span style="font-weight: 400;">ffmpeg -i input.yuv -c:v libvpx-vp9 -b:v br -pass 1/2 enc.mp4
</span></pre>
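<p>The calls above are shorthand for two separate invocations per encode. As a minimal sketch, the two passes could be spelled out as follows. The raw-video input options (pixel format, frame size, frame rate), the discarded first-pass output and the helper name ‘two_pass_commands’ are assumptions for illustration and are not taken from the original test scripts. The example uses libx264 only.</p>

```python
import shlex

def two_pass_commands(codec, bitrate_kbit, preset=None,
                      src="input.yuv", size="1920x1080", fps=60,
                      out="enc.mp4"):
    """Return the pass-1 and pass-2 ffmpeg argument lists for one encode."""
    # Raw YUV files carry no header, so format, pixel layout, frame size
    # and frame rate are given explicitly (placeholder values).
    common = ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "yuv420p",
              "-s", size, "-r", str(fps), "-i", src,
              "-c:v", codec, "-b:v", f"{bitrate_kbit}k"]
    if preset is not None:
        common += ["-preset", preset]
    # Pass 1 only gathers rate-control statistics, so its video output
    # is thrown away; pass 2 writes the actual file.
    first = common + ["-pass", "1", "-an", "-f", "null", "/dev/null"]
    second = common + ["-pass", "2", out]
    return first, second

first, second = two_pass_commands("libx264", 2400, preset="veryslow")
print(shlex.join(first))
print(shlex.join(second))
```

<p>The function only builds the argument lists; they can be handed to ‘subprocess.run’ one after the other once the placeholder values match the actual source file.</p>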
<h3><span style="font-weight: 400;">Presets</span></h3>
<p><span style="font-weight: 400;">I used the following presets for each encoder:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">x264 &#8211; <i>veryslow</i></span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">x265 &#8211; </span><i><span style="font-weight: 400;">slow</span></i><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">libvpx-vp9 &#8211; no preset was chosen (which corresponds to a cpu-used value of 1)</span></li>
</ul>
<p><span style="font-weight: 400;">These settings were chosen from experience. While they do not yield the highest possible compression performance, they correspond to a very high quality encode with a good trade-off between encoding time and quality. </span><br />
<span style="font-weight: 400;">The encodings were performed under two scenarios:</span></p>
<ol>
<li style="font-weight: 400;"><b>Fixed Resolution: </b><span style="font-weight: 400;">no scaling is applied. Encoding was performed at the resolution of the original sequence with several target bitrates. The bitrates for x265 and libvpx-vp9 were: 4.8 Mbit/s, 2.4 Mbit/s, 1.8 Mbit/s, 1.2 Mbit/s and 0.8 Mbit/s. For x264, these values were multiplied by a factor of two. Based on the pixel count, another factor of four was applied to the bitrates for the 4K encodes.</span></li>
<li style="font-weight: 400;"><b>Bitrate Ladder: </b><span style="font-weight: 400;">Encoding was performed at a range of different resolutions and bitrates also referred to as a </span><i><span style="font-weight: 400;">bitrate ladder</span></i><span style="font-weight: 400;">. These were: 1920&#215;1080 at 4.8Mbit/s, 1920&#215;1080 at 2.4Mbit/s, 1280&#215;720 at 1.8Mbit/s, 1280&#215;720 at 1.2Mbit/s, 854&#215;480 at 0.8 Mbit/s, 640&#215;360 at 0.4 Mbit/s and 426&#215;240 at 0.2 Mbit/s. For the 4K encodes, two additional points with 3840&#215;2160 at 19.2 Mbit/s and 9.6 Mbit/s were added. As in the first scenario, the bitrates were multiplied by a factor of two for x264. The final measurement step was performed after upsampling back to the resolution of the original source. The default scaling algorithm is </span><i><span style="font-weight: 400;">bicubic</span></i><span style="font-weight: 400;">.</span></li>
</ol>
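<p>The bitrate rules of the fixed resolution scenario can be written down in a few lines. This is a small sketch of the arithmetic described above, with bitrates in kbit/s; the function name and the encoder labels are made up for illustration.</p>

```python
# Target bitrates for x265 and libvpx-vp9 at Full HD, in kbit/s.
BASE_KBIT = [4800, 2400, 1800, 1200, 800]

def fixed_resolution_bitrates(encoder, is_4k):
    """Target bitrates in kbit/s for one encoder in the fixed resolution scenario."""
    factor = 1
    if encoder == "x264":
        factor *= 2   # x264 gets twice the bitrate of the newer codecs
    if is_4k:
        factor *= 4   # 4K has four times the pixel count of Full HD
    return [b * factor for b in BASE_KBIT]

print(fixed_resolution_bitrates("x265", is_4k=False))
print(fixed_resolution_bitrates("x264", is_4k=True))
```

<p>For a 4K sequence encoded with x264, both factors apply, so the highest target bitrate becomes 8 x 4.8 Mbit/s = 38.4 Mbit/s.</p>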
<h2><span style="font-weight: 400;">Results</span></h2>
<p><span style="font-weight: 400;">For each encoding, multiple different measurements were performed. </span><br />
<span style="font-weight: 400;">In the cases where encoding was performed at a lower spatial resolution, the measurement was performed after upscaling the reconstruction back to the resolution of the source. </span><br />
<span style="font-weight: 400;">PSNR and SSIM measurements were performed for the three components (Y/U/V) as well as an averaged value. VMAF was calculated as well. For the 4k source files, the 4k VMAF model was applied. </span><br />
<span style="font-weight: 400;">For the encoding time, I measured the absolute elapsed time as well as the CPU time per thread. This is a sample plot for the sequence “MarketPlace” in the fixed resolution scenario (<strong>hover over image to zoom</strong>):</span><br />
<figure id="attachment_122245" aria-describedby="caption-attachment-122245" style="width: 720px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="zoooom wp-image-122245" src="https://bitmovin.com/wp-content/uploads/2020/08/VP9-vs-HEVC-visual-quality-measurements-sample-plot-grapgs-1-1024x526.png" alt="VP9 vs HEVC visual quality measurements sample plot graphs" width="720" height="370" /><figcaption id="caption-attachment-122245" class="wp-caption-text">Fixed resolution results for the Sequence MarketPlace.</figcaption></figure></p>
<h3><span style="font-weight: 400;">Coding performance</span></h3>
<p><span style="font-weight: 400;">For both scenarios I calculated BD-rate results for the average PSNR, average SSIM, and VMAF values relative to x264 [2]: </span><br />
<figure id="attachment_122246" aria-describedby="caption-attachment-122246" style="width: 720px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-122246" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-05-at-14.58.23-1-1024x288.png" alt="BD-Rate Encoding results for PSNR, SSIM, and VMAF table" width="720" height="203" /><figcaption id="caption-attachment-122246" class="wp-caption-text">Averaged BD-Rate results for PSNR, SSIM, and VMAF compared to x264 for both scenarios.</figcaption></figure><br />
<span style="font-weight: 400;"><br />
As one can see the libvpx-vp9 encoder is able to compete with x265 very well when it comes to coding performance. </span><br />
<span style="font-weight: 400;">However, the PSNR and SSIM based BD values are consistently higher for libvpx-vp9 and the VMAF-based BD-rate values are higher for x265 in the fixed resolution scenario. </span><br />
<span style="font-weight: 400;">In the bitrate ladder scenario, both encoders show very similar results.</span><br />
<span style="font-weight: 400;">What is surprising is that the default x265 configuration seems to use a much lower QP for the color components (U/V) compared to the other two encoders. However, because of the way the average values are calculated, this does not have a huge impact on the BD results.</span></p>
<h3><span style="font-weight: 400;">VP9 vs HEVC Complexity Levels</span></h3>
<p><span style="font-weight: 400;">For all encodings, I also measured the overall runtime of the encoding as well as the CPU-time per thread. Both of these values can give us an indication of how well the encoders can utilize multiple cores. </span><br />
<span style="font-weight: 400;">All tests were performed on an Intel 6 core (12 thread) processor. The results for x265 and libvpx-vp9 were taken relative to the values of x264 and then averaged. </span><br />
<em><strong>The following table displays the relative factors compared to x264:</strong></em><br />
<figure id="attachment_122247" aria-describedby="caption-attachment-122247" style="width: 720px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" class="wp-image-122247" src="https://bitmovin.com/wp-content/uploads/2020/08/Screenshot-2020-08-05-at-15.00.09-1-1024x351.png" alt="VP9 vs HEVC runtime and CPU time comparison table" width="720" height="247" /><figcaption id="caption-attachment-122247" class="wp-caption-text">Runtime factors relative to x264 for absolute runtime and CPU time.</figcaption></figure><br />
<span style="font-weight: 400;">While both x265 and libvpx-vp9 have higher runtimes compared to x264, we can see that x265 is much better at utilizing available threads efficiently, which results in much lower values for the overall runtime factors. </span><br />
<span style="font-weight: 400;">When it comes to the CPU time, libvpx-vp9 has an advantage over x265 in the tested configuration. Similar observations can be made of the table above. </span><br />
<span style="font-weight: 400;">So depending on your application this may be a disadvantage or not. For example, since our encoder uses multiple vectors to utilize the available threads efficiently this behavior is not a big disadvantage for us. </span></p>
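<p>As a rough sketch of how such relative factors can be derived: each measurement is divided by the x264 measurement of the same encode, and the ratios are then averaged. The use of the arithmetic mean and all timing values below are assumptions for illustration; they are not the measured results.</p>

```python
from statistics import mean

def relative_factors(seconds, reference="x264"):
    """Average per-encode time ratio of every encoder relative to the reference.

    seconds maps an encoder name to a list of times, one entry per encode,
    in the same order for all encoders.
    """
    ref = seconds[reference]
    return {
        enc: mean(t / r for t, r in zip(times, ref))
        for enc, times in seconds.items()
        if enc != reference
    }

# Placeholder wall-clock times in seconds for two encodes per encoder.
wall = {
    "x264": [10.0, 20.0],
    "x265": [30.0, 40.0],
    "libvpx-vp9": [50.0, 120.0],
}
print(relative_factors(wall))
```

<p>The same function can be applied to the summed CPU time per encode. Comparing the two sets of factors shows how much of an encoder's extra work is hidden by running on several threads in parallel.</p>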
<h2><span style="font-weight: 400;">Files</span></h2>
<p><span style="font-weight: 400;">Finally, I would like to provide all the files needed in order to recreate the results. </span><br />
<span style="font-weight: 400;">Furthermore, the archive includes all the result files that were used to determine my findings. I encourage everybody to double-check them. </span><br />
<span style="font-weight: 400;">However, for legal reasons, I cannot provide the encoded video sequences or the original uncompressed YUV test sequences. </span><br />
<span style="font-weight: 400;">The archive includes the following scripts which should be helpful:</span></p>
<ul>
<li><b>Test shell scripts: <span style="font-weight: 400;">These scripts were used to perform the encoding and the measurements in the docker container. Please feel free to use these in your own tests.</span></b></li>
<li><b>Python scripts: <span style="font-weight: 400;">The python scripts were used to calculate the BD results (</span><i><span style="font-weight: 400;">calculateBDResults.py</span></i><span style="font-weight: 400;">), to plot the measured values per sequence (</span><i><span style="font-weight: 400;">plotResults.py</span></i><span style="font-weight: 400;">) and to plot the results per frame (</span><i><span style="font-weight: 400;">plotPerFrameResults.py</span></i><span style="font-weight: 400;">). Please use these scripts to take a detailed look at the results. You require python 3 and matplotlib installed. Each script must be called with the name of a sub-folder that should be plotted.</span></b></li>
</ul>
<p><span style="font-weight: 400;">File:</span><br />
<a href="https://drive.google.com/file/d/1wbUA56vB-LeH2H8nV-EGzhJPWW7ikkwx/view?usp=sharing" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">https://drive.google.com/file/d/1wbUA56vB-LeH2H8nV-EGzhJPWW7ikkwx/view?usp=sharing</span></a></p>
<h2><span style="font-weight: 400;">Summary</span></h2>
<p><span style="font-weight: 400;">While this is just a quick and superficial encoder comparison, I tried to keep it close to practical applications. From the VP9 vs HEVC test here, libvpx-vp9 is able to take on x265 when it comes to coding performance. </span><br />
<span style="font-weight: 400;">Please note that only these encoders were tested and there are other AVC, HEVC, and VP9 encoders out there which may perform better.</span><br />
<span style="font-weight: 400;">If you have additional inputs to the test please reach out to me! I am very willing to run this again using a different set of settings. </span></p>
<h2><span style="font-weight: 400;">References</span></h2>
<p><span style="font-weight: 400;">[1] &#8211; A. Segall, E. François, W. Husak, S. Iwamura, D. Rusanovskyy &#8211; JVET common test conditions and evaluation procedures for HDR/WCG video &#8211; </span><a href="http://phenix.it-sudparis.eu/jvet/doc_end_user/current_document.php?id=8862" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-P2011</span></a><span style="font-weight: 400;"> </span><br />
<span style="font-weight: 400;">[2] &#8211; Gisle Bjontegaard &#8211; Calculation of average PSNR differences between RD-curves &#8211; VCEG-M33 Austin, Texas, USA, 2-4 April 2001</span></p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/vp9-vs-hevc-h265">HEVC vs VP9: The Battle of the Video Codecs</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Best Video Codec: An Evaluation of AV1, AVC, HEVC and VP9</title>
		<link>https://bitmovin.com/av1-multi-codec-dash-dataset</link>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Fri, 20 Mar 2020 19:59:40 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[av1]]></category>
		<category><![CDATA[hevc]]></category>
		<category><![CDATA[video encoding]]></category>
		<category><![CDATA[VP9]]></category>
		<guid isPermaLink="false">http://bitmovin.com/?p=22726</guid>

					<description><![CDATA[<p>  This scientific evaluation puts AV1 to the test against industry standard codecs and shows that AV1 is able to outperform VP9 and even HEVC by up to 40% Introduction For practical Over-the-top (OTT) streaming applications it is mostly necessary to supply streams using multiple different video codec standards in order to stream to a wide...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/av1-multi-codec-dash-dataset">Best Video Codec: An Evaluation of AV1, AVC, HEVC and VP9</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[


<h2> <img loading="lazy" decoding="async" class="size-full wp-image-22747 alignnone" src="https://bitmovin.com/wp-content/uploads/2018/03/Av1-Banner.jpg" alt="AV1 40% more efficient that HEVC" width="800" height="361" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/Av1-Banner-300x135.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/Av1-Banner.jpg?size=384x173&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/Av1-Banner-768x347.jpg?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/Av1-Banner.jpg?lossy=2&amp;strip=1&amp;webp=1 800w" sizes="(max-width: 800px) 100vw, 800px" /></h2>
<blockquote>
<p>This scientific evaluation puts AV1 to the test against industry standard codecs and shows that AV1 is able to outperform <a href="https://bitmovin.com/mpeg-dash-vp9-vod-live/">VP9</a> and even HEVC by up to 40%</p>
</blockquote>
<h2>Introduction</h2>
<p><span style="font-weight: 400;">For practical Over-the-top (OTT) streaming applications it is mostly necessary to supply streams using multiple different video codec standards in order to stream to a wide range of devices and platforms. </span></p>
<p><span style="font-weight: 400;">The most commonly used video codecs in this scenario are AVC, VP9 and HEVC. With the standardization of AV1, another modern video coding standard is joining in. </span></p>
<p><span style="font-weight: 400;">While AVC offers the best compatibility across devices and platforms, the newer standards such as HEVC and AV1 offer a much higher compression efficiency and thereby also a better user experience. </span></p>
<p><span style="font-weight: 400;">Another key difference between the codecs is that VP9 and AV1 were developed with the goal of being open source and freely available for anybody to implement and use without any royalties while AVC and HEVC require a royalty to be paid. </span></p>
<p><span style="font-weight: 400;">The multi-codec dataset presented here adopts the aforementioned standards in a practical OTT adaptive streaming scenario. The full dataset is freely available online (</span><a href="http://www.itec.aau.at/ftp/datasets/mmsys18/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">http://www.itec.aau.at/ftp/datasets/mmsys18/</span></a><span style="font-weight: 400;">). For an in-depth description of the dataset, please reference (<a href="https://arxiv.org/abs/1803.06874" rel="nofollow noopener" target="_blank">https://arxiv.org/abs/1803.06874</a>).</span></p>
<h2>The Dataset</h2>
<p><span style="font-weight: 400;">Since the main focus is on an HTTP Adaptive Streaming (HAS) dataset, we adopted a set of bitrate/resolution pairs &#8211; referred to as the </span><i><span style="font-weight: 400;">bitrate ladder</span></i><span style="font-weight: 400;"> &#8211; ranging from very low bitrates/resolutions of 100 kbit/s at 256&#215;144 pixels up to 4K resolution at 20 Mbit/s. </span></p>
<p><span style="font-weight: 400;">This is a well-established approach for OTT streaming applications. </span></p>
<p><span style="font-weight: 400;">For the video sequences, we tried to cover a range of video sequences with different properties. For this, we calculated the spatial and temporal information so that the sequences contain different amounts of motion and texture.</span></p>
<p><span style="font-weight: 400;">For the adaptive streaming encoding, segment lengths of 2 and 4 seconds were used. </span></p>
<p><span style="font-weight: 400;">For AV1 encoding a snapshot of the reference software was used (v0.1.0-7691-g84dc6e9). For the encoding, the cpu_used preset was set to 2. </span></p>
<p><span style="font-weight: 400;">The encoding for AVC, HEVC, and VP9 was performed utilizing ffmpeg and, thus, libx264, libx265, and libvpx-vp9 are used. For these codecs, encoding was performed with the </span><i><span style="font-weight: 400;">slow </span></i><span style="font-weight: 400;">preset. For all codecs, a two-pass scheme is employed. </span></p>
<p><span style="font-weight: 400;">Encoding of the AV1 bitstreams according to these specifications was performed by the Institute of Information Technology at the Alpen-Adria Universität Klagenfurt. Encodings using the other codecs AVC, HEVC, and VP9 were carried out by Bitmovin using the </span><a href="https://bitmovin.com/encoding-service/"><span style="font-weight: 400;">Bitmovin Video Encoding</span></a><span style="font-weight: 400;"> cloud infrastructure. </span></p>
<p><span style="font-weight: 400;">All bitstreams were then collected and jointly evaluated.</span></p>
<h2>The Evaluation</h2>
<p><span style="font-weight: 400;">For evaluation, the reconstruction at lower resolutions was upscaled to the original resolution and the weighted PSNR relative to the original source was calculated ((6*Y+U+V)/8). </span></p>
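<p>The weighting above is a one-liner in code. The sketch below assumes that the per-component PSNR values (in dB) of the upscaled reconstruction are already available; the function name and the sample values are made up for illustration.</p>

```python
def weighted_psnr(psnr_y, psnr_u, psnr_v):
    """Weighted PSNR in dB: luma counts six times as much as each chroma plane."""
    return (6 * psnr_y + psnr_u + psnr_v) / 8

# Illustrative per-component values in dB, not measured results.
print(weighted_psnr(40.0, 44.0, 44.0))
```

<p>Because luma carries six of the eight weights, even large changes in the chroma PSNR move the combined value only slightly.</p>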
<p><span style="font-weight: 400;">From these values we calculated the corresponding Bjøntegaard-Delta bit-rate (BD-rate) values. </span></p>
<p><span style="font-weight: 400;">When calculated over the entire bitrate ladder, we were able to observe an average bitrate reduction of AV1 compared to VP9 of 13% and compared to HEVC of 17%. </span></p>
<p><span style="font-weight: 400;">When we focus on the higher part of the bitrate ladder, the BD-rate reduction compared to VP9 increases to 22%-27% while compared to HEVC, the reduction increases to 30%-43%. </span></p>
<p><span style="font-weight: 400;">It should be noted that because of the fixed bitrate ladder, the overlap becomes rather small for the highest resolutions in some sequences and the results should therefore be interpreted with some caution. </span></p>
<p><span style="font-weight: 400;">This could definitely be improved by adapting the bitrate ladder to the properties of the different sequences.</span><br /><img loading="lazy" decoding="async" class="alignnone size-full wp-image-22797" src="https://bitmovin.com/wp-content/uploads/2018/03/chart-1.jpg" alt="vvc - Bitmovin" width="576" height="356" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-1-300x185.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-1.jpg?size=384x237&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-1.jpg?lossy=2&amp;strip=1&amp;webp=1 576w" sizes="(max-width: 576px) 100vw, 576px" /><br /><img loading="lazy" decoding="async" class="alignnone size-full wp-image-22798" src="https://bitmovin.com/wp-content/uploads/2018/03/chart-2.jpg" alt="vvc - Bitmovin" width="658" height="371" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-2-300x169.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-2.jpg?size=384x217&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/chart-2.jpg?lossy=2&amp;strip=1&amp;webp=1 658w" sizes="(max-width: 658px) 100vw, 658px" /></p>
<h2>Conclusion</h2>
<p><span style="font-weight: 400;">The dataset is meant to offer a first HLS set environment for the emerging video coding standard AV1 and for AVC, VP9 and HEVC, the codecs most frequently used in OTT applications. </span></p>
<p><span style="font-weight: 400;">The coding performance results for this test set indicate that AV1 is able to outperform VP9 and even HEVC by up to 40%. </span></p>
<p><span style="font-weight: 400;">Please note that this evaluation </span><span style="font-weight: 400;">primarily targets HAS services and has a very specific setup. </span></p>
<p><span style="font-weight: 400;">While it can give an indication on the coding performance of AV1, the results should be interpreted with caution.</span></p>
<h2><strong>Video technology guides and articles</strong></h2>
<ul>
<li>Back to Basics: Guide to the <a href="https://bitmovin.com/html5-video-tag-guide/">HTML5 Video Tag </a></li>
<li><a href="https://bitmovin.com/vod-platforms/">What is a VoD Platform?</a>: A comprehensive guide to Video on Demand (VOD)</li>
<li><a href="https://bitmovin.com/top-5-video-technology-trends/">Video Technology [2023]</a>: Top 5 video technology trends</li>
<li><a href="https://bitmovin.com/vp9-vs-hevc-h265/">HEVC vs VP9</a>: Modern codecs comparison</li>
<li>What is the <a href="https://bitmovin.com/av1/">AV1 Codec</a>?</li>
<li>Video Compression: <a href="https://bitmovin.com/encoding-definition-bitrates/">Encoding Definition and Adaptive Bitrate</a></li>
<li>What is <a href="https://bitmovin.com/adaptive-streaming/">adaptive bitrate streaming</a></li>
<li><a href="https://bitmovin.com/mkv-vs-mp4/">MP4 vs MKV</a>: Battle of the Video Formats</li>
<li><a href="https://bitmovin.com/video-streaming-models-svod-avod-tvod/">AVOD vs SVOD</a>; the “fall” of SVOD and Rise of AVOD &amp; TVOD (Video Tech Trends)</li>
<li><a href="https://bitmovin.com/dynamic-adaptive-streaming-http-mpeg-dash/">MPEG-DASH</a> (Dynamic Adaptive Streaming over HTTP)</li>
<li><a href="https://bitmovin.com/container-formats-fun-1/">Container Formats</a>: The 4 most common container formats and why they matter to you.</li>
<li><a href="https://bitmovin.com/qoe-why-quality-video-matters/">Quality of Experience</a> (QoE) in Video Technology [2023 Guide]</li>
</ul><p>The post <a rel="nofollow" href="https://bitmovin.com/av1-multi-codec-dash-dataset">Best Video Codec: An Evaluation of AV1, AVC, HEVC and VP9</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>State of Compression: What is VVC and how does it work?</title>
		<link>https://bitmovin.com/compression-standards-vvc-2020</link>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Fri, 14 Feb 2020 17:45:49 +0000</pubDate>
				<category><![CDATA[Developers]]></category>
		<category><![CDATA[video encoding]]></category>
		<category><![CDATA[vvc]]></category>
		<guid isPermaLink="false">https://bitmovin.com/?p=96897</guid>

					<description><![CDATA[<p>What is VVC? Versatile Video Coding (VVC) is the most recent international video coding standard which was finalized in July of 2020. It is the successor to High-Efficiency Video Coding (HEVC) as it was also developed jointly by the ITU-T and ISO/IEC. So what is really new in VVC? Is this a real revolution when...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/compression-standards-vvc-2020">State of Compression: What is VVC and how does it work?</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2><span style="font-weight: 400;">What is VVC?</span></h2>
<p><span style="font-weight: 400;">Versatile Video Coding (VVC) is the most recent international video coding standard, finalized in July 2020. It is the successor to High-Efficiency Video Coding (HEVC) and, like HEVC, was developed jointly by the ITU-T and ISO/IEC. </span><br />
<span style="font-weight: 400;">So what is really new in VVC? Is this a real revolution when it comes to video coding? In short: No. While it is technically highly advanced, it is only an evolutionary step forward from HEVC. It still uses the block-based hybrid video coding approach, an underlying concept of all major video coding standards since </span><span style="font-weight: 400;">H.261 (from 1988)</span><span style="font-weight: 400;">. In this concept, each frame of a video is split into blocks and all blocks are then processed in sequence. </span><br />
<span style="font-weight: 400;">The decoder processes every block in a loop, which starts with entropy decoding of the bitstream. The decoded transform coefficients are then put through an inverse quantization and an inverse transform operation. The output, which is an error signal in the pixel domain, then enters the coding loop and is added to a prediction signal. There are two prediction types. </span><i><span style="font-weight: 400;">Inter Prediction,</span></i><span style="font-weight: 400;"> which copies blocks from previously coded pictures (motion compensation), and </span><i><span style="font-weight: 400;">Intra Prediction, </span></i><span style="font-weight: 400;">which only uses decoded pixel information from the picture being decoded. The output of the addition is the reconstructed block that is put through some filters. This usually includes a filter to remove blocking artifacts that occur at the boundaries of blocks, but also more advanced filters can be used. Finally, the block is saved to a picture buffer so it can be output on a screen once decoding is done and the loop can continue with the next block.</span><br />
<span style="font-weight: 400;">At the encoder side, the situation is a little more complex as the encoder has to perform the corresponding forward operations, as well as the inverse operations from the decoder to obtain identical information for prediction.</span><br />
<figure id="attachment_96898" aria-describedby="caption-attachment-96898" style="width: 1172px" class="wp-caption alignnone center-text"><img loading="lazy" decoding="async" class="wp-image-96898 size-full" src="https://bitmovin.com/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1.png" alt="HybridVideoDecoder-VVC-Illustrated" width="1172" height="609" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1-300x156.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1.png?size=384x200&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1-768x399.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1-1024x532.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/HybridVideoDecoder_VVC_1.png?lossy=2&amp;strip=1&amp;webp=1 1172w" sizes="(max-width: 1172px) 100vw, 1172px" /><figcaption id="caption-attachment-96898" class="wp-caption-text"><em>The generalized block diagram of a hybrid video decoder.</em></figcaption></figure><br />
<span style="font-weight: 400;">Although VVC also uses these basic concepts, all components have been improved and/or modified with new ideas and techniques. In this blog post, I will show some of the improvements that VVC yields. However, this is only a small selection of new tools in VVC as a full list of all details and tools could easily fill a whole book (and someone else probably already started writing one).</span></p>
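<p>The decoding loop described above can be sketched for a single block as follows. This is a minimal illustration, assuming a one-dimensional block, a floating-point orthonormal DCT and a uniform quantization step; real codecs use two-dimensional integer transforms and in-loop filters, which are omitted here. The function names <code>inverse_dct</code> and <code>reconstruct_block</code> are hypothetical.</p>

```python
import math

def inverse_dct(coeffs):
    """Orthonormal inverse DCT-II of a 1-D coefficient list."""
    n = len(coeffs)
    samples = []
    for x in range(n):
        value = coeffs[0] / math.sqrt(n)  # DC term
        for k in range(1, n):
            angle = math.pi * (2 * x + 1) * k / (2 * n)
            value += math.sqrt(2.0 / n) * coeffs[k] * math.cos(angle)
        samples.append(value)
    return samples

def reconstruct_block(levels, step, prediction, bit_depth=8):
    """Inverse quantization, inverse transform, then add the prediction."""
    coeffs = [level * step for level in levels]   # inverse quantization
    residual = inverse_dct(coeffs)                # error signal in the pixel domain
    max_value = 2 ** bit_depth - 1
    # Add the prediction signal and clip to the valid sample range.
    return [min(max(round(p + r), 0), max_value)
            for p, r in zip(prediction, residual)]
```

<p>With only a DC level set, the residual is flat, so every predicted sample is shifted by the same amount before clipping.</p>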
<h2><span style="font-weight: 400;">VVC Coding structure</span></h2>
<h3><span style="font-weight: 400;">Slices, Tiles, and Subpictures</span></h3>
<p><span style="font-weight: 400;">As mentioned above, each frame in the video is split into a regular grid of blocks. In VVC the size of these so-called </span><i><span style="font-weight: 400;">Coding Tree Units</span></i><span style="font-weight: 400;"> (CTU) was increased from 64&#215;64 in HEVC to 128&#215;128 pixels. Multiple blocks can be arranged into logical areas. These are defined as </span><i><span style="font-weight: 400;">Tiles</span></i><span style="font-weight: 400;">, </span><i><span style="font-weight: 400;">Slices, </span></i><span style="font-weight: 400;">and </span><i><span style="font-weight: 400;">Subpictures</span></i><span style="font-weight: 400;">. Although these techniques are already known from earlier codecs, the way they are combined is new.</span><br />
<figure id="attachment_96899" aria-describedby="caption-attachment-96899" style="width: 600px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96899 aligncenter" style="text-align: center;" src="https://bitmovin.com/wp-content/uploads/2020/02/TilesAndSlices_VVC2-e1582119350563.png" alt="TilesAndSlices-VVC-illustrated" width="600" height="367" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/TilesAndSlices_VVC2-e1582119350563-300x183.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/TilesAndSlices_VVC2-e1582119350563.png?size=384x235&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/TilesAndSlices_VVC2-e1582119350563-768x469.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/TilesAndSlices_VVC2-e1582119350563.png?lossy=2&amp;strip=1&amp;webp=1 800w" sizes="(max-width: 600px) 100vw, 600px" /><figcaption id="caption-attachment-96899" class="wp-caption-text"><em>The picture is split into four tiles of equal size (blue). There are four slices (green). The one on the left contains two tiles. On the top right, the tile is split into two slices. CTUs are marked in grey.</em></figcaption></figure><br />
<span style="font-weight: 400;">The key feature of these regions is that they are also logically separated in the bitstream and enable various use-cases:</span></p>
<ul>
<li style="font-weight: 400;"><span style="font-weight: 400;">Since each region is independent, both the encoder and the decoder can implement parallel processing.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">A decoder could choose to only partially decode the regions of the video that it needs. One possible application is the transmission of 360 videos where a user is only able to see parts of a full video.</span></li>
<li style="font-weight: 400;"><span style="font-weight: 400;">A bitstream could be designed to allow the extraction of a cropped part of the video stream on the fly without re-encoding. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=9676" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-Q2002</span></a><span style="font-weight: 400;">]</span></li>
</ul>
<h3><span style="font-weight: 400;">Block Partitioning</span></h3>
<p><span style="font-weight: 400;">Let’s go back to 128&#215;128 CTU blocks. As I mentioned before, the coding loop is traversed for each block. However, processing only full 128&#215;128 pixel blocks would be very inefficient, so each CTU is flexibly split into smaller sub-blocks and the information on how to split it is encoded into the bitstream. The encoder can choose the best division of the CTU based on the content of the block. In a rather uniform area, bigger blocks are more efficient. Whereas in areas with edges or more detail, smaller blocks are typically chosen. The partitioning in VVC is performed using two subsequent hierarchical trees:</span></p>
<ul>
<li style="font-weight: 400;"><b>Quaternary tree:</b><span style="font-weight: 400;"> There are two options for each block. Do not split the block further or split it into four square sub-blocks of half the width and half the height. For each sub-block, the same decision is made again in a recursive manner. If a block is not split further, the second tree is applied.</span></li>
<li><b>Multi-type tree: </b>In the second tree, there are multiple options for each block. It can be split in half using a single vertical or horizontal split. Alternatively, it can be split vertically or horizontally into three parts (ternary split). As for the first tree, this one is also recursive and each subblock can be split using the same four options again. The leaf nodes of this tree that are not split any further are called <i>Coding Units</i> (CUs) and these are processed in the coding loop.</li>
</ul>
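<p>The split options of the two trees can be sketched as a small helper that returns the resulting sub-blocks. This is only an illustration: the function name and mode strings are hypothetical, and the 1:2:1 size ratio used for the ternary split is not stated in the text above; it is how the split is commonly described for VVC. A "vertical" split here means a vertical split line, so the parts lie side by side.</p>

```python
def split_block(width, height, mode):
    """Return the (x, y, w, h) sub-blocks produced by one split of a block."""
    if mode == "quad":  # quaternary tree: four blocks of half width and half height
        hw, hh = width // 2, height // 2
        return [(0, 0, hw, hh), (hw, 0, hw, hh), (0, hh, hw, hh), (hw, hh, hw, hh)]
    if mode == "binary_vertical":
        hw = width // 2
        return [(0, 0, hw, height), (hw, 0, hw, height)]
    if mode == "binary_horizontal":
        hh = height // 2
        return [(0, 0, width, hh), (0, hh, width, hh)]
    if mode == "ternary_vertical":  # assumed 1:2:1 ratio
        q = width // 4
        return [(0, 0, q, height), (q, 0, 2 * q, height), (3 * q, 0, q, height)]
    if mode == "ternary_horizontal":  # assumed 1:2:1 ratio
        q = height // 4
        return [(0, 0, width, q), (0, q, width, 2 * q), (0, 3 * q, width, q)]
    raise ValueError("unknown split mode: " + mode)
```

<p>An encoder would apply such splits recursively and keep the combination with the best rate-distortion cost; that search is what makes the added flexibility expensive.</p>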
<p><figure id="attachment_96900" aria-describedby="caption-attachment-96900" style="width: 512px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96900 size-full" src="https://bitmovin.com/wp-content/uploads/2020/02/BlockPartioning_VV3-1.png" alt="BlockPartioning-VVC-Illustrated" width="512" height="240" /><figcaption id="caption-attachment-96900" class="wp-caption-text"><em>Each block is split in two stages: first using a hierarchical quaternary tree (left), then using a hierarchical multi-type tree (right).</em></figcaption></figure><br />
<span style="font-weight: 400;">The factor that distinguishes VVC from other video codecs is the high flexibility of block sizes and shapes that a CTU can be split into. With this, an encoder can flexibly adapt to a wide range of video characteristics, which results in better coding performance. Of course, this high flexibility comes at a cost. The encoder must consider all possible splitting options, which requires more computation time. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=9676" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-Q2002</span></a><span style="font-weight: 400;">]</span></p>
<h2><span style="font-weight: 400;">Block Prediction</span></h2>
<h3><span style="font-weight: 400;">Intra Prediction</span></h3>
<p><span style="font-weight: 400;">In intra prediction, the current block is predicted from already decoded parts of the current picture. To be more precise, only a one-pixel wide strip from the neighborhood is used for normal intra prediction. There are multiple modes on how to predict a block from these reference pixels. Well-known modes that are also present in VVC are </span><i><span style="font-weight: 400;">Planar </span></i><span style="font-weight: 400;">and </span><i><span style="font-weight: 400;">DC </span></i><span style="font-weight: 400;">prediction as well as </span><i><span style="font-weight: 400;">Angular Prediction</span></i><span style="font-weight: 400;">. While the number of discrete directions for the angle was increased from 33 to 65 in VVC, not much else changed compared to HEVC. So, let&#8217;s concentrate on tools that are actually new:</span></p>
<ul>
<li><b>Wide Angle Intra Prediction: <span style="font-weight: 400;">Since prediction blocks in VVC can be non-square, the angles of certain directional predictions are shifted so that more reference pixels can be used for prediction. Effectively this extends the directional prediction angles to values beyond the normal 45° and below -135°. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=7900" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-P0111</span></a><span style="font-weight: 400;">]</span></b></li>
<li><b>Cross-component Prediction:<span style="font-weight: 400;"> In many cases (e.g. when there is an edge in the block) the luma and chroma components carry very similar information. In cross-component prediction, this is exploited by direct prediction of the chroma components from the reconstructed luma block using a linear combination of the reconstructed pixels with two parameters: a factor and an offset, both of which are calculated from the intra reference pixels. If necessary, scaling of the block is performed as well. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=9676" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-Q2002</span></a><span style="font-weight: 400;">]</span></b></li>
</ul>
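<p>The linear model behind cross-component prediction can be sketched as below. This is a simplified illustration, not the exact derivation in the standard: it assumes the factor and offset are fitted through the reference pixel pairs with the smallest and largest luma value, and that the reconstructed luma block has already been scaled to the chroma resolution. The function names are hypothetical.</p>

```python
def derive_linear_model(ref_luma, ref_chroma):
    """Fit chroma = factor * luma + offset through the min and max luma reference pixels."""
    i_min = min(range(len(ref_luma)), key=lambda i: ref_luma[i])
    i_max = max(range(len(ref_luma)), key=lambda i: ref_luma[i])
    luma_range = ref_luma[i_max] - ref_luma[i_min]
    if luma_range == 0:
        # Flat luma neighbourhood: fall back to a constant chroma prediction.
        return 0.0, float(ref_chroma[i_min])
    factor = (ref_chroma[i_max] - ref_chroma[i_min]) / luma_range
    offset = ref_chroma[i_min] - factor * ref_luma[i_min]
    return factor, offset

def predict_chroma(rec_luma_block, factor, offset):
    """Predict a chroma block from the reconstructed luma block."""
    return [[factor * luma + offset for luma in row] for row in rec_luma_block]
```

<p>Because the decoder can derive both parameters from pixels it has already reconstructed, neither of them needs to be transmitted.</p>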
<p><b>Multi Reference Line Prediction:</b><span style="font-weight: 400;"> As mentioned before, only one row of neighboring pixels is used for intra prediction. In VVC, this restriction is relaxed a bit so that prediction can be performed from two lines that are not directly next to the current block. However, there are several restrictions to this as only one line can be used at a time and no prediction across CTU boundaries is allowed. These limitations are necessary for efficient hardware implementations. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=4379" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-L0283</span></a><span style="font-weight: 400;">]</span><br />
<figure id="attachment_96901" aria-describedby="caption-attachment-96901" style="width: 400px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96901" src="https://bitmovin.com/wp-content/uploads/2020/02/MultiReferenceLineIntraPrediction_VVC4-1.png" alt="MultiReferenceLineIntraPrediction-VVC-illustrated" width="400" height="326" /><figcaption id="caption-attachment-96901" class="wp-caption-text"><em>In traditional intra prediction, only one line (line 0) is used for prediction of the current block. In Multi Reference Line Prediction this constraint is relaxed and lines 1 or 3 can be used for prediction as well.</em></figcaption></figure><br />
<span style="font-weight: 400;">Of course, this list is not complete and there are several more intra prediction schemes which further increase the coding efficiency. The method of intra mode prediction and coding of the mode was improved and refined as well.</span></p>
<h3><span style="font-weight: 400;">Inter prediction</span></h3>
<p><span style="font-weight: 400;">For inter prediction, the basic tools from HEVC were carried over and adapted. For example, the basic concepts of uni- and bi-directional motion compensation from one or two reference pictures are mostly unchanged. However, there are some new tools that haven’t been used like this in a video coding standard before:</span><br />
<b>Bi-directional optical flow (BDOF): <span style="font-weight: 400;">If a prediction block uses bi-prediction with one of the references in the temporal past and the second one in the temporal future, BDOF can be used to refine the motion field of the prediction block. For this, the prediction block is split into a grid of 4&#215;4 pixel sub-blocks. For each of these 4&#215;4 blocks, the motion vector is then refined by calculating the optical flow using the two references. While this adds some complexity to the decoder for the optical flow calculation, the refined motion vector field does not need to be transmitted and thus the bitrate is reduced. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=3425" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-J0024</span></a><span style="font-weight: 400;">]</span></b><br />
<b>Decoder side motion vector refinement: <span style="font-weight: 400;">Another method that allows for the motion vectors to automatically be refined at the decoder without the transmission of additional motion data is to perform an actual motion search at the decoder side. While this basic idea has been around for a while, the complexity of a search at the decoder side was always considered too high until now. The process works in three steps:</span></b></p>
<ul>
<li><span style="font-weight: 400;">First, a normal bi-prediction is performed, and the two prediction signals are weighted into a preliminary prediction block.</span></li>
<li><span style="font-weight: 400;">Using this preliminary block, a search around the position of the original block in each reference frame is performed. However, this is not a full search as an encoder would perform it, but a very limited search with a fixed number of positions.</span></li>
<li><span style="font-weight: 400;">If a better position is found, the original motion vector is updated accordingly. Lastly, bi-prediction with the updated motion vectors is performed again to obtain the final prediction. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=3523" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-J1029</span></a><span style="font-weight: 400;">]</span></li>
</ul>
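<p>The three steps above can be sketched as follows. This is a minimal illustration under several assumptions: blocks are addressed by their integer position in each reference picture rather than by motion vectors, the two predictions are weighted equally, the matching cost is the sum of absolute differences, and the search covers only the eight neighbouring positions. All function names are hypothetical.</p>

```python
def fetch(ref, pos, size):
    """Copy a size x size block whose top-left corner is at pos = (x, y)."""
    x, y = pos
    return [row[x:x + size] for row in ref[y:y + size]]

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def average(a, b):
    """Equal-weight bi-prediction with rounding."""
    return [[(p + q + 1) // 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def refine(ref, pos, template, size, search_range=1):
    """Limited search around pos for the block that best matches the template."""
    best_pos, best_cost = pos, sad(fetch(ref, pos, size), template)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = pos[0] + dx, pos[1] + dy
            inside = x >= 0 and y >= 0 and x + size <= len(ref[0]) and y + size <= len(ref)
            if not inside:
                continue
            cost = sad(fetch(ref, (x, y), size), template)
            if cost < best_cost:
                best_pos, best_cost = (x, y), cost
    return best_pos

def dmvr(ref0, pos0, ref1, pos1, size):
    # Step 1: preliminary bi-prediction.
    template = average(fetch(ref0, pos0, size), fetch(ref1, pos1, size))
    # Step 2: limited search in each reference picture.
    new0 = refine(ref0, pos0, template, size)
    new1 = refine(ref1, pos1, template, size)
    # Step 3: final bi-prediction with the updated positions.
    return new0, new1, average(fetch(ref0, new0, size), fetch(ref1, new1, size))
```

<p>Since the decoder runs the same search on the same reconstructed pictures as the encoder, both arrive at identical refined positions without any extra signalling.</p>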
<p><strong>Geometric Partitioning</strong>: <span style="font-weight: 400;">In the section about block partitioning it was shown how each CTU can be split into smaller blocks. All of these splitting operations only split rectangular blocks into smaller rectangular blocks. Unfortunately, natural video content typically contains more curved edges that can only poorly be approximated using rectangular blocks. In this case, </span><i><span style="font-weight: 400;">Geometric Partitioning</span></i><span style="font-weight: 400;"> allows the non-rectangular splitting of a block into two parts. For each of the two parts, motion compensation using independent motion vectors is performed and the two prediction signals are merged together using a blending at the edge. </span><br />
<figure id="attachment_96902" aria-describedby="caption-attachment-96902" style="width: 500px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96902" src="https://bitmovin.com/wp-content/uploads/2020/02/GeometricPartitioning_VVC5.png" alt="GeometricPartitioning-VVC-examples" width="500" height="109" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/GeometricPartitioning_VVC5-300x65.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/GeometricPartitioning_VVC5.png?size=384x84&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/GeometricPartitioning_VVC5-768x167.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/GeometricPartitioning_VVC5.png?lossy=2&amp;strip=1&amp;webp=1 896w" sizes="(max-width: 500px) 100vw, 500px" /><figcaption id="caption-attachment-96902" class="wp-caption-text"><em>Some example splits using geometric partitioning.</em></figcaption></figure><br />
<span style="font-weight: 400;">In the current implementation, there are 82 different geometric partition modes. They are made up of 24 slopes and 4 offset values for the partition line. However, the exact number of modes is still under discussion and may still change. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=8699" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-P0884</span></a><span style="font-weight: 400;">, </span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=8700" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-P0085</span></a><span style="font-weight: 400;">]</span><br />
<b><b>Affine motion: <span style="font-weight: 400;">Conventional motion compensation using one motion vector can only represent two-dimensional planar motion. This means that any block can be moved on the image plane in x and y directions only. However, in a natural video, strictly planar motion is quite rare and things tend to move more freely (e.g. rotate and scale). VVC implements an affine motion model that uses two or three motion vectors to enable motion with four or six degrees of freedom for a block. In order to keep the implementational complexity low, the reference block is not transformed on a pixel basis, but a trick is applied to reuse existing motion compensation and interpolation methods. The prediction block is split into a grid of 4&#215;4 pixel blocks. From the two (or three) control point motion vectors, one motion vector is calculated for each 4&#215;4 pixel block. Then, conventional two-dimensional planar motion compensation is performed for each of these 4&#215;4 blocks. While this implementation is not a truly affine motion compensation it is a good approximation and allows for very efficient implementation in hard- and software. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=6674" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-O0070</span></a><span style="font-weight: 400;">]</span></b></b><br />
<figure id="attachment_96903" aria-describedby="caption-attachment-96903" style="width: 398px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96903 " src="https://bitmovin.com/wp-content/uploads/2020/02/4x4Sub-blocks_VV6.jpg" alt="4x4Sub-blocks-VVC-Illustrated" width="398" height="442" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/4x4Sub-blocks_VV6-270x300.png?lossy=2&amp;strip=1&amp;webp=1 270w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/4x4Sub-blocks_VV6.jpg?size=384x426&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/4x4Sub-blocks_VV6.jpg?lossy=2&amp;strip=1&amp;webp=1 461w" sizes="(max-width: 398px) 100vw, 398px" /><figcaption id="caption-attachment-96903" class="wp-caption-text"><em>For every 4&#215;4 subblock, an individual motion vector (green) is calculated from the control point motion vectors (blue). Then, conventional motion compensation is performed per 4&#215;4 block.</em></figcaption></figure></p>
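<p>The derivation of the per-sub-block motion vectors can be sketched as below. This is an illustration in floating point, assuming the control points sit at the top-left (<code>v0</code>), top-right (<code>v1</code>) and, for the six-parameter model, bottom-left (<code>v2</code>) corner of the block, and that each vector is evaluated at the centre of its 4x4 sub-block. The function name is hypothetical, and the fixed-point arithmetic of a real codec is left out.</p>

```python
def affine_subblock_mvs(width, height, v0, v1, v2=None, sub=4):
    """Return one (mv_x, mv_y) per sub x sub sub-block, in raster order by rows."""
    a = (v1[0] - v0[0]) / width   # change of mv_x per pixel in x
    b = (v1[1] - v0[1]) / width   # change of mv_y per pixel in x
    if v2 is None:
        # Two control points (four degrees of freedom): rotation and zoom only.
        c, d = -b, a
    else:
        # Three control points (six degrees of freedom).
        c = (v2[0] - v0[0]) / height
        d = (v2[1] - v0[1]) / height
    mvs = []
    for y in range(0, height, sub):
        row = []
        for x in range(0, width, sub):
            cx, cy = x + sub / 2, y + sub / 2   # sub-block centre
            row.append((v0[0] + a * cx + c * cy, v0[1] + b * cx + d * cy))
        mvs.append(row)
    return mvs
```

<p>Each resulting vector is then used for ordinary translational motion compensation of its sub-block, which is why the existing interpolation filters can be reused unchanged.</p>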
<h3><span style="font-weight: 400;">Transformation and Quantization</span></h3>
<p><span style="font-weight: 400;">The transformation stage went through some major refactoring as well. Non-square blocks, which the binary and ternary splits introduce, are now supported by performing the transform for each direction separately. The maximum transform block size was also increased to 64&#215;64 pixels. These bigger transform sizes are particularly useful when it comes to HD and Ultra-HD content. Furthermore, two additional types of transform were added. While the Discrete Cosine Transform in variant 2 (DCT-II) is already well known from HEVC, one further variant of the DCT (the DCT-VIII) was added, as well as one Discrete Sine Transform (DST-VII). Depending on the prediction mode, an encoder can choose whichever of these transforms works best.</span><br />
<span style="font-weight: 400;">The biggest change to the Quantization stage is the increase in the maximum </span><i><span style="font-weight: 400;">Quantization Parameter</span></i><span style="font-weight: 400;"> (QP) from 51 to 63. This was necessary as it was discovered that even at the highest possible QP setting, the coding tools of VVC worked so efficiently that it was not possible to reduce the bitrate and quality of certain encodes to the needed levels.</span><br />
<span style="font-weight: 400;">One more really interesting new tool is called </span><i><span style="font-weight: 400;">Dependent Quantization</span></i><span style="font-weight: 400;">. The purpose of the quantization stage is to map the output values from the transformation, which are continuous, onto discrete values that can be coded into the bitstream. This operation inherently comes with a loss of information. The coarser the quantization is (the higher the QP value is), the more information is lost. In the figure below, a simple quantization scheme is shown where all values between each pair of lines are quantized to the value of the marked blue cross. Only the index of the blue cross is then encoded into the bitstream and the decoder can reconstruct the corresponding value.</span><br />
<figure id="attachment_96904" aria-describedby="caption-attachment-96904" style="width: 650px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class=" wp-image-96904" src="https://bitmovin.com/wp-content/uploads/2020/02/SimpleQuantizationScheme-VVC-7-1.jpg" alt="SimpleQuantizationScheme-VVC" width="650" height="132" /><figcaption id="caption-attachment-96904" class="wp-caption-text"><em>Basic quantization. Each vertical line marks a decision threshold. All values between the two thresholds are quantized to one reconstruction value. The reconstruction values are marked with blue crosses.</em></figcaption></figure><br />
<span style="font-weight: 400;">Typically, only one fixed quantization scheme is used in a video codec. In </span><i><span style="font-weight: 400;">Dependent Quantization,</span></i><span style="font-weight: 400;"> two of these quantization schemes are defined with slightly shifted reconstruction values.</span><br />
<figure id="attachment_96905" aria-describedby="caption-attachment-96905" style="width: 650px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class="wp-image-96905" src="https://bitmovin.com/wp-content/uploads/2020/02/EmbeddedQuantizationCut2-VVC-9-1.jpg" alt="EmbeddedQuantizationCut2-VVC" width="650" height="183" /><figcaption id="caption-attachment-96905" class="wp-caption-text"><em>In Dependent Quantization, two sets of reconstruction values are used. The decoder automatically switches between these based on the previously decoded values.</em></figcaption></figure><br />
<span style="font-weight: 400;">Switching between the two quantizers happens implicitly using a tiny state machine that uses the parity of the already coded coefficients. The encoder can then switch between the quantizers by deliberately changing some of the reconstruction values. Finding the optimal place for this switch, where the introduced error is lowest and the switch gives the most gain, can be performed using a rate-distortion trade-off. In some ways, this is related to </span><i><span style="font-weight: 400;">Sign Data Hiding</span></i><span style="font-weight: 400;"> (used in HEVC), where information is also “hidden” in other data. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=3571" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-K0070</span></a><span style="font-weight: 400;">]</span></p>
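<p>The decoder side of this scheme can be sketched as follows. The details are assumptions that go beyond the text above: a four-state machine in which states 0 and 1 select the first quantizer (even multiples of the step size) and states 2 and 3 select the second one (odd multiples of the step size, plus zero), with the transition table as it is commonly given for VVC. The names <code>STATE_TRANSITION</code> and <code>dequantize_dependent</code> are hypothetical.</p>

```python
# Next state, indexed by [current state][parity of the decoded level].
# Assumed to match the table commonly given for VVC.
STATE_TRANSITION = [(0, 2), (2, 0), (1, 3), (3, 1)]

def dequantize_dependent(levels, step):
    """Reconstruct coefficient values from decoded levels, in coding order."""
    state = 0
    reconstructed = []
    for level in levels:
        sign = (level > 0) - (level < 0)
        uses_second_quantizer = state >= 2
        # First quantizer: 2*|k|*step. Second quantizer: (2*|k| - 1)*step, or 0 for k = 0.
        magnitude = 2 * abs(level) - (1 if uses_second_quantizer else 0)
        reconstructed.append(sign * magnitude * step)
        # The parity of the level just decoded selects the next state.
        state = STATE_TRANSITION[state][abs(level) % 2]
    return reconstructed
```

<p>The same level can therefore map to different reconstruction values depending on what was decoded before it, which is what lets the encoder steer the quantizer choice through the parity of earlier levels.</p>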
<h3><span style="font-weight: 400;">Other</span></h3>
<p><span style="font-weight: 400;">All tools discussed so far were built and optimized for the coding of conventional natural two-dimensional video. However, the word “versatile” in its name indicates that VVC is meant for a wide variety of applications. And indeed VVC includes some features for more specific tasks which make it very versatile. Earlier codecs typically put these specialized tools into separate standards or separate extensions. One such tool is the </span><i><span style="font-weight: 400;">Horizontal Wrap Around Motion Compensation. </span></i><span style="font-weight: 400;">A widespread method for transmitting 360° content is to map the 360° video onto a 2D plane using equirectangular projection. The 2D video can then be encoded using conventional 2D video coding. However, the video has some special properties which can be used by the encoder. One property is that there is no left or right border in the video. Since the 360° view wraps around, this can be used for motion compensation. So when motion compensation from outside of the left boundary is performed, the prediction wraps around and uses pixel values from the right side of the picture.</span><br />
<figure id="attachment_96907" aria-describedby="caption-attachment-96907" style="width: 544px" class="wp-caption aligncenter center-text"><img loading="lazy" decoding="async" class=" wp-image-96907" src="https://bitmovin.com/wp-content/uploads/2020/02/360VideoMotionCompensation-VVC-10-1.jpg" alt="360VideoMotionCompensation-VVC" width="544" height="252" /><figcaption id="caption-attachment-96907" class="wp-caption-text"><em>Prediction from outside of the left side of the picture will wrap around and use pixels from the right side of the picture as well.</em></figcaption></figure><br />
<span style="font-weight: 400;">While this tool increases the compression performance it also helps to improve the visual quality since normal video codecs tend to produce a visible edge at the line where the left and right side of the 2D video are stitched back together. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=4322" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-L0231</span></a><span style="font-weight: 400;">]</span><br />
<span style="font-weight: 400;">Another application of video coding is the coding of computer-generated video content, also referred to as screen content. This type of content usually has some special characteristics like very sharp edges and very homogeneous areas which are atypical for natural video content. One very powerful tool in this situation is </span><i><span style="font-weight: 400;">Intra Block Copy</span></i><span style="font-weight: 400;"> which performs a copy operation from the already decoded area of the same frame. This is very similar to motion compensation with the key difference that the signalled vector does not refer to a temporal motion but just points to the source area in the current frame for the copy operation. [</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=3450" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">JVET-J0042</span></a><span style="font-weight: 400;">]</span></p>
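<p><span style="font-weight: 400;">Both tools in this section come down to how a reference position is computed, which the following Python sketch tries to illustrate. It is a simplification under stated assumptions: the wrap-around distance is taken to be the full picture width, vectors are integer-valued, and the helper names fetch_reference_row and intra_block_copy are made up for this example.</span></p>

```python
def fetch_reference_row(row, x_start, width, wrap_around):
    """Read `width` samples of one picture row starting at column x_start."""
    pic_width = len(row)
    out = []
    for x in range(x_start, x_start + width):
        if wrap_around:
            # 360-degree content: columns left of 0 continue at the right edge.
            out.append(row[x % pic_width])
        else:
            # Conventional padding: repeat the nearest border sample.
            out.append(row[min(max(x, 0), pic_width - 1)])
    return out


def intra_block_copy(frame, dst_x, dst_y, vec_x, vec_y, size):
    """Predict a size x size block from an already decoded area of `frame`.

    The vector points to the source area inside the same frame; it does
    not describe temporal motion.
    """
    src_x, src_y = dst_x + vec_x, dst_y + vec_y
    return [
        [frame[src_y + j][src_x + i] for i in range(size)]
        for j in range(size)
    ]


row = [10, 20, 30, 40, 50, 60]
print(fetch_reference_row(row, -2, 4, wrap_around=True))
print(fetch_reference_row(row, -2, 4, wrap_around=False))

frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(intra_block_copy(frame, 2, 2, -2, -2, 2))
```

<p><span style="font-weight: 400;">With conventional padding the border sample is simply repeated, which is what tends to create the visible seam in 360° video. With wrap-around the samples come from the opposite edge of the picture.</span></p>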
<h2><span style="font-weight: 400;">Coding performance</span></h2>
<p><span style="font-weight: 400;">With every standardization meeting, the VVC test model software (VTM) is updated and a test is run to compare the latest version of VTM to the HEVC reference software (HM). This test is purely objective, using PSNR values and the Bjøntegaard delta. While multiple different configurations are tested, we will focus on the so-called Random-Access configuration, which is the most relevant when it comes to video transmission and streaming. </span><br />
<img loading="lazy" decoding="async" class="aligncenter wp-image-96913" src="https://bitmovin.com/wp-content/uploads/2020/02/Screenshot-2020-02-14-at-18.08.06.png" alt="vvc - Bitmovin" width="800" height="79" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/Screenshot-2020-02-14-at-18.08.06-300x30.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/Screenshot-2020-02-14-at-18.08.06.png?size=384x38&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/Screenshot-2020-02-14-at-18.08.06-768x76.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2020/02/Screenshot-2020-02-14-at-18.08.06.png?lossy=2&amp;strip=1&amp;webp=1 1014w" sizes="(max-width: 800px) 100vw, 800px" /></p>
<p style="text-align: center;"><i><span style="font-weight: 400;">BD-rate comparison of VTM 7.0 compared to HM 16.20.  </span></i><span style="font-weight: 400;">[</span><a href="http://phenix.int-evry.fr/jvet/doc_end_user/current_document.php?id=9363" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">Q0003</span></a><span style="font-weight: 400;">]</span></p>
<p><span style="font-weight: 400;">In terms of BD-rate performance, VTM is able to achieve similar PSNR values while reducing the required bandwidth by roughly 35%. While the encoding time is not a perfect measure of complexity, it can give a good first indication. The complexity of VVC at the encoder side is roughly 10 times higher, while the decoder complexity only increases by a factor of 1.7. Please note that these results are all based on PSNR. It is well known that PSNR correlates only loosely with subjectively perceived quality, and some preliminary experiments suggest that the subjective gain is higher than the 35% bitrate reduction. A formal subjective test is planned for later this year.</span></p>
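<p><span style="font-weight: 400;">For readers who have not met the Bjøntegaard delta before, the sketch below shows the idea behind a BD-rate number in Python: interpolate the logarithm of the bitrate over PSNR for both codecs, average the difference over the PSNR range they share, and turn that back into a percentage. This is a simplified variant that assumes piecewise-linear interpolation rather than the polynomial fit normally used, and the rate points are invented for illustration. They are not taken from the JVET report.</span></p>

```python
import math


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of ys over strictly increasing xs."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x1 >= x >= x0:
            t = (x - x0) / (x1 - x0)
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the sampled range")


def bd_rate(anchor, test, steps=1000):
    """Average bitrate change of `test` versus `anchor` in percent.

    Both arguments are lists of (bitrate, psnr) pairs. A negative result
    means that `test` needs less bitrate for the same PSNR.
    """
    a = sorted(anchor, key=lambda point: point[1])
    t = sorted(test, key=lambda point: point[1])
    a_psnr, a_log = [p for _, p in a], [math.log(r) for r, _ in a]
    t_psnr, t_log = [p for _, p in t], [math.log(r) for r, _ in t]

    # Only the PSNR interval covered by both curves can be compared.
    lo = max(a_psnr[0], t_psnr[0])
    hi = min(a_psnr[-1], t_psnr[-1])
    if lo >= hi:
        raise ValueError("the two curves do not overlap in PSNR")

    # Midpoint rule for the average log-rate difference.
    total = 0.0
    for k in range(steps):
        q = lo + (hi - lo) * (k + 0.5) / steps
        total += interpolate(q, t_psnr, t_log) - interpolate(q, a_psnr, a_log)
    return (math.exp(total / steps) - 1.0) * 100.0


# Hypothetical rate points in kbit/s and dB, for illustration only.
anchor_points = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
test_points = [(650, 32.0), (1300, 35.0), (2600, 38.0), (5200, 41.0)]
print(round(bd_rate(anchor_points, test_points), 2))
```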
<h2><span style="font-weight: 400;">Conclusion</span></h2>
<p><span style="font-weight: 400;">So after all of this technical detail what is the future of VVC going to be? From a technical side, VVC is the most efficient and advanced coding standard that money can buy. However, it is unknown as of yet how much it will really cost. Once the standardization process is officially finished in October 2020, the process to establish licensing terms for the new standard can be started. From previous standards, we have learned that this is a complicated process that can take a while. At the same time, there are other highly efficient codecs out there for which applications and implementations are maturing and evolving. </span></p>
<h3><span style="font-weight: 400;">Links and more information</span></h3>
<p><span style="font-weight: 400;">The JVET standardization activity is a very open and transparent one. All input documents to the standardization are publicly available </span><a href="http://phenix.int-evry.fr/jvet/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">here</span></a><span style="font-weight: 400;">. Also, the reference encoder and decoder software are publicly available </span><a href="https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM.git" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">here</span></a><span style="font-weight: 400;">.</span></p>
<h3><span style="font-weight: 400;">Bitmovin &amp; Standardization</span></h3>
<p><span style="font-weight: 400;">Bitmovin is heavily involved in the standardization process around back-end vidtech; this includes our attendance and participation in the quarterly MPEG meetings, as well as our membership and involvement in </span><a href="https://aomedia.org/" rel="nofollow noopener" target="_blank"><span style="font-weight: 400;">AOMedia</span></a><span style="font-weight: 400;">. </span></p>
<h2><strong>Video technology guides and articles</strong></h2>
<ul>
<li>Back to Basics: Guide to the <a href="https://bitmovin.com/html5-video-tag-guide/">HTML5 Video Tag </a></li>
<li><a href="https://bitmovin.com/vod-platforms/">What is a VoD Platform?</a> A comprehensive guide to Video on Demand (VOD)</li>
<li><a href="https://bitmovin.com/top-5-video-technology-trends/">Video Technology [2022]</a>: Top 5 video technology trends</li>
<li><a href="https://bitmovin.com/vp9-vs-hevc-h265/">HEVC vs VP9</a>: Modern codecs comparison</li>
<li>What is the <a href="https://bitmovin.com/av1/">AV1 Codec</a>?</li>
<li>Video Compression: <a href="https://bitmovin.com/encoding-definition-bitrates/">Encoding Definition and Adaptive Bitrate</a></li>
<li>What is <a href="https://bitmovin.com/adaptive-streaming/">adaptive bitrate streaming</a></li>
<li><a href="https://bitmovin.com/mkv-vs-mp4/">MP4 vs MKV</a>: Battle of the Video Formats</li>
<li><a href="https://bitmovin.com/video-streaming-models-svod-avod-tvod/">AVOD vs SVOD</a>; the “fall” of SVOD and Rise of AVOD &amp; TVOD (Video Tech Trends)</li>
<li><a href="https://bitmovin.com/dynamic-adaptive-streaming-http-mpeg-dash/">MPEG-DASH</a> (Dynamic Adaptive Streaming over HTTP)</li>
<li><a href="https://bitmovin.com/container-formats-fun-1/">Container Formats</a>: The 4 most common container formats and why they matter to you.</li>
<li><a href="https://bitmovin.com/qoe-why-quality-video-matters/">Quality of Experience</a> (QoE) in Video Technology [2022 Guide]</li>
</ul>
<p>The post <a rel="nofollow" href="https://bitmovin.com/compression-standards-vvc-2020">State of Compression: What is VVC and how does it work?</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Cool New Video Tools: Five Encoding Advancements Coming in AV1</title>
		<link>https://bitmovin.com/cool-new-video-tools-five-encoding-advancements-coming-av1</link>
		
		<dc:creator><![CDATA[Christian Feldmann]]></dc:creator>
		<pubDate>Thu, 01 Mar 2018 20:20:25 +0000</pubDate>
				<category><![CDATA[Innovation]]></category>
		<category><![CDATA[av1]]></category>
		<category><![CDATA[video encoding]]></category>
		<guid isPermaLink="false">http://bitmovin.com/?p=22642</guid>

					<description><![CDATA[<p>Now that AV1 has entered its final stage of development and is getting close to finalizing its features, it’s a perfect time to take a closer look at what’s in store for the future of video streaming. With Apple announcing their decision to join the Alliance for Open Media in January, practically all major tech...</p>
<p>The post <a rel="nofollow" href="https://bitmovin.com/cool-new-video-tools-five-encoding-advancements-coming-av1">Cool New Video Tools: Five Encoding Advancements Coming in AV1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="alignnone wp-image-22654" src="https://bitmovin.com/wp-content/uploads/2018/03/cool-video-tools-300x150.png" alt="red wooden toolbox with av1 logo and title of blog post" width="908" height="454" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/cool-video-tools-300x150.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/cool-video-tools.png?size=384x192&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/cool-video-tools-768x384.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/cool-video-tools-1024x512.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/cool-video-tools.png?lossy=2&amp;strip=1&amp;webp=1 1200w" sizes="(max-width: 908px) 100vw, 908px" /><br />
Now that <a href="https://bitmovin.com/av1-datasheet/">AV1</a> has entered its final stage of development and is getting close to finalizing its features, it’s a perfect time to take a closer look at what’s in store for the future of video streaming. With Apple announcing their decision to join the <a href="http://aomedia.org/" rel="nofollow noopener" target="_blank">Alliance for Open Media</a> in January, practically all major tech leaders are on board and AV1 looks to be in good shape for becoming a widespread standard in the near future. <a href="https://attendee.gotowebinar.com/register/5859292356098244609?source=Blog" rel="nofollow noopener" target="_blank">Learn what video encoding advancements are coming in this new open codec in the upcoming webinar on Thursday March 22</a>.<br />
But what makes AV1 stand out technologically? In this post, we will cover five key tools in AV1 that were adopted to help reduce bandwidth demands by up to 30% while still retaining or improving picture quality.</p>
<h2>A royalty free solution to match increasing demands in streaming quality and speed</h2>
<p>Perhaps AV1’s most important feature is not a technological one: It was designed from the very start to be completely royalty-free, in an effort to provide a truly open video codec capable of providing high quality video streaming at lower bitrates. With the availability of high resolution content constantly increasing and technologies like VR and 360° video on the rise, the need for a suitable, technologically advanced and open codec has become apparent among large-scale content providers. This desire is probably best documented by the fact that virtually all leading industry players and tech companies are <a href="http://aomedia.org/about-us/" rel="nofollow noopener" target="_blank">contributing members</a> of the Alliance for Open Media, the development foundation behind AV1.<br />
The alliance has set out to finally provide an open standard for internet video streaming, following the path of other open standards like CSS or PNG, which are already shaping our daily digital reality. Bitmovin has been a trailblazer pushing AV1 to become the standard for years to come. Learn more about the <a href="https://bitmovin.com/av1/">development timeline that led to the formation of AV1</a>.<br />
To name an example, Netflix, a major provider and driver of innovation in the industry, has already stated that they expect to be an <a href="https://youtu.be/thvSyJN1vsA?t=5452" rel="nofollow noopener" target="_blank">early adopter of AV1</a>, in addition to their efforts of contributing to the royalty-free development community. Mozilla is another key supporter, providing <a href="https://hacks.mozilla.org/2017/11/dash-playback-of-av1-video/" rel="nofollow noopener" target="_blank">a successful browser implementation of AV1</a> for Firefox Nightly (<a href="https://demo.bitmovin.com/public/firefox/av1/">powered by Bitmovin</a>). With practically all big names on board, AV1 seems poised to become the standard for a world of content, which relies on large resolution video, VR and AR applications.<br />
For now, let’s take a closer look at the five key encoding and decoding techniques which make AV1 an interesting choice to use in video streaming.</p>
<h2>Film grain synthesis</h2>
<p>Film grain occurs commonly in photographic film, most noticeably in over-enlarged pictures, but can also be applied digitally for artistic effect. During digital video compression, film grain creates massive problems as it is hard to recognize as such for machines and the constant “noise” creates a lot of traffic in the bitstream. This leads to high bitrate requirements for transmitting very little information. Since the information is of little actual value for the perceived quality – after all the human brain tends to filter visual noise out to some extent – finding a way to not actually transfer the information with the bitstream, but rather re-apply it later, poses a desirable solution.<br />
This idea forms the basis of AV1’s film grain synthesis. The goal is to de-noise the content before encoding it and then re-add the noise or grain effect during decoding, just before output. This way, the unnecessary information does not have to be transmitted at all and the overall amount of data can be reduced substantially.</p>
<p><figure id="attachment_22646" aria-describedby="caption-attachment-22646" style="width: 897px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class=" wp-image-22646" src="https://bitmovin.com/wp-content/uploads/2018/03/figure1-300x118.png" alt="Figure 1: Film grain synthesis process (simplified)" width="897" height="353" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure1-300x118.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure1.png?size=384x151&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure1-768x302.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure1-1024x403.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure1.png?lossy=2&amp;strip=1&amp;webp=1 1286w" sizes="(max-width: 897px) 100vw, 897px" /><figcaption id="caption-attachment-22646" class="wp-caption-text"><strong>Figure 1:</strong> Film grain synthesis process (simplified)</figcaption></figure></p>
<p>The potential bandwidth savings for content providers are enormous, especially for very “noisy” content such as digitized old footage or videos that use film grain for artistic reasons. Either way, this tool can be used to great effect and forms a key benefit in AV1’s list of features.</p>
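<p>The de-noise, measure and re-apply pipeline can be sketched in a few lines of Python. This is only an illustration of the principle: it works on a single row of samples, uses a crude mean filter as the de-noiser and plain Gaussian noise as the grain, and skips the actual encode and decode step. AV1’s real grain model is more elaborate, and all function names here are hypothetical.</p>

```python
import random


def box_denoise(samples):
    """Very rough 3-tap mean filter standing in for a real de-noiser."""
    n = len(samples)
    out = []
    for i in range(n):
        window = samples[max(i - 1, 0):min(i + 2, n)]
        out.append(sum(window) / len(window))
    return out


def estimate_grain_strength(noisy, clean):
    """Standard deviation of the removed noise, sent instead of the noise."""
    diffs = [a - b for a, b in zip(noisy, clean)]
    mean = sum(diffs) / len(diffs)
    return (sum((d - mean) ** 2 for d in diffs) / len(diffs)) ** 0.5


def synthesize_grain(decoded, strength, seed):
    """Decoder side: add pseudo-random grain of the signalled strength."""
    rng = random.Random(seed)
    return [
        min(255, max(0, round(s + rng.gauss(0, strength))))
        for s in decoded
    ]


source = [100, 104, 97, 103, 99, 105, 96, 102]
clean = box_denoise(source)                        # this is what gets encoded
strength = estimate_grain_strength(source, clean)  # small side information
output = synthesize_grain(clean, strength, seed=1234)
print(strength, output)
```

<p>The output does not reproduce the original grain sample by sample. It only has similar statistics, which is the trade-off that makes the bitrate saving possible.</p>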
<h2>Constrained Directional Enhancement Filter</h2>
<p>Filtering is an essential process in every video codec, as it drastically increases the perceived quality of the encoded video. It mostly occurs along the outlines of each of the blocks, which are used to divide each picture into smaller sub-units during the compression process. AV1 contains various sets of filters, most of which are derived from existing codecs. <a href="https://tools.ietf.org/id/draft-midtskogen-netvc-cdef-00.html" rel="nofollow noopener" target="_blank">The Constrained Directional Enhancement Filter (CDEF)</a> is quite possibly the most impactful addition to the range of filters. This filter basically merges two existing filters: a directional de-ringing filter as used in the Daala video codec and the constrained low pass filter (CLPF) from the Thor video codec. CLPF is applied to filter out artifacts which stem from quantization errors and have not been corrected through the preceding application of a de-blocking filter. The directional de-ringing filter works by recognizing edges within each block and identifying their orientation. It then conditionally applies a directional low-pass filter along those edges, resulting in a smoother picture and an increase in perceived quality.</p>
<p><figure id="attachment_22645" aria-describedby="caption-attachment-22645" style="width: 880px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class=" wp-image-22645" src="https://bitmovin.com/wp-content/uploads/2018/03/figure2-300x137.png" alt="Figure 2: Direction search in CDEF as presented in: Steinar Midtskogen &amp; Jean-Marc Valin: THE AV1 CONSTRAINED DIRECTIONAL ENHANCEMENT FILTER (CDEF). See: https://arxiv.org/abs/1602.05975" width="880" height="402" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure2-300x137.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure2.png?size=384x175&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure2-768x350.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure2.png?lossy=2&amp;strip=1&amp;webp=1 1414w" sizes="(max-width: 880px) 100vw, 880px" /><figcaption id="caption-attachment-22645" class="wp-caption-text"><strong>Figure 2: </strong>Direction search in CDEF as presented in: Steinar Midtskogen &amp; Jean-Marc Valin: THE AV1 CONSTRAINED DIRECTIONAL ENHANCEMENT FILTER (CDEF). See: <a href="https://arxiv.org/abs/1602.05975" rel="nofollow noopener" target="_blank">https://arxiv.org/abs/1602.05975</a></figcaption></figure></p>
<p>CDEF merges the two filters and works by analyzing the contents of each block, smoothing out artifacts along edges and de-blocking the picture. The search for the filtering parameters (direction and variance) is applied on the decoder’s end, after the actual video has already been encoded. The filtering process is also performed by the encoder, in order to get the correct reference frames. Since the filtering operation can be run on the consumer’s hardware, required network bandwidth can be reduced and with it the traffic load.</p>
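<p>The “constrained” part of the name can be illustrated with a small Python sketch. The idea is that a neighbouring pixel only contributes to the low-pass filter if it is close in value to the pixel being filtered, so real edges are not blurred. The constraint function below is modelled on the one described for CDEF but is written from memory, and the tap weights and the division by 16 are placeholders, so treat every detail as an assumption rather than as the normative filter.</p>

```python
def constrain(diff, threshold, damping):
    """Limit the influence of a neighbour that differs by `diff`.

    Small differences pass through unchanged, larger ones are tapered
    and very large ones (likely a real edge) contribute nothing.
    """
    if threshold == 0:
        return 0
    shift = max(0, damping - (threshold.bit_length() - 1))
    magnitude = min(abs(diff), max(0, threshold - (abs(diff) >> shift)))
    return magnitude if diff >= 0 else -magnitude


def filter_sample(center, neighbours, weights, threshold, damping):
    """Filter one pixel using neighbours taken along the detected direction."""
    total = sum(
        w * constrain(n - center, threshold, damping)
        for n, w in zip(neighbours, weights)
    )
    # Placeholder normalisation; the real filter uses fixed-point rounding.
    return center + round(total / 16)


# The neighbour with value 180 lies across an edge and is ignored.
print(filter_sample(100, [104, 104, 180, 103], [4, 4, 2, 2],
                    threshold=4, damping=5))
```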
<h2>Warped motion and global motion compensation</h2>
<p>Predicting and compensating motions is an important principle in video compression, as it allows for the reduction of redundant information, which would otherwise be part of the bitstream and thus increase the amount of data being transmitted. As such, motion compensation works by recognizing and anticipating movement patterns within frames and blocks and in turn, reducing the relevant information for the coding process to the required minimum.<br />
Warped motion compensation is a particularly interesting technique, as it anticipates movement patterns in three dimensions, predicting spatial movement trajectories within videos. Based on the calculated predictions, redundant information is identified and omitted in the coding process, resulting in a significant reduction to the required load of data.<br />
Global motion compensation predicts motion for an entire frame (e.g. camera movement or zooming sequences) and uses this analysis to limit the amount of information transmitted in the bitstream. Basically, information is condensed to statements like “move all blocks right” or “pan this block”, thus saving data.<br />
Motion compensation algorithms have been used and theorized upon for a while, but only on a two-dimensional level. AV1 marks the first time that non-planar motion compensation has been implemented into a video codec. Due to the constant increase in processing power of consumer devices, this technique is now ready to see use in mass-market applications.<br />
These techniques work extremely well for predicting large area movements, like background motion or camera movements. Additionally, they can handle consistent backgrounds and color schemes very effectively, which is one of the reasons why animated videos tend to deliver great encoding results, even with very high levels of compression.</p>
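<p>A statement like “move all blocks right” can be expressed as a handful of parameters that apply to every sample of the frame. The Python sketch below assumes a six-parameter affine model, which is how global and warped motion are commonly described. The parameter layout and the names warp_point, PAN_RIGHT and ZOOM_IN are chosen for this example only.</p>

```python
def warp_point(x, y, params):
    """Apply an affine model (a, b, tx, c, d, ty) to a sample position."""
    a, b, tx, c, d, ty = params
    return (a * x + b * y + tx, c * x + d * y + ty)


def warp_block(x0, y0, size, params):
    """Reference positions for the four corners of a square block."""
    corners = [(x0, y0), (x0 + size, y0), (x0, y0 + size), (x0 + size, y0 + size)]
    return [warp_point(x, y, params) for x, y in corners]


# One parameter set describes the motion of the whole frame.
PAN_RIGHT = (1.0, 0.0, 8.0, 0.0, 1.0, 0.0)   # shift everything by 8 samples
ZOOM_IN = (1.25, 0.0, 0.0, 0.0, 1.25, 0.0)   # scale around the origin

print(warp_point(16, 32, PAN_RIGHT))
print(warp_block(16, 32, 8, ZOOM_IN))
```

<p>A purely translational motion vector can only express the PAN_RIGHT case. Zoom, rotation and shear need the additional parameters, which is where a warped model pays off.</p>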
<h2>Increased coding unit size (up to 128&#215;128)</h2>
<p>As video resolutions keep getting larger, an increase in block size is an effective way to scale the compression process along with high resolution contents. Each frame is partitioned up into individual coding units (or blocks), which are then processed individually during the coding procedure. Consequently, small resolutions like 1280&#215;720 (720p) can be divided into blocks with an individual size of 64&#215;64 quite easily, whereas the same block size yields less practicality for large resolutions like 7680&#215;4320 (8k UHD).</p>
<p><figure id="attachment_22644" aria-describedby="caption-attachment-22644" style="width: 887px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class=" wp-image-22644" src="https://bitmovin.com/wp-content/uploads/2018/03/figure3-300x158.jpg" alt="Figure 3: relative sizes of common video resolutions (current and historic)" width="887" height="467" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure3-300x158.jpg?size=177x93&amp;lossy=2&amp;strip=1&amp;webp=1 177w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure3-300x158.jpg?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure3.jpg?lossy=2&amp;strip=1&amp;webp=1 309w" sizes="(max-width: 887px) 100vw, 887px" /><figcaption id="caption-attachment-22644" class="wp-caption-text"><strong>Figure 3:</strong> relative sizes of common video resolutions (current and historic) [<a href="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/Digital_video_resolutions_%28VCD_to_4K%29.svg/1920px-Digital_video_resolutions_%28VCD_to_4K%29.svg.png" rel="nofollow noopener" target="_blank">Source</a>]</figcaption></figure>As 4K and 8K video content is about to become more widespread, the move towards larger coding units is a necessary step in achieving high quality compression. Bigger units mean fewer blocks per frame, which is beneficial for the encoding of large resolution video, as it allows for a higher level of compression while retaining great perceived quality. It does so by allowing for a reduction in coding delay for large resolutions, as well as by lowering signaling rates per block. An increased block size also enables the use of bigger prediction and transform units, which again benefits the handling of large resolution content.</p>
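<p>A quick back-of-the-envelope calculation in Python shows how the number of top-level blocks per frame scales with resolution and block size. The helper name blocks_per_frame is made up for this example, and partial blocks at the picture border are counted as full blocks.</p>

```python
import math


def blocks_per_frame(width, height, block_size):
    """Number of block_size x block_size units needed to cover a frame."""
    return math.ceil(width / block_size) * math.ceil(height / block_size)


for name, (w, h) in {"720p": (1280, 720), "8K UHD": (7680, 4320)}.items():
    for size in (64, 128):
        print(name, size, blocks_per_frame(w, h, size))
```

<p>Going from 64&#215;64 to 128&#215;128 cuts the number of top-level units to roughly a quarter, which matters far more at 8K, where thousands of units have to be signalled per frame, than at 720p.</p>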
<h2>Non-binary arithmetic coding</h2>
<p>This technique marks an interesting change from other current codecs like HEVC or AVC. For those, every symbol which is entered into the arithmetic coding engine has to be binary. With AV1, these symbols can also be non-binary, meaning that they can have up to eight possible values instead of just two. The symbols are then processed by the arithmetic coding engine, which produces a binary bitstream as output. Both ends, encoder and decoder, operate using probability calculations to estimate how many output bits will be created from a given symbol. Theoretically, any given input symbol could therefore produce multiple bits or even just a fraction of a bit.</p>
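<p>The remark about fractions of a bit follows from basic information theory: an ideal arithmetic coder spends about -log2(p) bits on a symbol that the model predicts with probability p. The short Python sketch below illustrates this for a hypothetical eight-valued symbol. The probabilities are invented for the example and do not come from AV1.</p>

```python
import math


def ideal_bits(probability):
    """Ideal code length in bits for a symbol with the given probability."""
    return -math.log2(probability)


def total_bits(symbols, model):
    """Ideal size of a symbol sequence under a fixed probability model."""
    return sum(ideal_bits(model[s]) for s in symbols)


# Hypothetical model for a symbol with eight possible values (sums to 1).
model = {0: 0.9, 1: 0.04, 2: 0.02, 3: 0.01, 4: 0.01, 5: 0.01, 6: 0.005, 7: 0.005}

print(ideal_bits(model[0]))   # very likely value: a fraction of a bit
print(ideal_bits(model[7]))   # very unlikely value: several bits
print(total_bits([0, 0, 0, 1, 0, 0, 7, 0], model))
```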
<p><figure id="attachment_22643" aria-describedby="caption-attachment-22643" style="width: 903px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class=" wp-image-22643" src="https://bitmovin.com/wp-content/uploads/2018/03/figure4-300x181.png" alt="Figure 4: Binary and non-binary coding schemes" width="903" height="545" srcset="https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure4-300x181.png?lossy=2&amp;strip=1&amp;webp=1 300w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure4.png?size=384x232&amp;lossy=2&amp;strip=1&amp;webp=1 384w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure4-768x463.png?lossy=2&amp;strip=1&amp;webp=1 768w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure4-1024x617.png?lossy=2&amp;strip=1&amp;webp=1 1024w, https://b3148424.smushcdn.com/3148424/wp-content/uploads/2018/03/figure4.png?lossy=2&amp;strip=1&amp;webp=1 1440w" sizes="(max-width: 903px) 100vw, 903px" /><figcaption id="caption-attachment-22643" class="wp-caption-text"><strong>Figure 4: </strong>Binary and non-binary coding schemes</figcaption></figure></p>
<p>Although handling a multi-valued symbol is more complex than handling a single binary one, it is still less work overall than coding the same information one binary symbol at a time. The major benefit is throughput: arithmetic coding is inherently serial, so each symbol has to be processed before the next one can start. Because a single non-binary symbol carries the information of several binary ones, more information can be processed in each serial step.</p>
<h2>Where is AV1 headed?</h2>
<p>As the final stages of development are winding down, it seems not too far-fetched to assume that AV1 is going to have a massive impact on the world of video streaming in the near future. User demand for high quality video streaming is already more than just tangible, and the coming generation of high resolution mobile devices and VR-enabled gadgets is about to push its way into mainstream availability. Seeing new technologies emerge and make their way into our everyday lives is a fascinating process, and AV1 will likely shape up to be a major factor in structuring our digital realities going forward.<br />
<strong>AV1 is the next generation video codec and is on track to deliver a 30% improvement over VP9 &amp; HEVC –</strong> <a href="https://bitmovin.com/av1/">Learn about Bitmovin and AV1</a></p>
<h2><strong>More AV1 Resources:</strong></h2>
<ul>
<li style="list-style-type: none;">
<ul>
<li><a href="https://bitmovin.com/av1-4k-video-sd-bitrates/">4K Video at SD Bitrates with AV1</a></li>
<li><a href="https://bitmovin.com/bitmovin-improves-av1-video-encoding/">Bitmovin Improves Support AV1 Video Encoding for VoD</a></li>
<li><a href="https://bitmovin.com/av1-encoding-gift-guide/">Bitmovin’s AV1 Encoding Gift Guide</a></li>
</ul>
</li>
</ul>
<p>The post <a rel="nofollow" href="https://bitmovin.com/cool-new-video-tools-five-encoding-advancements-coming-av1">Cool New Video Tools: Five Encoding Advancements Coming in AV1</a> appeared first on <a rel="nofollow" href="https://bitmovin.com">Bitmovin</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
