The 126th MPEG meeting concluded on March 29, 2019 in Geneva, Switzerland with the following topics:
- Three Degrees of Freedom Plus (3DoF+) – MPEG evaluates responses to the Call for Proposals and starts a new project on Metadata for Immersive Video
- Neural Network Compression for Multimedia Applications – MPEG evaluates responses to the Call for Proposals and kicks off its technical work
- Low Complexity Enhancement Video Coding – MPEG evaluates responses to the Call for Proposals and selects a Test Model for further development
- Point Cloud Compression – MPEG promotes its Geometry-based Point Cloud Compression (G-PCC) technology to the Committee Draft (CD) stage
- MPEG Media Transport (MMT) – MPEG approves 3rd Edition of Final Draft International Standard
- MPEG-G – MPEG-G standards reach Draft International Standard for Application Programming Interfaces (APIs) and Metadata technologies
The corresponding press release of the 126th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/126
Three Degrees of Freedom Plus (3DoF+)
MPEG evaluates responses to the Call for Proposals and starts a new project on Metadata for Immersive Video
MPEG’s support for 360-degree video (also referred to as omnidirectional video) is achieved using the Omnidirectional Media Format (OMAF) and Supplemental Enhancement Information (SEI) messages for High Efficiency Video Coding (HEVC). In particular, it enables the use of HEVC’s tiling feature to implement 3DoF applications and services, e.g., users consuming 360-degree content on a head-mounted display (HMD). However, rendering flat 360-degree video may cause visual discomfort when objects close to the viewer are rendered. The interactive parallax feature of Three Degrees of Freedom Plus (3DoF+) will provide viewers with visual content that more closely mimics natural vision, albeit within a limited range of viewer motion.
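To make the tiling idea concrete, here is a minimal Python sketch of viewport-dependent tile selection: given the viewer's yaw and pitch, it approximates which tiles of an equirectangular frame should be fetched in high quality. The tile grid, field-of-view values, and corner-sampling strategy are illustrative assumptions, not OMAF-defined behavior.

```python
# Hypothetical 8x4 tile grid over an equirectangular 360-degree frame:
# 8 columns span 360 degrees of yaw, 4 rows span 180 degrees of pitch.
TILE_COLS, TILE_ROWS = 8, 4

def visible_tiles(yaw_deg, pitch_deg, h_fov=100.0, v_fov=70.0):
    """Return (col, row) indices of tiles overlapping the viewport.

    yaw_deg in [0, 360), pitch_deg in [-90, 90]; a real OMAF player
    would project the viewport precisely -- this is an approximation
    that samples the viewport center and edges.
    """
    tiles = set()
    col_width = 360.0 / TILE_COLS
    row_height = 180.0 / TILE_ROWS
    for dy in (-h_fov / 2, 0, h_fov / 2):
        for dp in (-v_fov / 2, 0, v_fov / 2):
            yaw = (yaw_deg + dy) % 360.0          # wrap around the sphere
            pitch = max(-90.0, min(90.0, pitch_deg + dp))
            col = int(yaw // col_width) % TILE_COLS
            row = min(int((pitch + 90.0) // row_height), TILE_ROWS - 1)
            tiles.add((col, row))
    return tiles

# Fetch these tiles in high quality, the remaining tiles in low quality.
print(visible_tiles(yaw_deg=90.0, pitch_deg=0.0))
```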
At its 126th meeting, MPEG received five responses to the Call for Proposals (CfP) on 3DoF+ Visual. Subjective evaluations showed that adding interactive motion parallax to 360-degree video is feasible. Based on the subjective and objective evaluation, a new project, named Metadata for Immersive Video, was launched. A first version of a Working Draft (WD) and a corresponding Test Model (TM) were designed to combine technical aspects from multiple responses to the call. The current schedule anticipates the Final Draft International Standard (FDIS) in July 2020.
Neural Network Compression for Multimedia Applications
MPEG evaluates responses to the Call for Proposals and kicks off its technical work
Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, extraction of multimedia descriptors, and image and video coding. The trained neural networks for these applications contain a large number of parameters (i.e., weights), resulting in a considerable size. Thus, transferring them to the many clients that use them in applications (e.g., mobile phones, smart cameras) requires a compressed representation of neural networks.
At its 126th meeting, MPEG analyzed nine technologies submitted by industry leaders as responses to the Call for Proposals (CfP) for Neural Network Compression. These technologies address the compression of neural network parameters in order to reduce both their size for transmission and the complexity of using them, while not or only moderately reducing their performance in specific multimedia applications.
After a formal evaluation of the submissions, MPEG identified three main technology components in the compression pipeline, which will be studied further during the development of the standard. A key conclusion is that the proposed technologies can compress neural networks to 10% or less of their original size with no or negligible performance loss, where performance is measured as classification accuracy in image and audio classification, matching rate in visual descriptor matching, and PSNR in image coding. Some of these technologies also reduce the computational complexity of using the neural network or can benefit from specific capabilities of the target hardware (e.g., support for fixed-point operations).
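To illustrate just one building block of such a pipeline, the Python sketch below applies uniform 8-bit quantization to float32 weights, a 4:1 reduction even before any entropy coding. The layer size and the simple uniform quantizer are illustrative assumptions, not the specific technologies evaluated by MPEG.

```python
import numpy as np

def quantize_uniform(weights, num_bits=8):
    """Uniformly quantize float32 weights to signed num_bits integers.

    Returns the integer codes plus the scale needed to dequantize.
    A real codec would follow this with entropy coding for further gains.
    Assumes num_bits <= 8 so the codes fit into int8.
    """
    qmax = 2 ** (num_bits - 1) - 1             # e.g., 127 for 8 bits
    scale = np.max(np.abs(weights)) / qmax     # map max magnitude to qmax
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float32) * scale

# Illustrative "layer": 1M float32 weights (4 MB).
rng = np.random.default_rng(0)
w = rng.normal(0, 0.05, size=1_000_000).astype(np.float32)

codes, scale = quantize_uniform(w, num_bits=8)
w_hat = dequantize(codes, scale)

print(f"size: {w.nbytes} -> {codes.nbytes} bytes")        # 4:1 before entropy coding
print(f"max abs error: {np.max(np.abs(w - w_hat)):.6f}")  # bounded by scale / 2
```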
Bitmovin products and solutions utilize artificial intelligence (AI) techniques, such as AI-enabled encoding, for increased performance and quality of experience (QoE).
Low Complexity Enhancement Video Coding
MPEG evaluates responses to the Call for Proposals and selects a Test Model for further development
MPEG started a new work item referred to as Low Complexity Enhancement Video Coding (LCEVC), which will be added as part 2 of the MPEG-5 suite of codecs. The new standard is aimed at bridging the gap between two successive generations of codecs by providing a codec-agnostic extension to existing video codecs that improves coding efficiency and can be readily deployed via software upgrades and with sustainable power consumption.
The target is to achieve:
- coding efficiency close to High Efficiency Video Coding (HEVC) Main 10 by leveraging Advanced Video Coding (AVC) Main Profile and
- coding efficiency close to upcoming next generation video codecs by leveraging HEVC Main 10.
This coding efficiency should be achieved while keeping overall encoding and decoding complexity lower than that of the leveraged codecs (i.e., AVC and HEVC, respectively) when used in isolation at full resolution. This target has been met, and one of the responses to the CfP will serve as the starting point and Test Model for the standard, whose development is expected to be completed in 2020.
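Conceptually, this style of enhancement coding encodes a downscaled version of the source with an existing codec and transmits a lightweight residual layer that corrects the upscaled base. The NumPy sketch below illustrates only this general idea; LCEVC's actual tools (transforms, quantization, signaling) are defined by the standard, and the lossless base layer used here is a simplifying assumption.

```python
import numpy as np

def downscale2x(frame):
    """Average 2x2 blocks -- stands in for downscaling before base encoding."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2x(frame):
    """Nearest-neighbor upscaling of the decoded base layer."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def encode_enhancement(source, base_decoded, step=4.0):
    """Quantize the residual between the source and the upscaled base.

    A uniform quantizer stands in for the standard's actual coding tools;
    the point is that the residual layer is small and cheap to decode.
    """
    residual = source - upscale2x(base_decoded)
    return np.round(residual / step)

def decode(base_decoded, enhancement, step=4.0):
    return upscale2x(base_decoded) + enhancement * step

# Illustrative 8x8 "frame"; a real base layer would come from AVC or HEVC.
rng = np.random.default_rng(1)
src = rng.uniform(0, 255, size=(8, 8))
base = downscale2x(src)                 # pretend the base was coded losslessly
enh = encode_enhancement(src, base)
rec = decode(base, enh)
print(f"mean abs error: {np.mean(np.abs(src - rec)):.3f}")  # <= step / 2
```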
Bitmovin is known as an early adopter of new video codecs and was the first to demonstrate AV1 in a production environment. Currently, all modern video codecs (AVC, HEVC, VP9, AV1) are fully supported, and codecs under development, including VVC, EVC, and LCEVC, are being investigated by our encoding team.
Point Cloud Compression
MPEG promotes its Geometry-based Point Cloud Compression (G-PCC) technology to the Committee Draft (CD) stage
MPEG’s Geometry-based Point Cloud Compression (G-PCC) standard addresses lossless and lossy coding of time-varying 3D point clouds with associated attributes such as color and material properties. This technology is especially appropriate for sparse point clouds.
MPEG’s Video-based Point Cloud Compression (V-PCC) addresses the same problem but for dense point clouds, by projecting the (typically dense) 3D point clouds onto planes, and then processing the resulting sequences of 2D images with video compression techniques.
G-PCC provides a generalized approach, which directly codes the 3D geometry to exploit any redundancy found in the point cloud itself and is complementary to V-PCC and particularly useful for sparse point clouds representing large environments.
Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass-market applications. However, the relative ease of capturing and rendering spatial information compared to other volumetric video representations makes point clouds increasingly popular for presenting immersive volumetric data. The current implementation of a lossless, intra-frame G-PCC encoder provides a compression ratio of up to 10:1, and lossy coding with acceptable quality achieves ratios of up to 35:1.
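To see where such gains come from, the following Python sketch voxelizes a point cloud onto an octree-style grid, replacing raw float coordinates with bounded integer cell indices. The depth, the random data, and the bit-count estimate are illustrative assumptions rather than actual G-PCC coding tools.

```python
import numpy as np

def voxelize(points, depth=10):
    """Quantize 3D points to a grid of 2**depth cells per axis.

    Geometry coding in the spirit of G-PCC then encodes occupancy of the
    subdivided space instead of raw coordinates; here we just deduplicate
    voxel indices to show how quantization exposes redundancy.
    """
    pmin = points.min(axis=0)
    extent = (points.max(axis=0) - pmin).max() or 1.0
    grid = ((points - pmin) / extent * (2 ** depth - 1)).astype(np.uint32)
    return np.unique(grid, axis=0)

# Illustrative cloud: 100k points with float32 coordinates (1.2 MB raw).
rng = np.random.default_rng(2)
pts = rng.uniform(0, 1, size=(100_000, 3)).astype(np.float32)

voxels = voxelize(pts, depth=10)
raw_bytes = pts.nbytes                  # 3 x float32 per point
# Each occupied voxel needs at most 3 x depth bits before entropy coding.
voxel_bits = voxels.shape[0] * 3 * 10
print(f"raw: {raw_bytes} bytes, voxelized upper bound: {voxel_bits // 8} bytes")
```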
Bitmovin, in collaboration with Alpen-Adria-Universität Klagenfurt, is actively researching dynamic adaptive point cloud streaming. First results were published at ACM Packet Video 2018, and this effort will continue considering both V-PCC and G-PCC.
MPEG Media Transport (MMT)
MPEG approves 3rd Edition of Final Draft International Standard
The 3rd edition of MMT introduces two new aspects:
- enhancements for mobile environments and
- support of Content Delivery Networks (CDNs).
The support for multipath delivery will enable the delivery of services over more than one network connection concurrently, which is especially useful for mobile devices that can maintain more than one connection at a time.
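As a rough illustration of what a client-side multipath strategy could look like, the Python sketch below splits packets across connections in proportion to their measured throughput using a smooth weighted round-robin. The Path class, rate values, and scheduling policy are hypothetical and not part of MMT signaling.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    throughput_mbps: float  # measured, e.g., via receiver feedback

def schedule(packets, paths):
    """Assign packets to paths proportionally to their throughput.

    MMT's multipath support concerns signaling at the standard level;
    this smooth weighted round-robin is just one conceivable strategy.
    """
    total = sum(p.throughput_mbps for p in paths)
    assignment = {p.name: [] for p in paths}
    credit = {p.name: 0.0 for p in paths}
    for pkt in packets:
        # Give each path credit proportional to its bandwidth share,
        # then send on the path with the most accumulated credit.
        for p in paths:
            credit[p.name] += p.throughput_mbps / total
        best = max(paths, key=lambda p: credit[p.name])
        credit[best.name] -= 1.0
        assignment[best.name].append(pkt)
    return assignment

paths = [Path("wifi", 40.0), Path("lte", 10.0)]
print({k: len(v) for k, v in schedule(range(100), paths).items()})  # ~80/20 split
```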
Additionally, support for intelligent network entities involved in media services (i.e., Media-Aware Network Entities (MANEs)) will allow MMT-based services to adapt to changes in the mobile network faster and better. As load balancing is an important feature of CDN-based content delivery, messages for DNS management, media resource updates, and media requests are being added in this edition.
Ongoing developments within MMT will add support for the use of MMT over QUIC (Quick UDP Internet Connections) and support for FCAST in the context of MMT.
Bitmovin is fully committed to new transport mechanisms and protocols and, again in collaboration with Alpen-Adria-Universität Klagenfurt, has actively researched the use of HTTP/2, QUIC, and even Information-Centric Networking (ICN) in the context of HTTP Adaptive Streaming (HAS), resulting in a number of high-impact papers published at renowned scientific conferences and in journals.