Preface
Bitmovin is a proud member of and contributor to several organizations working to shape the future of video, including the Moving Picture Experts Group (MPEG), where I, along with a few senior developers at Bitmovin, am an active member. Personally, I have been a member and attendee of MPEG for 20+ years and have been documenting its progress since early 2010. Today, we’re working hard to further improve the capabilities and energy efficiency of the industry’s newest standards, such as VVC, while maintaining and modernizing older codecs like HEVC and AVC to take advantage of advancements in neural network post-processing.
The 143rd MPEG Meeting Highlights
The official press release of the 143rd MPEG meeting can be found here and comprises the following items:
- MPEG finalizes the Carriage of Uncompressed Video and Images in ISOBMFF
- MPEG reaches the First Milestone for two ISOBMFF Enhancements
- MPEG ratifies Third Editions of VVC and VSEI
- MPEG reaches the First Milestone of AVC (11th Edition) and HEVC Amendment
- MPEG Genomic Coding extended to support Joint Structured Storage and Transport of Sequencing Data, Annotation Data, and Metadata
- MPEG completes Reference Software and Conformance for Geometry-based Point Cloud Compression
In this report, I’d like to focus on ISOBMFF and video codecs and, as always, I will conclude with an update on MPEG-DASH.
ISOBMFF Enhancements
The ISO Base Media File Format (ISOBMFF) supports the carriage of a wide range of media data such as video, audio, point clouds, haptics, etc., which has now been further extended to uncompressed video and images.
ISO/IEC 23001-17 – Carriage of uncompressed video and images in ISOBMFF – specifies how uncompressed 2D image and video data is carried in files that comply with the ISOBMFF family of standards. This encompasses a range of data types, including monochromatic and colour data, transparency (alpha) information, and depth information. The standard enables the industry to effectively exchange uncompressed video and image data while utilizing all additional information provided by the ISOBMFF, such as timing, color space, and sample aspect ratio for interoperable interpretation and/or display of uncompressed video and image data.
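To make the container side more concrete, here is a minimal Python sketch of how the top-level boxes of an ISOBMFF file can be walked. It relies only on the generic box header layout (a 32-bit big-endian size followed by a four-character type, with size 1 signaling a 64-bit size and size 0 meaning "until the end of the container"). The helper names `iter_boxes` and `make_box` and the tiny synthetic file are assumptions for illustration; the sketch does not interpret any of the boxes defined in ISO/IEC 23001-17.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, offset: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, payload_start, box_end) for each box in data[offset:end]."""
    if end is None:
        end = len(data)
    while offset + 8 <= end:
        # Compact header: 32-bit big-endian size, then a 4-character type.
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated box; stop walking
        yield box_type.decode("latin-1"), offset + header, offset + size
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: build a box with a compact 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic two-box "file": a file type box and a 4-byte media data box.
sample = (make_box(b"ftyp", b"isom" + struct.pack(">I", 0) + b"isom")
          + make_box(b"mdat", b"\x00" * 4))
boxes = [(t, e - s) for t, s, e in iter_boxes(sample)]
print(boxes)  # [('ftyp', 12), ('mdat', 4)]
```

Container boxes can be walked recursively by calling `iter_boxes` again on the payload range of a box that is known to contain child boxes.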
ISO/IEC 14496-15, formerly known as MP4 file format (and based on ISOBMFF), provides the basis for “network abstraction layer (NAL) unit structured video coding formats” such as AVC, HEVC, and VVC. The current version is the 6th edition, which has been amended to support neural-network post-filter supplemental enhancement information (SEI) messages. This amendment defines the carriage of the neural-network post-filter characteristics (NNPFC) SEI messages and the neural-network post-filter activation (NNPFA) SEI messages to enable the delivery of (i) a base post-processing filter and (ii) a series of neural network updates synchronized with the input video pictures/frames.
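As a rough illustration of what "NAL unit structured" means inside a file, the Python sketch below splits one sample into its length-prefixed NAL units and flags the SEI ones. It assumes the length field size (1, 2, or 4 bytes) is already known from the track's decoder configuration record, and it uses the SEI NAL unit type values of AVC (6), HEVC (39 and 40), and VVC (23 and 24). The function names and the synthetic sample are hypothetical, and the sketch does not parse the SEI payloads themselves, so it cannot tell an NNPFC or NNPFA message from any other SEI message.

```python
from typing import Iterator


def split_nal_units(sample: bytes, length_size: int = 4) -> Iterator[bytes]:
    """Yield each NAL unit of a sample stored with big-endian length prefixes."""
    if length_size not in (1, 2, 4):
        raise ValueError("length_size must be 1, 2, or 4")
    pos = 0
    # Trailing bytes shorter than one length field are ignored.
    while pos + length_size <= len(sample):
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("truncated NAL unit")
        yield sample[pos:pos + nal_len]
        pos += nal_len


def is_sei(nal: bytes, codec: str) -> bool:
    """Return True if the NAL unit header marks an SEI NAL unit."""
    if codec == "avc":   # 1-byte header, type in the low 5 bits
        return len(nal) >= 1 and (nal[0] & 0x1F) == 6
    if codec == "hevc":  # 2-byte header, type in bits 1..6 of the first byte
        return len(nal) >= 2 and ((nal[0] >> 1) & 0x3F) in (39, 40)
    if codec == "vvc":   # 2-byte header, type in the top 5 bits of the second byte
        return len(nal) >= 2 and (nal[1] >> 3) in (23, 24)
    raise ValueError(f"unknown codec: {codec}")


# Synthetic HEVC sample: one prefix SEI NAL unit, then one slice NAL unit.
sei_nal = bytes([39 << 1, 0x01, 0xAA])
slice_nal = bytes([1 << 1, 0x01, 0xBB, 0xCC])
hevc_sample = (len(sei_nal).to_bytes(4, "big") + sei_nal
               + len(slice_nal).to_bytes(4, "big") + slice_nal)
flags = [is_sei(n, "hevc") for n in split_nal_units(hevc_sample)]
print(flags)  # [True, False]
```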
Bitmovin has supported ISOBMFF in our encoding pipeline and API from day one and will continue to do so. For more details and information about container file formats, check out this blog.
Video Codec Enhancements
MPEG finalized the specifications of the third editions of the Versatile Video Coding (VVC, ISO/IEC 23090-3) and the Versatile Supplemental Enhancement Information (VSEI, ISO/IEC 23002-7) standards. Additionally, MPEG issued the Committee Draft (CD) text of the eleventh edition of the Advanced Video Coding (AVC, ISO/IEC 14496-10) standard and the Committee Draft Amendment (CDAM) text on top of the High Efficiency Video Coding standard (HEVC, ISO/IEC 23008-2).
The new editions add several SEI messages, including two systems-related SEI messages: (a) one for signaling of green metadata as specified in ISO/IEC 23001-11 and (b) the other for signaling of an alternative video decoding interface for immersive media as specified in ISO/IEC 23090-13. Furthermore, the neural-network post-filter characteristics (NNPFC) SEI message and the neural-network post-filter activation (NNPFA) SEI message have been added to AVC, HEVC, and VVC.
The two SEI messages for describing and activating post-filters using neural network technology in video bitstreams could, for example, be used for reducing coding noise, spatial and temporal upsampling (i.e., super-resolution and frame interpolation), color improvement, or general denoising of the decoder output. The description of the neural network architecture itself is based on MPEG’s neural network representation standard (ISO/IEC 15938-17). As results from an exploration experiment have shown, neural network-based post-filters can deliver better results than conventional filtering methods. Processes for invoking these new post-filters have already been tested in a software framework and will be made available in an upcoming version of the VVC reference software (ISO/IEC 23090-16).
Bitmovin and our partner ATHENA research lab have been exploring several applications of neural networks to improve the quality of experience for video streaming services. You can read the summaries with links to full publications in this blog post.
The latest MPEG-DASH Update
The current status of MPEG-DASH is depicted in the figure below:
The latest edition of MPEG-DASH is the 5th edition (ISO/IEC 23009-1:2022) which is publicly/freely available here. There are currently three amendments under development:
- ISO/IEC 23009-1:2022 Amendment 1: Preroll, nonlinear playback, and other extensions. This amendment has been ratified already and is currently being integrated into the 5th edition of part 1 of the MPEG-DASH specification.
- ISO/IEC 23009-1:2022 Amendment 2: EDRAP streaming and other extensions. EDRAP stands for Extended Dependent Random Access Point. At this meeting, the Draft Amendment (DAM) was approved. EDRAP increases the coding efficiency for random access and has been adopted within VVC.
- ISO/IEC 23009-1:2022 Amendment 3: Segment sequences for random access and switching. This amendment is at Committee Draft Amendment (CDAM) stage, the first milestone of the formal standardization process. This amendment aims at improving tune-in time for low latency streaming.
Additionally, MPEG Technologies under Consideration (TuC) comprises a few new work items, such as content selection and adaptation logic based on device orientation and signaling of haptics data within DASH.
Finally, part 9 of MPEG-DASH — redundant encoding and packaging for segmented live media (REAP) — has been promoted to Draft International Standard (DIS). It is expected to be finalized in the upcoming MPEG meetings.
Bitmovin recently announced its new Player Web X which was reimagined and built from the ground up with structured concurrency. You can read more about it and why structured concurrency matters in this recent blog series.
The next meeting will be held in Hannover, Germany, from October 16-20, 2023. Further details can be found here.
Click here for more information about MPEG meetings and their developments.
Are you currently using the ISOBMFF or CMAF as a container format for fragmented MP4 files? Do you prefer hard-parted fMP4 or single-file MP4 with byte-range addressing? Vote in our poll and check out the Bitmovin Community to learn more.
Looking for more info on streaming formats and codecs? Here are some useful resources: