Industry Workshops – 2023 IEEE International Conference on Image Processing

Meta and Google Workshop

Title: Alliance for Open Media Workshop, sponsored by Meta & Google
Abstract: The Alliance for Open Media was formed in 2015 and produced its first specification, AV1, in 2018 with the goal of being a royalty-free video coding standard. We will present an overall status update on the adoption and performance of AV1 as well as progress towards a potential future video coding standard, through sets of new coding tools contributed by AOMedia members. Advances in video quality metrics will also be presented, as well as related topics.

Time: Monday, October 9, 14:30-18:00
Location: Plenary Hall

Organizers:

Dr. Ioannis Katsavounidis, Research Scientist, Video Infrastructure, Meta
Dr. Ryan Lei, Video Codec Specialist, Video Infrastructure, Meta
Dr. Debargha Mukherjee, Principal Engineer, Google
Dr. Balu Adsumilli, Head of Media Algorithms, YouTube

Speakers:

Dr. Ioannis Katsavounidis (Meta)
Mr. Hassene Tmar (Meta)
Dr. Ryan Lei (Meta)
Dr. Anush Moorthy (Netflix)
Dr. In Suk Chong (Google)
Dr. Yilin Wang (YouTube/Google)
Dr. Balu Adsumilli (YouTube/Google)
Dr. Debargha Mukherjee (Google)
Dr. Yeping Su (Google)
Dr. Xin Zhao (Tencent)
Dr. Onur Guleryuz (Google)
Mr. Joe Young (Google)

Dr. Ioannis Katsavounidis is part of the Video Infrastructure team, leading technical efforts in improving video quality and quality of experience across all video products at Meta. Before joining Meta, he spent 3.5 years at Netflix, contributing to the development and popularization of VMAF, Netflix’s open-source video quality metric, as well as inventing the Dynamic Optimizer, a shot-based perceptual video quality optimization framework that brought significant bitrate savings across the whole video streaming spectrum. He was a professor for 8 years at the University of Thessaly’s Electrical and Computer Engineering Department in Greece, teaching video compression, signal processing and information theory. He was one of the cofounders of Cidana, a mobile multimedia software company in Shanghai, China. He was the director of software for advanced video codecs at InterVideo, the makers of the popular SW DVD player, WinDVD, in the early 2000s, and he has also worked for 4 years in high-energy experimental physics in Italy. He is one of the co-chairs for the statistical analysis methods (SAM) and no-reference metrics (NORM) groups at the Video Quality Experts Group (VQEG). He is actively involved within the Alliance for Open Media (AOMedia) as co-chair of the software implementation working group (SIWG). He has over 150 publications, including 50 patents. His research interests lie in video coding, quality of experience, adaptive streaming, and energy-efficient HW/SW multimedia processing.

Dr. Ryan Lei is currently working as a video codec specialist and technical lead in the Video Infrastructure Media Algorithm team at Meta. His focus is on algorithms and architecture for cloud-based video processing, transcoding, and delivery at large scale for various Meta products. Ryan Lei is also the co-chair of the Alliance for Open Media (AOM) testing subgroup and is actively contributing to the standardization of AV1 and AV2. Before joining Meta, Ryan worked at Intel as a principal engineer and codec architect. He worked on algorithm implementation and architecture definition for multiple generations of hardware-based video codecs, such as AVC, VP9, HEVC and AV1. Before joining Intel, Ryan worked in ATI's handheld department, where he implemented embedded software for hardware encoders/decoders in mobile SoCs. Ryan received his Ph.D. in Computer Science from the University of Ottawa. His research interests include image/video processing, compression, adaptive streaming and parallel computing. He has (co-)authored over 50 publications, including 17 patents.

Dr. Debargha Mukherjee received his M.S./Ph.D. degrees in ECE from the University of California Santa Barbara in 1999. Since 2010 he has been with Google LLC, where he is currently a Principal Engineer/Director leading next-generation video codec research and development efforts. Prior to that he was with Hewlett Packard Laboratories, conducting research on video/image coding and processing. Debargha has made extensive research contributions in the area of image and video compression throughout his career, and was elected an IEEE Fellow for leadership in standard development for the video-streaming industry. He has (co-)authored more than 120 papers on various signal processing topics, and holds more than 200 US patents, with many more pending. He currently serves as a Senior Area Editor of the IEEE Trans. on Image Processing, and as a member of the IEEE Visual Signal Processing and Communications Technical Committee (VSPC-TC).

Dr. Balu Adsumilli is currently the Head of the Media Algorithms group at YouTube/Google, leading transcoding infrastructure, audio/video quality, and media innovation at YouTube. Prior to this, he led the Advanced Technology group and the Camera Architecture group at GoPro, and before that, he was Sr. Staff Research Scientist at Citrix Online. He received his master's degree at the University of Wisconsin Madison, and his PhD at the University of California Santa Barbara. He has co-authored more than 120 papers and 100 granted patents, with many more pending. He serves on the board of the Television Academy, on the board of the NATAS Technical Committee, on the board of the Visual Effects Society, on the IEEE MMSP Technical Committee, and on the ACM MHV Steering Committee. He is on TPCs and organizing committees for various conferences and workshops, and currently serves as Associate Editor for IEEE Transactions on Multimedia (T-MM). His fields of research include image/video processing, audio and video quality, video compression and transcoding, AR/VR, visual effects, video ML/AI models, and related areas.

Dr. Anush Moorthy currently leads the Video & Image Encoding team at Netflix, whose stellar engineers are responsible for the high-quality video content that one has come to expect of the service. He has been part of the Encoding Technologies team since 2016, where he contributes to video & image encoding and visual quality assessment at scale. He has previously worked at Qualcomm as a Senior Video Systems Engineer and at Texas Instruments Inc. as an Advanced Imaging Engineer. His interests include image and video quality assessment, image and video compression, and computational vision.

Dr. In Suk Chong holds a B.S. in Electrical Engineering from Seoul National University (1998) and earned his MS/Ph.D. in Electrical Engineering from the University of Southern California (USC) in 2004 and 2008, respectively. He worked at Qualcomm from 2008 to 2017 before joining Google as the Video Codec Lead, spearheading advancements in video compression technology.

Dr. Yilin Wang is a staff software engineer in the Media Algorithms team at YouTube/Google. He has spent the last ten years improving YouTube's video processing and transcoding infrastructure and building video quality metrics. Besides his video engineering work, he is also an active researcher in video quality related areas, and has published papers in CVPR, ICCV, TIP, ICIP, etc. He received his PhD from the University of North Carolina at Chapel Hill in 2014, working on topics in computer vision and image processing.

Dr. Yeping Su is a software engineer at YouTube. Yeping specializes in video codecs and processing, with a focus on improving YouTube’s video processing infrastructure. Previously he worked at Apple, Sharp Labs of America, and Technicolor. Before that he received a Ph.D. in Electrical Engineering from the University of Washington.

Onur G. Guleryuz is a Software Engineer at Google working on machine learning and computer vision problems. Prior to Google he worked at LG Electronics, Futurewei, NTT DoCoMo, and Seiko-Epson, all in Silicon Valley. Before coming to Silicon Valley in 2000, he served as an Asst. Prof. with the NYU Tandon School of Engineering in New York.

He received B.S. degrees in electrical engineering and physics from Bogazici University, Istanbul, Turkey in 1991, the M.S. degree in engineering and applied science from Yale University, New Haven, CT in 1992, and the Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign (UIUC), Urbana, in 1997. He received the National Science Foundation Career Award, the IEEE Signal Processing Society Best Paper Award, the IEEE International Conference on Image Processing Best Paper Award, the Seiko-Epson Corporation President’s Award for Research and Development, and the DoCoMo Communications Laboratories President’s Award for Research.

He has served on numerous panels, conference committees, and media-related industry standardization bodies. He has authored an extensive number of refereed papers and granted US patents, and has made leading-edge contributions to products ranging from mobile phones to displays and printers. He has been an active member of the IEEE Signal Processing Society.

Mr. Joe Young graduated from the University of Washington with a B.S. degree in Computer Engineering and an M.S. in Electrical Engineering. He has extensive experience in the field of video compression, having developed both software and hardware-based encoders and transcoders at a variety of companies including Motorola Mobility and Google. Currently, Joe works at YouTube, where he focuses on video acceleration and coding efficiency and contributes to the AV1 and AVM video standards.

Dr. Xin Zhao is currently a principal researcher and the manager of multimedia standards with Tencent Media Lab, based in California, United States. Prior to joining Tencent, he was a Staff Engineer with Qualcomm, California, United States. He received the B.S. degree in electronic engineering from Tsinghua University in China, and the Ph.D. degree in computer applications from the Institute of Computing Technology, Chinese Academy of Sciences. Dr. Xin Zhao has been actively contributing to the development of multiple video standards for over 15 years, with hundreds of technical proposal adoptions and around 60 paper publications. He is a senior member of IEEE and now leads a team actively contributing to the next-generation open video codec within AOM, known as AVM.

Presentations:

Talk-01
Title: Introduction to the Alliance for Open Media
Speaker: Dr. Ioannis Katsavounidis (Meta)
Abstract: We will start by sharing the history of the Alliance for Open Media (AOM), its current members, group structure and the standards it has already established (AV1 and AVIF).

Talk-02
Title: SIWG / SVT-AV1: 15-Month Milestones and Achievements
Speaker: Mr. Hassene Tmar (Meta)
Abstract: The SIWG has continued to work on a product-level AV1 implementation since the release of SVT-AV1 1.0. This presentation will highlight the progress made in improving the tradeoff between compression and computational efficiency across various use cases, including VOD, Live, and RTC, as well as future plans for the group.

Talk-03
Title: AV1 Deployment at Meta
Speaker: Dr. Ryan Lei (Meta)
Abstract: AV1 was the first-generation royalty-free coding standard developed by the Alliance for Open Media, of which Meta is one of the founding members. Since AV1 was released in 2018, we have worked closely with the open source community to implement and optimize AV1 software decoders and encoders. Early in 2022, we believed AV1 was ready for delivery at scale for key VOD applications such as Facebook (FB) Reels and Instagram (IG) Reels. Since then, we have been delivering AV1-encoded FB/IG Reels videos to selected iPhone and Android devices. After the rollout, we observed engagement wins, playback quality improvements, and bitrate reductions with AV1. In this talk, we will share our journey on how we enabled AV1 end-to-end production and delivery. First, we will talk about AV1 production, including encoding configuration and ABR algorithms. Second, since the main delivery challenge is on the decoder and client side, we will talk about what we learned from integrating the AV1 software decoder on both iOS and Android devices. Third, we will talk about how we enabled Mixed Codec Manifest to expand AV1 delivery to low-end Android phones. Finally, we will talk about how AV1 is leveraged within Meta for use cases other than VOD.

Talk-04
Title: AV1 Deployment @ Netflix: Past, Present & Future
Speaker: Dr. Anush Moorthy (Netflix)
Abstract: TBA

Talk-05
Title: YouTube’s AV1 Activities
Speaker: Dr. In Suk Chong (Google)
Abstract: In this talk we will share how Google supports AV1 adoption in the ecosystem. YouTube, Android, and Chrome within Google have worked hard to expedite the adoption of AV1 in the market. Furthermore, Google has designed its own HW AV1 encoder/decoder to exploit the gains that AV1 can provide at scale.

Talk-06
Title: UVQ in Video Compression
Speaker: Dr. Yilin Wang (YouTube/Google), Dr. Balu Adsumilli (YouTube/Google)
Abstract: The Universal Video Quality (UVQ) model is a deep-learning-based video quality metric proposed by Google, which has also been widely applied in production. In this talk, we will focus on the compression-related development of UVQ, share insights from practical applications, and introduce improved UVQ models for compression.

Talk-07
Title: Towards a Next-Gen Video Codec
Speaker: Dr. Debargha Mukherjee (Google)
Abstract: TBA

Talk-08
Title: AOM Common Test Condition Design and Latest Result
Speaker: Dr. Yeping Su (Google), Dr. Ryan Lei (Meta)
Abstract: AV1 is an open, royalty-free video coding format designed by the Alliance for Open Media (AOMedia). Since it was finalized in 2018, AV1 has been supported by major content providers, such as YouTube, Meta and Netflix, and has achieved significant compression efficiency gains over previous generations of codecs. In mid-2019, AOM member companies started research and exploration work on the next-generation coding standard after AV1. The actual development work started at the beginning of 2021 in the codec working group, which is the main forum for discussing and reviewing coding tool proposals from AOM member companies. Meanwhile, the testing subgroup began defining the Common Test Conditions used to evaluate the compression efficiency gain and implementation complexity of the proposed coding tools. In this talk, we will first present a high-level overview of the Common Test Conditions finalized by the testing subgroup. We will focus on their design intent and present some details of a few unique test configurations that are close to production usage but were never supported in any previous coding standard development process. In the second part of the talk, we will present a high-level summary of the latest compression efficiency results achieved by the Alliance.

Talk-09
Title: Overview of Coding Tools Under Consideration in AVM
Speaker: Dr. Debargha Mukherjee (Google), Dr. Xin Zhao (Tencent), Dr. Onur Guleryuz (Google), Mr. Joe Young (Google)
Abstract: In this talk we will present a high-level overview of the video coding tools that are under consideration for inclusion in the next-gen AOM codec. The tools discussed are either included in AVM (the AOM Video Model) or are under discussion in the Codec Working Group.