Abstract
Video restoration is an essential component of multimedia processing, with applications in archival preservation, media production, streaming, and consumer content. This tutorial offers a structured overview of the field, tracing developments from classical filtering to deep learning and generative approaches.
We structure the discussion around four perspectives: enhancement of perceptual detail, restoration of chromatic fidelity, adaptation to unknown degradations, and development of unified restoration frameworks. The tutorial concludes with an outlook on emerging research directions, including adaptation of foundation models, test-time refinement, generative architectures, and multimodal guidance.
Timeline (3 Hours)
Hour 1
Introduction & Motivation
- Welcome and tutorial overview
- Importance of video restoration in multimedia (archival, production, streaming)
- Typical degradations in videos (noise, blur, compression artifacts, chromatic distortions, etc.)
- Challenges: mixed/unknown degradations, generalization, deployment
Historical Foundations
- Classical approaches: filtering, deblurring, deblocking
- Early machine learning methods for restoration
- Limitations of handcrafted pipelines
- Transition to data-driven methods
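To ground the "classical filtering" starting point above, here is a minimal sketch of a motion-agnostic temporal averaging denoiser — a toy stand-in for the pre-deep-learning pipelines the tutorial covers. The function name and windowing scheme are illustrative, not from any specific method; real classical pipelines add motion compensation to avoid ghosting on moving content.

```python
import numpy as np

def temporal_average_denoise(frames, window=3):
    """Toy classical video denoiser: replace each frame with the mean of
    itself and its temporal neighbors (sliding window, clipped at the
    sequence boundaries). Illustrative only; no motion compensation."""
    frames = np.asarray(frames, dtype=np.float64)
    out = np.empty_like(frames)
    half = window // 2
    n = len(frames)
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        out[t] = frames[lo:hi].mean(axis=0)  # average over the window
    return out
```

For static content, averaging k frames of i.i.d. noise reduces its standard deviation by roughly a factor of sqrt(k) — which is also why such filters fail on motion, motivating the data-driven methods discussed next.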
Hour 2
Semantic Detail Enhancement
- Problem space: noise, texture loss, structural artifacts
- Representative models: CNNs, GANs, and diffusion models
- Applications: compression artifact reduction, denoising, deblurring, etc.
Chromatic Fidelity Restoration
- Challenges in color distortions & temporal consistency
- Neural color transfer and deep recolorization models
- Applications: film recolorization, consumer video enhancement
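As a baseline for the neural color-transfer methods listed above, a classical global color transfer can be sketched as per-channel mean/standard-deviation matching between a target frame and a reference (Reinhard-style statistics matching; shown here on RGB for simplicity, though the original formulation used a lab-like opponent color space). The function name is illustrative.

```python
import numpy as np

def match_color_statistics(target, reference, eps=1e-6):
    """Global color transfer baseline: shift and scale each channel of
    `target` so its mean and standard deviation match `reference`.
    Inputs are H x W x C float images in [0, 1]."""
    t = np.asarray(target, dtype=np.float64)
    r = np.asarray(reference, dtype=np.float64)
    # Per-channel statistics over all pixels.
    t_mu, t_sigma = t.mean(axis=(0, 1)), t.std(axis=(0, 1))
    r_mu, r_sigma = r.mean(axis=(0, 1)), r.std(axis=(0, 1))
    out = (t - t_mu) / (t_sigma + eps) * r_sigma + r_mu
    return np.clip(out, 0.0, 1.0)
```

Applied frame by frame, this baseline has no notion of temporal consistency — statistics drift from frame to frame causes flicker, which is precisely the failure mode the deep recolorization models in this section address.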
Hour 3
Unified Restoration Frameworks
- Motivation for multi-task architectures
- Representative frameworks
- Discussion of failure cases and domain shifts
Adaptation to Unknown Degradations
- Foundation-model adaptation for video restoration
- Multimodal integration (e.g., vision–language guidance)
- Open research problems and future opportunities
Target Audience
This tutorial is intended for researchers, practitioners, and students in computer vision, multimedia, and machine learning. Participants will gain a structured taxonomy of restoration approaches, organized around general principles and recent research trends. They will also develop a critical understanding of current limitations and learn about emerging approaches, from foundation-model adaptation to multimodal integration, that are defining the next phase of progress.
Speakers

Guan-Ming Su
Guan-Ming Su is a Director at Dolby Laboratories, leading image technology development. He holds a Ph.D. in Electrical and Computer Engineering from the University of Maryland. His expertise spans multimedia coding, computer vision, 3D imaging, HDR, and machine learning. He has previously held roles at Marvell and ESS Technology and has contributed to Dolby Vision, video codec architecture, and advanced imaging systems.

Aupendu Kar
Aupendu Kar is a Senior Researcher at Dolby Laboratories, specializing in computer vision and deep learning. His work focuses on low-level vision tasks such as image super-resolution, dehazing, rain removal, and underwater enhancement. He also explores learning frameworks grounded in physical models and has contributed to medical image analysis.

Shiv Gehlot
Shiv Gehlot is a Senior Researcher at Dolby Laboratories, working on computer vision, multimodal learning, and generative AI. He holds a Ph.D. from IIIT-Delhi, with current research focused on video and audio processing. His recent work includes diffusion-based frameworks for diverse video enhancement tasks, deep audio quality prediction, and contributions to generative video coding.

Sutanu Bera
Sutanu Bera is a Senior Researcher at Dolby Laboratories, working on video restoration and low-level image processing using deep learning. He holds a Ph.D. from IIT Kharagpur and has prior research experience in low-dose CT denoising, GAN-based restoration, and invertible networks.