Jun 5 (Wed) @ 1:15pm: "Advancements in Higher Order Ambisonics Compression and Loss Concealment Techniques," Mahmoud Namazi, ECE PhD Defense
Zoom Meeting – https://ucsb.zoom.us/j/89047021771?pwd=fUTsh4onvevf64BC7pJSpHr3CJhTzb.1
Abstract
Virtual reality's resurgence has intensified interest in higher order Ambisonics (HOA), renowned for its ability to recreate spatial audio across diverse speaker setups, which is crucial for many applications. Because HOA represents a 3D soundfield using spherical harmonics, it has become a popular format for spatial audio storage and transmission. However, an HOA representation can comprise up to 64 audio channels, so effective compression methods are essential for enabling immersive experiences. This talk addresses this challenge by proposing new algorithms tailored to the compression and loss concealment of HOA signals. The first part of the talk presents a new adaptive framework for HOA compression that considers both the reconstructed data of the previous frame and the data of the current frame to obtain a more relevant set of singular value decomposition (SVD) basis vectors spanning the null space. These vectors extend the set of dominant basis vectors available at the decoder at little bitrate cost, leading to significantly improved audio quality.
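As background for the SVD-based compression discussed above, the following minimal Python/NumPy sketch shows a plain frame-based SVD decomposition of an HOA frame into dominant spatial basis vectors and their audio signals. It is not the adaptive framework from the talk; the channel count, frame length, kept rank, and function names are illustrative assumptions.

import numpy as np

def decompose_hoa_frame(frame, num_dominant):
    """Split an HOA frame (channels x samples) into dominant spatial basis vectors and audio signals via SVD."""
    U, s, Vt = np.linalg.svd(frame, full_matrices=False)
    spatial_basis = U[:, :num_dominant]                          # kept spatial directions
    audio_signals = s[:num_dominant, None] * Vt[:num_dominant]   # dominant audio signals
    return spatial_basis, audio_signals

def reconstruct_hoa_frame(spatial_basis, audio_signals):
    """Decoder-side reconstruction from the transmitted components."""
    return spatial_basis @ audio_signals

# Example: a 3rd-order HOA frame has (3 + 1)**2 = 16 channels (frame length 1024 assumed).
rng = np.random.default_rng(0)
frame = rng.standard_normal((16, 1024))
basis, audio = decompose_hoa_frame(frame, num_dominant=4)
approx = reconstruct_hoa_frame(basis, audio)   # low-rank approximation of the frame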
The second part of the talk focuses on low-delay HOA compression. Modern codecs use a combination of inter-channel and inter-frame linear predictors, or combine frame-based SVD with the modified discrete cosine transform (MDCT). This talk will show that lower delay and better bitrate performance can be achieved by instead applying an adaptive SVD transform for inter-channel decorrelation, which relies on previously decoded data rather than the current time samples, together with linear predictive coding (LPC) and cascaded long-term prediction (which can capture the periodic components of polyphonic signals) to capture short-term and long-term temporal correlations, respectively.

The third part of the talk focuses on loss concealment for HOA. Current loss concealment methods essentially apply a predictor, trained on past and future data, to predict the lost frame. However, such methods do not consider the spatial aspects of HOA. The talk will show how significant improvements can be made by decorrelating the signal using SVD, treating the audio aspect with a predictor and the spatial basis vectors with sample-by-sample interpolation, and then recombining the audio and spatial aspects of the signal to arrive at a superior estimate of the lost frame.
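To make the loss-concealment idea concrete, here is a hedged sketch of separating the audio and spatial aspects with an SVD, concealing them differently, and recombining them. The simple averaging predictor and the linear, sample-by-sample interpolation of basis vectors below are illustrative assumptions, not the method presented in the talk.

import numpy as np

def conceal_lost_frame(prev_frame, next_frame, frame_len):
    """Estimate a lost HOA frame (channels x frame_len) from its neighbouring frames."""
    # Decorrelate the surrounding frames with SVD.
    U_prev, s_prev, Vt_prev = np.linalg.svd(prev_frame, full_matrices=False)
    U_next, s_next, Vt_next = np.linalg.svd(next_frame, full_matrices=False)

    # Audio aspect: stand-in predictor that averages the decorrelated audio
    # signals of the neighbouring frames (a real system would use a trained predictor).
    audio_prev = s_prev[:, None] * Vt_prev
    audio_next = s_next[:, None] * Vt_next
    audio_est = 0.5 * (audio_prev[:, -frame_len:] + audio_next[:, :frame_len])

    # Spatial aspect: interpolate the basis vectors on a sample-by-sample basis.
    lost = np.empty((U_prev.shape[0], frame_len))
    for n, a in enumerate(np.linspace(0.0, 1.0, frame_len)):
        U_n = (1.0 - a) * U_prev + a * U_next    # per-sample spatial basis estimate
        lost[:, n] = U_n @ audio_est[:, n]       # recombine audio and spatial aspects
    return lost

# Example with assumed 16-channel (3rd-order) HOA frames of 1024 samples each.
rng = np.random.default_rng(1)
prev_frame = rng.standard_normal((16, 1024))
next_frame = rng.standard_normal((16, 1024))
estimate = conceal_lost_frame(prev_frame, next_frame, frame_len=1024)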
Bio
Mahmoud Namazi is a PhD Candidate in the Department of Electrical and Computer Engineering at the University of California, Santa Barbara, where he is advised by Professor Kenneth Rose. He graduated with a triple major from George Mason University, earning one B.S. in Electrical Engineering and another B.S. in Mathematics and Physics. His research interests are in the fields of audio compression, information theory, and machine learning.
Hosted by: Professor Kenneth Rose
Submitted by: Mahmoud Namazi <mnamazi@ucsb.edu>