Image and video processing research at WSIL and LIVE addresses a wide variety of problems in the design of multimedia communication systems. The primary goal of this research is to bring together the state of the art in wireless communication system design and in multimedia algorithm design, and to use that knowledge efficiently in designing next-generation multimedia communication systems.
A popular and powerful technique for reliably communicating image and video data over error-prone channels is joint source channel coding (JSCC). In JSCC, the goal is to optimize the use of available resources (most commonly bitrate) for source and channel coding while minimizing the end-to-end distortion. Compared with designs that perform source and channel coding separately and sequentially, JSCC methods can achieve a significant reduction in transmission bandwidth or data rate along with a reduction in distortion in the decoded data.
A fundamental step in the design of all JSCC schemes is to estimate the distortion introduced in the coded and transmitted image or video by quantization and channel errors. This distortion estimate is used in the optimization routine to find the optimal bit allocation strategy for source and channel coding. There are a few different ways to estimate the distortion at the transmitter. The most common way of obtaining this estimate in the literature is via simulations and/or operational rate-distortion curves. In these methods, simulations are carried out to determine the amount of distortion in the coded image or video at various source coding rates and channel bit error rates. These distortion values are then used to construct operational rate-distortion curves, which are in turn used to determine the optimal source-channel bit allocation for that particular image or video. These distortion estimation methods are computationally intensive and source-dependent, and hence are not feasible for real-time applications.
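As a rough illustration of the simulation-based approach, the sketch below estimates end-to-end distortion on a small grid of source rates and channel bit error rates. It is a toy under stated assumptions, not the procedure used in this work: the source is a uniform scalar on [0, 1) rather than an image or video, the source coder is a uniform quantizer, the channel is a binary symmetric channel with no channel code, and the names `simulate_distortion` and `curve` are hypothetical.

```python
import random


def simulate_distortion(samples, bits, ber, rng):
    """Mean squared error after uniform quantization and a binary symmetric channel."""
    levels = 1 << bits
    step = 1.0 / levels
    total = 0.0
    for x in samples:
        # Uniform quantizer on [0, 1): index of the cell that contains x.
        index = min(int(x / step), levels - 1)
        # Binary symmetric channel: flip each index bit independently with probability ber.
        for b in range(bits):
            if rng.random() < ber:
                index ^= 1 << b
        # The decoder reconstructs the midpoint of the received cell.
        x_hat = (index + 0.5) * step
        total += (x - x_hat) ** 2
    return total / len(samples)


# Assumed toy source: 5000 uniform samples (stands in for image or video data).
source_rng = random.Random(0)
samples = [source_rng.random() for _ in range(5000)]

# Operational distortion table indexed by (source bits, channel bit error rate).
curve = {
    (bits, ber): simulate_distortion(samples, bits, ber, random.Random(1))
    for bits in (2, 4, 6)
    for ber in (0.0, 0.01, 0.1)
}

for key in sorted(curve):
    print(key, curve[key])
```

Even for this toy source, every grid point requires a full pass over the data, which hints at why the approach becomes costly for real images and videos.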
Another method for distortion estimation is via empirically or analytically developed distortion models that predict the distortion due to quantization and channel errors at various source coding rates and channel bit error rates. These models are usually computationally inexpensive and can be used in the design of real-time JSCC techniques. However, model-based distortion estimation methods are usually less accurate than rate-distortion-based methods, because they often make simplifying assumptions to keep the model mathematically tractable. The performance of the JSCC scheme depends on how accurately the distortion model predicts the distortion. We are currently working on developing high-accuracy joint source-channel distortion models for images and videos, and on designing efficient, low-complexity model-based JSCC methods for image and video communication over MIMO systems.
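To make the model-based idea concrete, here is a minimal sketch of a bit allocation driven by a closed-form distortion model. Everything in it is an assumption chosen for illustration and is not one of the models developed in this work: quantization distortion follows the high-resolution form variance * 2^(-2 * rs), channel protection is an n-fold repetition code with majority decoding over a binary symmetric channel, any residual bit error is assumed to corrupt the sample with distortion 2 * variance, and the function names are hypothetical.

```python
from math import comb


def residual_ber(p, n):
    """Bit error rate after an n-fold repetition code with majority decoding (n odd)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1))


def model_distortion(rs, n, p, variance=1.0):
    """Assumed end-to-end distortion for rs source bits, each repeated n times."""
    d_quant = variance * 2.0 ** (-2 * rs)             # high-resolution quantization model
    p_word = 1.0 - (1.0 - residual_ber(p, n)) ** rs   # probability of at least one residual bit error
    d_error = 2.0 * variance                          # assumed distortion of a corrupted sample
    return (1.0 - p_word) * d_quant + p_word * d_error


def best_allocation(budget, p):
    """Exhaustive search over source bits rs and odd repetition factors n with rs * n <= budget."""
    candidates = [
        (rs, n)
        for rs in range(1, budget + 1)
        for n in range(1, budget + 1, 2)
        if rs * n <= budget
    ]
    return min(candidates, key=lambda c: model_distortion(c[0], c[1], p))


for p in (0.0, 0.01, 0.1):
    rs, n = best_allocation(12, p)
    print(p, (rs, n), model_distortion(rs, n, p))
```

Because the model is evaluated in closed form, the search needs no simulation of the source, which is the property that makes model-based methods attractive for real-time use. The quality of the resulting allocation is only as good as the assumed model.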
In the design of such systems, it is important to keep in mind that the ultimate receiver of visual information, be it images or videos, is the human eye. Further, the recent emergence of strong full-reference image quality measures offers tremendous potential for improving the quality of existing image processing algorithms. Our research takes both of these considerations into account in solving multimedia communication problems.
Current research :
Algorithm design optimized for structural distortion measures : We have used a perceptual distortion metric, the structural similarity (SSIM) index (invented at LIVE), to derive a new linear estimator for zero-mean Gaussian sources corrupted by additive white Gaussian noise (AWGN). This estimator is used in an image denoising application and its performance is compared with that of the traditional linear least squared error (LLSE) estimator. Although images denoised using the SSIM-optimized estimator have a lower peak signal-to-noise ratio (PSNR) than their LLSE counterparts, the SSIM-optimized estimator clearly outperforms the LLSE estimator in terms of the visual quality of the denoised images.
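For readers unfamiliar with the two quantities being compared, the sketch below shows the standard LLSE estimate for a zero-mean source in independent additive noise and a single-window SSIM index. It does not reproduce the SSIM-optimized estimator derived in this work. The 1-D synthetic signal, the stabilizing constants `c1` and `c2`, and the function names are assumptions made for illustration.

```python
import random


def llse_estimate(y, var_x, var_n):
    """LLSE estimate of a zero-mean source from y = x + n with independent noise."""
    gain = var_x / (var_x + var_n)
    return [gain * v for v in y]


def mse(x, y):
    """Mean squared error between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def ssim_index(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM index; c1 and c2 are assumed stabilizing constants."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Assumed synthetic data: unit-variance Gaussian source in unit-variance AWGN.
rng = random.Random(0)
sigma_x, sigma_n = 1.0, 1.0
x = [rng.gauss(0.0, sigma_x) for _ in range(2000)]
y = [v + rng.gauss(0.0, sigma_n) for v in x]
x_llse = llse_estimate(y, sigma_x ** 2, sigma_n ** 2)

print("MSE  noisy / LLSE:", mse(x, y), mse(x, x_llse))
print("SSIM noisy / LLSE:", ssim_index(x, y), ssim_index(x, x_llse))
```

The LLSE gain is chosen to minimize mean squared error, so there is no reason to expect it to maximize SSIM as well. That gap is what motivates designing an estimator around the perceptual metric directly.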
Multiple description coding : Multiple description codes generated by quantized frame expansions have been shown to perform well on erasure channels when compared to traditional channel codes. The design of a multiple description image coding scheme in the wavelet domain using frame expansions is being considered. We form zerotrees from wavelet coefficients and apply a tight frame operator to the zerotrees. Appropriate frame expansion coefficients are grouped to form packets. The performance of this technique was evaluated over an erasure channel and shown to exceed that of a conventional channel coding scheme.
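The idea of a quantized frame expansion surviving erasures can be sketched on a very small example. The code below is an illustration only and not the wavelet-domain scheme described above: it expands a 2-D vector (instead of a zerotree of wavelet coefficients) with a three-vector tight frame at angles 90, 210 and 330 degrees, quantizes each coefficient uniformly, treats each coefficient as one description, and reconstructs by least squares from whatever arrives. The frame choice, step size and function names are assumptions.

```python
import math

# Three unit vectors 120 degrees apart: a tight frame for the plane (assumed choice).
FRAME = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in (90, 210, 330)]


def analyze(x, step):
    """Frame expansion of x followed by uniform quantization of each coefficient."""
    return [step * round((f[0] * x[0] + f[1] * x[1]) / step) for f in FRAME]


def reconstruct(coeffs):
    """Least-squares reconstruction from the received coefficients (None marks an erasure)."""
    received = [(f, c) for f, c in zip(FRAME, coeffs) if c is not None]
    if len(received) < 2:
        raise ValueError("need at least two descriptions to recover a 2-D vector")
    # Normal equations (sum of f f^T) x = sum of c f, solved directly for the 2x2 case.
    a11 = sum(f[0] * f[0] for f, _ in received)
    a12 = sum(f[0] * f[1] for f, _ in received)
    a22 = sum(f[1] * f[1] for f, _ in received)
    b1 = sum(c * f[0] for f, c in received)
    b2 = sum(c * f[1] for f, c in received)
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


x = (0.3, -0.7)
coeffs = analyze(x, step=0.01)
print("all received:", reconstruct(coeffs))
for lost in range(3):
    damaged = [None if k == lost else c for k, c in enumerate(coeffs)]
    print("description", lost, "lost:", reconstruct(damaged))
```

Losing any single description still leaves two linearly independent frame vectors, so the vector is recovered up to quantization error. The redundant third coefficient plays the role that parity would play in a conventional channel code.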
Natural scene statistics (NSS) and its applications in image coding : The statistics of natural scenes in the wavelet domain are accurately characterized by the Gaussian Scale Mixture (GSM) model. The model lends itself easily to analysis, and many applications that use it are emerging (e.g., denoising and watermark detection). In our research, we explore an error-resilient image communication application that combines the GSM model with multiple description coding (MDC). We have derived a rate-distortion bound for GSM random variables, derived the redundancy rate-distortion function, and finally implemented an MD image communication system.
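As a small illustration of what a Gaussian scale mixture is, the sketch below draws scalar samples of the form x = sqrt(z) * u, where u is a standard Gaussian and z is an independent positive multiplier, and compares their kurtosis with that of plain Gaussian samples. The lognormal density for z, the sample size and the function names are assumptions chosen for illustration; the sketch does not reproduce the rate-distortion results mentioned above.

```python
import math
import random


def sample_gsm(n, rng, log_sigma=1.0):
    """Draw n scalar GSM samples x = sqrt(z) * u with u ~ N(0, 1) and lognormal z (assumed)."""
    return [
        math.sqrt(rng.lognormvariate(0.0, log_sigma)) * rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


def kurtosis(xs):
    """Sample kurtosis (fourth central moment over squared variance); 3 for a Gaussian."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((v - mean) ** 2 for v in xs) / n
    m4 = sum((v - mean) ** 4 for v in xs) / n
    return m4 / (m2 * m2)


rng = random.Random(0)
gsm = sample_gsm(20000, rng)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]

print("kurtosis, Gaussian:", kurtosis(gaussian))
print("kurtosis, GSM:     ", kurtosis(gsm))
```

Mixing Gaussians of different scales produces a sharper peak and heavier tails than a single Gaussian, which is the qualitative shape seen in histograms of wavelet coefficients of natural images.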
Journal Publications :