Image and Video Communication
The next generation of video networks will deliver images and video content to mobile users, leveraging rapidly expanding wireless networks. By many predictions, multimedia traffic, especially video, will saturate wireless capacity in the next few years. A natural approach to solving capacity problems in wireless is to leverage the benefits provided by MIMO communication. Indeed, much of my group’s work has been on making the most out of MIMO (see the related research pages for example). The physical layer, though, is just one part of the solution. We believe that the next generation of wireless networks will need to exploit structure in the multimedia data to enable sensible tradeoffs between source rate and channel rate.
Our recent work has focused on developing distortion models for common image and video compression strategies. We are interested in distortion due to quantization as well as distortion due to bit errors caused by the channel. With such models it becomes possible to adjust the source rate (which determines the amount of quantization) and the channel rate (which determines the bit error rate profile). In effect, such models provide practical rate-distortion curves. A challenge in creating these models is that compression strategies used in practice include features like arithmetic coding, run-length coding, transform coding, and motion compensation. In our recent work we developed distortion models for progressive JPEG and MPEG-4 video that accurately predict PSNR under a variety of parameters.
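For readers unfamiliar with the quality measure that these models predict, the following is a minimal sketch of how PSNR is computed from a reference image and a distorted copy. The function names and the flat-list pixel representation are assumptions made for illustration; they are not taken from our publications.

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(peak ** 2 / err)
```

Both quantization and channel bit errors raise the mean squared error and therefore lower the PSNR, which is why a single PSNR prediction can drive the choice of source rate and channel rate together.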
In the design of video communication systems, it is important to keep in mind that the ultimate receiver of visual information, be it images or videos, is the human brain. In our ongoing work with LIVE, we are developing distortion models that predict the impact of quantization and bit error rate on perceptual measures of quality. We are using these models to develop rate adaptation protocols that automatically trade off source and channel coding, not just in the wireless link but also in the network.
Several of our recent publications are summarized below.
This paper shows the power of our recently developed distortion model for JPEG transmission to implement adaptive transmission in a wireless communication system. The idea is to divide the JPEG-compressed image into different quality layers, which are transmitted simultaneously from different transmit antennas using unequal transmit power, with a constraint on the total transmit power during any symbol period. The power allocation is determined by the empirical distortion model, and is also a function of the propagation channel. The optimal power allocation gives large improvements in peak signal-to-noise ratio.
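The flavor of unequal power allocation under a total power constraint can be sketched with a toy example. Everything in it is an assumption for illustration: the exponential loss model, the layer weights, the channel gains, and the grid search are hypothetical stand-ins and are not the empirical distortion model or the optimization used in the paper.

```python
import math

def expected_distortion(powers, weights, gains):
    """Toy model (assumption): layer i is lost with probability
    exp(-gain_i * power_i), and losing it costs weight_i in distortion."""
    return sum(w * math.exp(-g * p) for p, w, g in zip(powers, weights, gains))

def allocate_two_layers(total_power, weights, gains, steps=1000):
    """Grid search over splits (p1, total_power - p1) for two layers."""
    best_split, best_cost = None, math.inf
    for i in range(steps + 1):
        p1 = total_power * i / steps
        split = (p1, total_power - p1)  # total power constraint holds by construction
        cost = expected_distortion(split, weights, gains)
        if cost < best_cost:
            best_split, best_cost = split, cost
    return best_split, best_cost
```

With equal channel gains, a call such as allocate_two_layers(4.0, (10.0, 1.0), (1.0, 1.0)) assigns more power to the layer whose loss is more costly, which is the qualitative behavior one expects when a base quality layer matters more than a refinement layer.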
M. F. Sabir, R. W. Heath, Jr., and A. C. Bovik, “Joint Source-Channel Distortion Modeling for MPEG-4 Video,” IEEE Trans. on Image Processing, vol. 18, no. 1, pp. 90-105, Jan. 2009.
This paper establishes a distortion model for MPEG-4 video that accounts for quantization and channel errors at different source coding rates and channel bit error rates. An important feature of this model is that it includes practical aspects of video compression such as transform coding, motion compensation, and variable length coding. The model has low complexity, making it amenable to real-time optimization of video transmission over wireless links.
The structural similarity (SSIM) index is a full-reference metric for measuring image quality that correlates reasonably well with human perception. In this paper, we derive bounds on the SSIM index as a function of quantization rate for fixed-rate uniform quantization of image discrete cosine transform (DCT) coefficients with different source models. The proposed bounds are found to be reasonable for a large set of natural images.
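As background, the SSIM index between two image patches can be sketched as follows. This is a simplified, single-window version that assumes 8-bit pixels, population statistics, and the commonly used constants K1 = 0.01 and K2 = 0.03; practical SSIM implementations compute the index over local windows and average the results, and the function name here is an assumption for illustration.

```python
def ssim_single_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM over one window, given two equally sized pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population variance and covariance (an assumption; weighted or
    # sample statistics are also used in practice).
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants stabilize the ratios when means or variances are near zero.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

The index equals 1 for identical patches and decreases as luminance, contrast, or structure diverge, so coarser quantization generally lowers it.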
S. S. Channappayya, A. C. Bovik, C. Caramanis, and R. W. Heath, Jr., “Design of Linear Equalizers Optimized for the Structural Similarity Index,” IEEE Trans. on Image Processing, vol. 17, no. 6, pp. 857-872, June 2008.
This paper presents the first design of an equalizer optimized for the SSIM index. The key idea is to reformulate the nonconvex problem as a quasi-convex optimization problem, which admits a tractable solution. We find the coefficients of the SSIM-optimal equalizers and illustrate their performance. The SSIM equalizer gives better SSIM performance (as expected) with complexity comparable to that of MMSE solutions.
M. F. Sabir, R. W. Heath, Jr., and A. C. Bovik, “A joint source channel distortion model for JPEG compressed images,” IEEE Trans. on Image Processing, vol. 15, no. 6, pp. 1349-1364, June 2006.
This paper proposes a distortion model for progressive JPEG compressed images that accounts jointly for quantization and channel bit errors. Of particular interest, this model incorporates practical features of JPEG including Huffman coding, differential pulse-code modulation, and run-length coding. The model is found to provide accurate prediction of peak signal-to-noise ratio, and is flexible enough for optimization of image delivery in communication systems.
S. S. Channappayya, J. Lee, R. W. Heath, Jr., and A. C. Bovik, “Frame Based Multiple Description Image Coding in the Wavelet Domain,” Proc. of the IEEE Int. Conf. on Image Processing, vol. 3, pp. 920-923, Genova, Italy, Sept. 11-14, 2005.
Multiple description codes generated by quantized frame expansions have been shown to perform well on erasure channels when compared to traditional channel codes. In this paper we propose a multiple description image coding scheme in the wavelet domain using quantized frame expansions. We form zerotrees from wavelet coefficients and apply a tight frame operator to the zerotrees. The proposed approach compares favorably with conventional channel coding.
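To illustrate why a frame expansion tolerates erasures, here is a minimal sketch using a three-vector tight frame in two dimensions (often called the Mercedes-Benz frame). It is a toy stand-in chosen for illustration: it omits quantization, wavelets, and zerotrees, and it is not the frame operator used in the paper. All names are hypothetical.

```python
import math

# Three unit directions 120 degrees apart, scaled so that the frame is
# tight with frame bound 1 (the sum of outer products is the identity).
_ANGLES = [math.pi / 2 + k * 2 * math.pi / 3 for k in range(3)]
_SCALE = math.sqrt(2.0 / 3.0)
FRAME = [(_SCALE * math.cos(a), _SCALE * math.sin(a)) for a in _ANGLES]

def analyze(x):
    """Expand a 2-D vector into three frame coefficients (descriptions)."""
    return [f[0] * x[0] + f[1] * x[1] for f in FRAME]

def reconstruct(coeffs):
    """Recover x from coefficients; an erased coefficient is given as None."""
    received = [(f, c) for f, c in zip(FRAME, coeffs) if c is not None]
    if len(received) == 3:
        # Tight frame with bound 1: x is the sum of c_i * f_i.
        return (sum(c * f[0] for f, c in received),
                sum(c * f[1] for f, c in received))
    if len(received) == 2:
        # Any two frame vectors are linearly independent: solve the 2x2
        # system by Cramer's rule.
        (f1, c1), (f2, c2) = received
        det = f1[0] * f2[1] - f1[1] * f2[0]
        return ((c1 * f2[1] - c2 * f1[1]) / det,
                (f1[0] * c2 - f2[0] * c1) / det)
    raise ValueError("at least two coefficients are needed")
```

Because any two of the three coefficients determine the original vector, the loss of a single description does not prevent reconstruction, which is the basic property that multiple description coding exploits on erasure channels.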
We have been collaborating extensively on these topics with Prof. Al Bovik, the LIVE Lab, and the Center for Perceptual Systems. New collaborations in the area of wireless video networks have started with Prof. Constantine Caramanis, Prof. Jeff Andrews, and Prof. Gustavo de Veciana courtesy of our new project on Perceptual Optimization of Wireless Video Networks.
We have been fortunate to have several outstanding sponsors support our research including Intel, Cisco, and the Texas Advanced Technology Program. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the aforementioned sponsors.