Final Report Summary - MIC_RS_TM (Multidimensional Image Coding. Applications to Remote Sensing and Telemedicine)
Project context and objectives
The research project has focused on the coding and transmission of massive images, i.e. images acquired by sensors able to capture data with more than three dimensions. Such multi-dimensional images are currently available in the medical environment, including functional magnetic resonance imaging (fMRI) in 4D (3D spatial + 1D temporal), and also in remote-sensing scenarios, e.g. through the Atmospheric Infrared Sounder (AIRS), which continuously captures ultra-spectral images of the Earth's surface, producing ultra-spectral video (2D spatial + 1D spectral + 1D temporal). With the ever-increasing number of sensors acquiring massive images on airplanes, on satellites, and in medical centres, institutions need to deal properly with such data.
There are few studies tackling the coding and appropriate manipulation of these massive data sets. Professor Michael W. Marcellin (Department of Electrical and Computer Engineering, University of Arizona, USA) and the Group on Interactive Coding of Images (GICI, Department of Information and Communications Engineering, Universitat Autònoma de Barcelona, Spain) have established themselves as pioneers in this nascent field. The GICI group has hosted Professor Marcellin for 18 months with the main goal of investigating the coding and transmission of massive images.
Project work
The core of the project has been built upon the JPEG2000 standard (ISO/IEC 15444 and ITU-T Rec. T.800), for which Prof. Marcellin was one of the key developers, and upon many applications thereof. The study has been undertaken on digital medical images provided by Parc Tauli Health Corporation (Sabadell, Spain), and on remote-sensing images provided by the Center for Ecological Research and Forestry Applications (Bellaterra, Spain).
The project has helped to raise European excellence and competitiveness (in Spain, in particular), and is expected to continue through further exchange collaboration, as demonstrated by the numerous activities that have already taken place and those planned for the future. In this respect, Professor Marcellin has visited 10 institutions in Spain and has begun joint collaborations with some of them (in addition to the hosting group).
Project conclusions
Remote sensing: each year, remote-sensing technologies gather an increasing amount of hyperspectral information, which is processed with ever more sophisticated data-processing methods demanding large amounts of computing resources. Spectral decorrelation is a widely used technique with a significant computational cost, in particular in the image coding context for on-board satellite applications. We have shown that divide-and-conquer strategies mitigate these costs with schemes that provide approximate decorrelation at a fraction of the original cost, as well as improved component scalability and lower memory requirements. We have reported, for three practical cases of divide-and-conquer decorrelation strategies for hyperspectral images, the benefits and advantages of these strategies, and given insights into their applicability to adjacent fields, which we hope may foster their use in other research areas.
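To make the divide-and-conquer idea concrete, the following minimal Python sketch (an illustration under assumed parameters and function names, not the project's actual implementation) splits the spectral bands into small clusters and applies a Karhunen-Loève transform to each cluster independently, so that no full-size covariance matrix ever needs to be computed or stored.

# Minimal sketch of divide-and-conquer spectral decorrelation (illustrative only).
# Instead of computing one Karhunen-Loeve transform over all B bands (whose
# covariance eigendecomposition costs O(B^3)), the bands are split into small
# clusters and each cluster is decorrelated independently, lowering the
# computational and memory cost and improving component scalability.
import numpy as np

def clustered_klt(image, cluster_size=16):
    # image: hyperspectral cube of shape (bands, rows, cols).
    bands, rows, cols = image.shape
    out = np.empty_like(image, dtype=np.float64)
    transforms = []
    for start in range(0, bands, cluster_size):
        cluster = image[start:start + cluster_size].reshape(-1, rows * cols)
        mean = cluster.mean(axis=1, keepdims=True)
        centered = cluster - mean
        cov = centered @ centered.T / (rows * cols)   # small covariance matrix
        _, eigvecs = np.linalg.eigh(cov)              # cheap for small clusters
        klt = eigvecs[:, ::-1].T                      # sort by decreasing variance
        out[start:start + cluster.shape[0]] = (klt @ centered).reshape(-1, rows, cols)
        transforms.append((klt, mean))                # kept for reconstruction
    return out, transforms

Because each cluster is decorrelated on its own, groups of components can later be decoded independently, which is the source of the improved component scalability and lower memory requirements mentioned above.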
Medical imaging: considering the number and type of medical tests performed in a medical centre, computed tomography (CT) is among the most popular types of medical imaging in use today. 3D CT images are obtained from a series of X-ray exposures. Unfortunately, this radiation exposure increases the risk of inducing cancer in a patient. The radiation dose is therefore often reduced by the radiologist to minimise that risk, which can result in a reduction of image quality. To enhance image quality, different noise filters have been developed. Given the huge volume of CT data, CT image coding is a relevant topic for practical medical scenarios and for research. We have presented a new coding scheme for CT images. Our proposal aims to encode CT images acquired at low radiation dose while maintaining compatibility with the Digital Imaging and Communications in Medicine (DICOM) protocol. Because of this compatibility requirement, our proposal is based on the JPEG2000 coding system. The proposed coding scheme includes a noise-filtering (NF) stage, designed to improve image quality and reduce image noise, as well as a multi-component decision-maker stage that determines whether the multi-component transform needs to be carried out. We put forward a model for the correlation (r) of images as a function of the acquisition parameters. A study of the influence of correlation on 3D coding performance shows, for the evaluated corpus, that for images with r > 0.90 the RWT and RHAAR along the z dimension provide significant coding gain.
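The short sketch below illustrates how such a decision-maker stage might operate; the function names and the use of the Pearson correlation between adjacent slices are illustrative assumptions, with only the r > 0.90 threshold taken from the text above.

# Illustrative sketch of a multi-component decision-maker (assumed logic, not the
# project's code): the average correlation r between adjacent CT slices is
# measured, and the multi-component transform along the z dimension is applied
# only when r exceeds the threshold reported for the evaluated corpus.
import numpy as np

def adjacent_slice_correlation(volume):
    # volume: CT volume of shape (slices, rows, cols). Returns the mean Pearson r
    # between each pair of adjacent slices.
    corrs = []
    for a, b in zip(volume[:-1], volume[1:]):
        a, b = a.ravel().astype(np.float64), b.ravel().astype(np.float64)
        corrs.append(np.corrcoef(a, b)[0, 1])
    return float(np.mean(corrs))

def should_apply_multicomponent_transform(volume, threshold=0.90):
    # Decision-maker stage: transform along z only for highly correlated slices.
    return adjacent_slice_correlation(volume) > threshold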
Experimental results indicate that an NF stage enhances coding performance, providing higher rate-distortion performance and lossless coding gain for all images with respect to no-NF approaches. In practice, the size of the corpus is reduced from 4.46 GB to 1.80 GB with JPEG2000 and to 0.75 GB with our NF+JPEG2000 proposal. When a multi-component transform is carried out, the RWT produces the best coding performance in terms of SNR quality, always outperforming the RHAAR. However, when a specific subset of components needs to be retrieved, NF+JPEG2000 or NF+RHAAR+JPEG2000 can sometimes yield the best rate-distortion performance, depending on the value of r and the number of slices decoded. Finally, we show that our proposal is superior in coding performance to another preprocessing technique presented in 2011. When our proposed strategy is used together with that 2011 preprocessing technique, more than 7 bppps can be saved on average.
Video transmission: rate allocation is of paramount importance in video transmission schemes to optimise video quality. Applications that transmit video over local area networks, the Internet, or dedicated networks may experience variations in channel conditions due to network saturation, TCP congestion, or router failures. We have proposed a rate allocation algorithm for the transmission of JPEG2000 video over time-varying channels, which we call FAST-TVC. The proposed method builds on our previous FAst rate allocation through STeepest descent (FAST) algorithm, extending and exploiting some of its features. The main insight behind FAST-TVC is to employ complexity scalability, together with the roughly linear relation between computational load and number of frames, to re-compute frame rates once a variation in channel capacity takes place.
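As an illustration of this insight, the toy sketch below (built on assumed names and parameters, not the published algorithm) converts a time budget into the number of frames that can be re-optimised, exploiting the roughly linear cost per frame, and redistributes the new per-frame budget whenever the channel capacity changes.

# Toy sketch of the re-allocation idea behind FAST-TVC (illustrative assumptions
# only). Because the cost of the rate allocation grows roughly linearly with the
# number of frames it considers, a time budget translates directly into a window
# of frames that can be re-optimised when the channel capacity changes.
def frames_to_reallocate(time_budget_s, cost_per_frame_s):
    # Complexity scalability: available time -> number of frames re-optimised.
    return max(1, int(time_budget_s / cost_per_frame_s))

def reallocate_rates(pending_rates, new_capacity_bps, frame_period_s,
                     time_budget_s, cost_per_frame_s):
    # pending_rates: rates (in bits) previously assigned to frames not yet sent.
    window = min(len(pending_rates),
                 frames_to_reallocate(time_budget_s, cost_per_frame_s))
    per_frame_budget = new_capacity_bps * frame_period_s
    # In the real algorithm each frame's rate depends on its rate-distortion
    # characteristics; here every re-optimised frame simply receives the new budget.
    return [per_frame_budget] * window + list(pending_rates[window:])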
Experimental results indicate that FAST-TVC achieves virtually the same coding performance as the optimal Viterbi algorithm (when the Viterbi algorithm is computationally feasible). When the server needs to control the resources dedicated to the rate allocation algorithm depending on system load or other indicators, FAST-TVC can use one of three proposed strategies. The first strategy, named "constant tc", allots a constant execution time to the algorithm. Although this strategy achieves a non-negligible gain in coding performance with respect to a constant-rate strategy, results vary significantly depending on the video sequence, buffer size, and channel conditions. The second strategy, named "estimated tc", estimates the total time that FAST-TVC requires to finish its execution. This allows FAST-TVC to achieve more consistent results, but does not provide any mechanism to reduce computational time when the server is busy. The "weighted tc" strategy is a compromise between the previous two: it achieves virtually the same results as "estimated tc" while reducing computational load significantly. Experimental results evaluating the computational costs of FAST-TVC indicate that very few computational resources are expended. These characteristics make FAST-TVC a suitable method for the transmission of pre-encoded JPEG2000 video in real-world applications.
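The sketch below summarises one possible reading of the three time-budget ("tc") strategies; the estimation model and the load-based weighting are assumptions introduced for illustration only.

# Hedged sketch of the three time-budget strategies described above; the exact
# formulations used by FAST-TVC are defined in the corresponding publication.
def constant_tc(fixed_time_s, **_):
    # "constant tc": the allocator always receives the same execution time.
    return fixed_time_s

def estimated_tc(remaining_frames, cost_per_frame_s, **_):
    # "estimated tc": estimate the time FAST-TVC needs to finish its execution,
    # using the roughly linear cost per frame.
    return remaining_frames * cost_per_frame_s

def weighted_tc(remaining_frames, cost_per_frame_s, server_load, **_):
    # "weighted tc": compromise that scales the estimate down when the server is
    # busy (server_load in [0, 1] is an assumed load indicator).
    return (1.0 - server_load) * estimated_tc(remaining_frames, cost_per_frame_s)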