The main objectives of the TAPESTRIES project were to develop both subjective and objective quality assessment procedures for the performance evaluation of digital multimedia services. A major goal of the project was to develop an automated quality assessment tool able to track the scene-dependent quality variations found in MPEG-2 encoded television pictures. The aim was to provide a system that allows operators to monitor the quality of their delivered services continuously and to compare the performance of different commercially available MPEG-2 encoders.
A further objective of the project was to employ the Single Stimulus Continuous Quality Evaluation (SSCQE) procedure, developed in the earlier RACE MOSAIC project, to determine the minimum encoded bit-rate requirements for different types of thematic programme material such as Entertainment, Cartoons, and Sports.
TAPESTRIES also set itself the objective of developing a test methodology for 3D television services that can measure both the enhanced experience of viewing in 3D and unwanted effects such as viewer headaches and eyestrain.
Finally, an important goal of the project was to support the MPEG-4 testing process by organising and conducting competition and verification tests. To this end, new test methodologies suited to evaluating MPEG-4 functionalities were also developed by the project. Test support was also provided to other ACTS projects to assist with the performance assessment of their developed systems.
The SSCQE methodology was further developed and refined during the TAPESTRIES project. Experiments by TAPESTRIES found that the stability of SSCQE assessment results was affected by the inconsistency of certain observers. A process to reject results from inconsistent observers was developed for the SSCQE test protocol and is now included in a revision of the ITU-R BT.500-8 recommendation. Experiments were also made to investigate the relationship between the SSCQE and the conventional Double Stimulus Continuous Quality Scale (DSCQS) methodology. The results showed that SSCQE results can be related to the DSCQS scale, and this procedure (known as Stage 2 of the SSCQE methodology) has been submitted for inclusion in ITU-R recommendations. TAPESTRIES also proposed a method (known as Stage 3 of the SSCQE methodology) to derive a single quality grade from SSCQE results.
An adaptation of the SSCQE methodology was jointly proposed by the ACTS projects TAPESTRIES and MoMuSys to the MPEG Test Subgroup for the evaluation of video codec error robustness. This is an important parameter for mobile communication networks, where varying propagation conditions can cause high transmission error rates. This method has been applied in MPEG-4 verification tests and it is expected to be included in ITU recommendations on subjective video quality assessment [2, 3]. TAPESTRIES and MoMuSys have also proposed to MPEG modifications of standard assessment methods to adapt them to the evaluation of content-based coding schemes. These modifications are also expected to be included in ITU recommendations. Finally, TAPESTRIES has actively participated in MPEG testing activities and has helped MPEG-4 developers in the evaluation of both new coding schemes and the performance of the MPEG-4 standards.
TAPESTRIES also provided subjective quality assessment support to other ACTS projects for the evaluation of their developed systems and services. Through this collaborative work it has had a wide impact on the results of the ACTS programme.
The results of the subjective tests showed that the statistical multiplexing method does not reduce the average bit-rate required for the programmes in the multiplex and hence will not allow an increased number of programmes to be transmitted in a multiplexed channel without some loss in programme picture quality. For the same number of multiplexed programmes, however, the statistical multiplexer technique does have the advantage over constant bit-rate encoding that it is able to minimise the instantaneous reductions in programme picture quality for busy programme scenes.
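The trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical allocation rule (bit-rate shared in proportion to a per-programme complexity figure) and a hypothetical quality proxy (bit-rate divided by complexity); neither is taken from the TAPESTRIES experiments, and the numbers are invented for illustration.

```python
def constant_allocation(complexities, total_rate):
    """Give every programme the same share of the multiplex, whatever its content."""
    share = total_rate / len(complexities)
    return [share for _ in complexities]


def statistical_allocation(complexities, total_rate):
    """Share the same total in proportion to each programme's current complexity."""
    total_complexity = sum(complexities)
    return [total_rate * c / total_complexity for c in complexities]


def worst_quality(complexities, rates):
    """Hypothetical quality proxy: bit-rate per unit of complexity, worst programme."""
    return min(r / c for r, c in zip(rates, complexities))


# Invented example: four programmes in a 16 Mbit/s multiplex, one showing a busy scene.
complexities = [1.0, 1.0, 1.0, 5.0]
total_rate = 16.0

cbr = constant_allocation(complexities, total_rate)
statmux = statistical_allocation(complexities, total_rate)

print(sum(cbr), sum(statmux))                # same total, so same average bit-rate
print(worst_quality(complexities, cbr))      # the busy programme is starved
print(worst_quality(complexities, statmux))  # the worst case is evened out
```

Both allocations consume the same total, so the average bit-rate per programme is unchanged; only the worst instantaneous case improves, which is the behaviour reported for busy scenes.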
The TAPESTRIES model has been submitted to a competitive evaluation organised by the ITU Video Quality Experts Group (VQEG) and is a candidate to become the world-wide standard for the automated measurement of video quality. If the TAPESTRIES model wins this international competition, in accordance with the rules of VQEG, it will be made available on fair and reasonable terms to third parties wishing to commercially exploit the system.
Comparison of automated model and subjective quality results
A simpler automated system that operates without the uncoded reference video sequence has also been developed in the TAPESTRIES project. This system is well suited to system monitoring applications and has been patented by TAPESTRIES members.
Extraneous audio-video distractions provide a strong negative cue to presence and act to bring the viewer back to reality. A specially designed isolating experimental environment known as the Platform for Immersive Television (PIT) was developed for presence evaluations. Using this approach it has been demonstrated that viewers experience a much higher level of presence when viewing three-dimensional rather than two-dimensional material. Using subjective ratings of perceived depth and viewer eyestrain, the optimum camera filming parameters for stereoscopic services have also been defined.
View from inside the Platform for Immersive Television (PIT)
The evaluation of coding scheme performance is fundamental to the development of MPEG standards. Expert evaluations and subjective tests are typically used to determine performance. Since subjective tests are relatively expensive and time-consuming to implement, they are typically performed only at the beginning and end of the standards development process. Tests at the beginning of the process are used to rank-order the proponent systems, and at the end to provide a reliable evaluation of the chosen standard.
The TAPESTRIES project provided considerable support to the Test Subgroup activities of MPEG: it proposed new test methodologies for MPEG-4 video functionality assessments, ran its own tests on MPEG-4 video, and, until October '98, one of the TAPESTRIES partners was responsible for co-ordinating the Test Subgroup activities.
During this period the Test Subgroup performed a number of tests, including the second round of MPEG-4 competition tests and MPEG-4 verification tests.
TAPESTRIES provided support to these tests by: co-ordinating the entire test process, defining the experimental design, providing test administrators and performing a statistical analysis on the results. Three tests were carried out, corresponding to different ranges of bit rate and criticality of the video material. In each test the video coding efficiency of the two proposals and the MPEG-4 VM was evaluated by using appropriate test procedures. The results of the tests indicated that there was no significant difference between the performance of the VM and the performance of the two video-coding proposals. Based on this conclusion, the Video Subgroup of MPEG decided that neither of the two proposed algorithms would be included in the video VM.
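As an illustration of what "no significant difference" can mean in such a comparison, the sketch below computes a mean score with an approximate 95% confidence interval for two conditions and checks whether the intervals overlap. This is an assumed, simplified criterion and not necessarily the statistical analysis used in the competition tests; the function names and the scores are hypothetical.

```python
import math
import statistics


def mean_and_ci(scores, z=1.96):
    """Return the mean score and the half-width of an approximate 95% interval."""
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, half_width


def intervals_overlap(scores_a, scores_b):
    """True when the two confidence intervals overlap (no clear difference shown)."""
    mean_a, ci_a = mean_and_ci(scores_a)
    mean_b, ci_b = mean_and_ci(scores_b)
    return abs(mean_a - mean_b) <= ci_a + ci_b


# Invented scores on a 0-100 scale for the VM and one proposal.
vm_scores = [62, 58, 65, 60, 63, 59, 61, 64]
proposal_scores = [61, 60, 63, 62, 58, 64, 60, 62]

print(mean_and_ci(vm_scores))
print(mean_and_ci(proposal_scores))
print(intervals_overlap(vm_scores, proposal_scores))
```

Overlapping intervals are a conservative heuristic rather than a formal hypothesis test; a real analysis would normally also account for the experimental design, such as repeated measures per observer.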
MPEG-4 audio verification tests were completed in October '98. These addressed the following applications: narrow-band audio broadcasting, speech coding, and audio on the Internet. The formal tests for narrow-band audio broadcasting were carried out in collaboration with the NADIB (Narrowband Digital Audio Broadcasting) Group. These tests explored the performance of speech and music coders operating at bit rates in the range 6 kbit/s to 24 kbit/s, including scalable codec options. The results showed that a significant improvement in quality can be offered in relation to conventional analogue AM broadcasting and that scalable coders offer superior performance over simulcast networks.
The verification tests on speech coding evaluated the performance of MPEG-4 speech codecs against available standards over three ranges of bit rates from 2 kbit/s up to 18 kbit/s, including scalable options. The results showed that overall MPEG-4 codecs are competitive with existing standards and that at very low bit rates (up to 4 kbit/s) MPEG-4 demonstrated better performance.
The verification tests for audio Internet applications explored the performance of speech and music coders operating in the bit rate range 6 kbit/s to 24 kbit/s, including scalable codec options, different ranges of bit rates and different codecs. Due to the complexity of the test design it is difficult to summarise the results. The main conclusions of these tests, however, were that AAC audio coding provided significantly better audio quality than MP3 and that scalable AAC performed better than existing standards. More details on these tests can be found in the MPEG-4 Audio verification test report on Audio on Internet (ISO/IEC JTC1/SC29/WG11 N2425).
In October '98 a complete plan for the MPEG-4 video verification tests was defined. It included testing of error robustness, content-based coding, and scalability [8, 9, 10].
From the point of view of subjective evaluation, the most critical MPEG-4 video functionalities were error robustness and content-based coding. Artefacts due to transmission errors are sparse, highly variable in occurrence, duration and intensity, and at very low bit rates may be masked by compression impairments. Artefacts due to content-based coding may be concentrated on specific areas of the scene (e.g. object contours, texture of particular objects), and the impact of the impairment of an object depends on the displayed background. For these two functionalities TAPESTRIES and MoMuSys proposed two new testing methods [11, 12]: the Simultaneous Double Stimulus for Continuous Evaluation (SDSCE) method and the object-based evaluation method. SDSCE is derived from the SSCQE method. SSCQE is suitable for evaluating sparse impairments, but since no references are used, it is suitable neither for evaluating fidelity nor for distinguishing a particular source of artefacts.
An important requirement for the MPEG tests was to evaluate the fidelity of MPEG-4 coded sequences, as channel errors may cause whole objects to disappear without producing other appreciable artefacts. A further requirement was to evaluate the annoyance of residual transmission impairments without taking into account the annoyance of the coding impairments themselves. To meet these requirements it was decided to compare compressed sequences affected by transmission errors against the same compressed sequences without transmission errors. In these tests a panel of subjects watched the two sequences simultaneously on the same screen, as illustrated in the figure below. The observers were requested to identify the differences between the two sequences and to judge the fidelity of the video information using the slider on a handset voting device. When the fidelity was perfect, the slider was to be moved to the top of the scale (coded 100); when the fidelity was null, it was to be moved to the bottom of the scale (coded 0). During these tests the subjects were aware of which picture sequence was the reference and which was the one on which they needed to express an opinion.
Typical screen presentation during an SDSCE subjective test
Experiments were made by TAPESTRIES to confirm the validity of this new test procedure, which has now been adopted by MPEG for the evaluation of MPEG-4 systems and is expected to be included in ITU recommendations [2, 3].
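The continuous 0 to 100 fidelity votes collected in an SDSCE session can be summarised in many ways. The sketch below shows one simple possibility: averaging the slider positions of all observers at each time sample. The function name, the sampling scheme and the vote values are assumptions for illustration and do not reproduce the analysis defined in the test plans.

```python
def mean_fidelity_trace(votes_by_observer):
    """Average the 0-100 slider positions of all observers at each time sample."""
    n_samples = len(votes_by_observer[0])
    if any(len(votes) != n_samples for votes in votes_by_observer):
        raise ValueError("all observers must have the same number of samples")
    for votes in votes_by_observer:
        if any(not 0 <= v <= 100 for v in votes):
            raise ValueError("slider values must lie between 0 (null) and 100 (perfect)")
    # zip(*...) regroups the data from per-observer rows into per-sample columns.
    return [sum(column) / len(column) for column in zip(*votes_by_observer)]


# Invented votes: three observers, five time samples each.
votes = [
    [100, 100, 60, 20, 90],
    [100, 90, 50, 30, 100],
    [100, 95, 70, 10, 95],
]

trace = mean_fidelity_trace(votes)
worst_sample = min(range(len(trace)), key=trace.__getitem__)

print(trace)         # mean fidelity over time
print(worst_sample)  # index of the sample where fidelity was judged lowest
```

A dip in the averaged trace marks the moment at which observers judged the error-affected sequence to depart most from its error-free reference.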
The second modification proposed to existing subjective assessment methods for MPEG-4 applications is related to the evaluation of object-based functionalities. The reason for this modification is twofold. First, the perceived qualities of the objects in a scene interact with one another. Second, a content-based coded scene can be presented as composed by its author, or modified by using different combinations of objects from the original scene to create a new scene.
TAPESTRIES and MoMuSys proposed to evaluate content-based functionalities (object scalability and object-based quality scalability) in two test runs. In the first run the overall quality of the scene is evaluated, and in the second the quality of a single object in the scene is evaluated. In the first run standard ITU methods are used, whilst in the second run a new test method is used to evaluate the efficiency of the object texture and shape coding by displaying it on a grey background. This is illustrated in the figure below. This approach eliminates the interaction between the quality of the object under evaluation and the spectral characteristics of the other objects in the same scene.
A test to validate the proposed modifications was carried out by TAPESTRIES in the framework of WP4 Activity 2 'Evaluation of MPEG-4 applications' and this procedure is expected to be included in ITU-T recommendations.
Testing of content-based functionalities
[2] CSELT (Italy), France Télécom (France) - Proposed modifications to Recommendation P.910, ITU-T SG12 Delayed Document 085, November 1998
[3] France, Italy - Draft proposal for modification of recommendation ITU-R BT.500 - A novel method for error robustness evaluation in video communication: the simultaneous double stimulus for a continuous evaluation, ITU-R Doc.10-11Q/30 April 1999
[4] AC055/CSE/DS/I/010 - Experimental results of MPEG-4 competition tests
[5] ISO/IEC JTC1/SC29/WG11/MPEG98/N2276 - Report on the MPEG-4 audio NADIB verification tests, July 1998
[6] ISO/IEC JTC1/SC29/WG11/MPEG98/N2424 - Report on the MPEG-4 speech codec verification tests, October 1998
[7] ISO/IEC JTC1/SC29/WG11/MPEG98/N2425 - MPEG-4 Audio verification test results: Audio on Internet, October 1998
[8] ISO/IEC JTC1/SC29/WG11/MPEG98/N2488 - Revised Test Conditions for Video Verification Test On Content-Based Coding, October 1998
[9] ISO/IEC JTC1/SC29/WG11/MPEG98/N2489 - Revised test conditions for video verification test on scalability, October 1998
[10] ISO/IEC JTC1/SC29/WG11/MPEG98/N2490 - Error Resilience Verification Test Plan, October 1998
[11] AC055/CSE/DS/R/012/b - Evaluation of selected applications provided by content-based coding schemes
[12] AC055/CSE/DS/R/009 - Subjective assessment methodologies for use in MPEG-4 validation test
[13] AC055/EBU/DR/22 - Evaluations of bit-rate reduced services and review of standardisation activities