CORDIS - EU research results

Personalised Content Creation for the Deaf Community in a Connected Digital Single Market

Periodic Reporting for period 2 - Content4All (Personalised Content Creation for the Deaf Community in a Connected Digital Single Market)

Reporting period: 2019-03-01 to 2020-11-30

Improving the accessibility of content for the Deaf community is an important goal for both EU governments and broadcast industry regulators across the EU. Although legislation is being used to compel content producers and broadcasters to do so, the cost of producing sign-language content and the negative impact on hearing viewers of having a sign interpreter appear on screen have relegated sign-language programming to late-night slots or to a small number of sign-presented programs.
CONTENT4ALL aims to make more content accessible to the Deaf community by developing the technologies and algorithms needed to capture a sign interpreter in a broadcaster's remote studio, process the captured performance and render it using a state-of-the-art photorealistic 3D virtual human that looks like a real sign interpreter. It envisages a cost-effective mechanism that will encourage broadcasters to provide sign-interpreted content for many programs, regardless of the time of day and without any detrimental impact on hearing users’ quality of experience.
The main goal of CONTENT4ALL is to address the most immediate need of Deaf users and broadcasters: a low-cost way to create sign-interpreted versions of content produced for the hearing. The second major goal of the project is to create datasets and algorithms to enable automated sign-interpreted content creation in the longer term. An exploration of automatic sign-language production is the last objective of the project. The most important benefit of the project is its social impact: a larger pool of content will be made available to Deaf users at low cost to broadcasters, satisfying the upcoming legislative requirements of EU governments. It will also be an enabling technology that allows the Deaf community to become more involved in its own content creation. The benefits are not only societal: financial benefits will also accrue to SMEs in the media production and sign-language translation businesses via the expanded marketplace enabled by the technologies developed in CONTENT4ALL. Finally, other indirect business opportunities that can leverage those technologies can also be foreseen, e.g. sign-language teaching, personalized Virtual Reality content creation for the hearing, etc.
Over the course of the project, the CONTENT4ALL partners achieved all of the project's goals by creating a first Demonstrator capable of transferring a sign language interpreter's body and hand movements and facial expressions onto a photorealistic 3D virtual human (realatar). Originally based on Kinect and Deep Learning algorithms for pose estimation, the system evolved into a tool requiring only an HD camera and a Deep Learning approach to extract 2D parameters from images and lift them into 3D. This made it possible to animate the 3D photorealistic virtual human, which can render 3 different human figures.
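The idea of lifting 2D parameters into 3D can be illustrated with a small geometric sketch. The project describes a Deep Learning approach, which the code below does not reproduce. As a stand-in it uses a classical construction: under an assumed orthographic camera and an assumed known bone length, the depth offset between two connected joints follows from the Pythagorean theorem. The function name `lift_bone`, the coordinates and the bone length are all hypothetical.

```python
import math


def lift_bone(parent_xyz, child_xy, bone_length, toward_camera=False):
    """Place a child joint in 3D given its 2D image position.

    Assumes an orthographic camera (image x, y equal world x, y) and a
    known bone length. Illustrative only; not the project's learned model.
    """
    px, py, pz = parent_xyz
    cx, cy = child_xy
    planar_sq = (cx - px) ** 2 + (cy - py) ** 2
    # Noise can make the projected bone look longer than the bone itself,
    # so the squared depth is clamped at zero.
    depth_sq = max(bone_length ** 2 - planar_sq, 0.0)
    dz = math.sqrt(depth_sq)
    # The sign of dz cannot be recovered from a single view; the caller chooses.
    return (cx, cy, pz - dz if toward_camera else pz + dz)


# Hypothetical shoulder and elbow positions in arbitrary units.
shoulder = (0.0, 0.0, 0.0)
elbow = lift_bone(shoulder, (3.0, 0.0), bone_length=5.0)
print(elbow)
```

The depth-sign ambiguity visible in the `toward_camera` flag is one limitation of this purely geometric construction when only a single camera view is available.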
This work resulted in a series of publications in top-tier journals and conferences; among the outstanding publications are the two best paper awards at the Conference on Computer Vision and Pattern Recognition in 2019 and 2020. Moreover, for the technologies employed in the Demonstrator, CONTENT4ALL received the NAB Technology Innovation Award 2020, which is reserved for innovative projects that are not yet commercialized but show high potential for impact on the broadcasting market.
The other two achievements of CONTENT4ALL are 1) the release of a collection of 200 h of broadcast content together with 20 h of annotated videos, i.e. sign language aligned with subtitles, and 2) the creation of a laboratory proof-of-concept to demonstrate the advances in automatic sign language production.
Thanks to the strong relationship built with the deaf communities in Switzerland, Germany and Flanders, 4 focus groups were held and several online questionnaires in different sign languages (e.g. DSGS, VGT, DGS, …) were distributed to collect user feedback on the different technological components and to improve the technical development continuously and incrementally.
Preliminary contacts with possible customers (broadcasters) were made, and a detailed business plan was derived accordingly.
By the end of the project, CONTENT4ALL had produced an innovative technical system that allows a virtual human (originated and animated by a professional sign language interpreter) to present broadcast television content as a natural sign language interpreter via HbbTV or WebApp. In addition, a proof-of-concept of automatic sign-language generation via the virtual human was created for a limited domain of news, for example sport or programs reporting on COVID.
To deploy such an innovative system, CONTENT4ALL makes use of hardware and software tools to capture a human signer in real time in a studio that can be set up in many locations, even at home, and is therefore called a “Remote Studio”. This is cost-effective compared to a regular broadcast studio and is based on state-of-the-art technologies developed to photo-realistically reproduce the posture, gestures and facial expressions of the human sign interpreter via a 3D photorealistic virtual human. The model of a human is first recorded in a dedicated volumetric studio and then associated with a set of mathematical algorithms that allow the virtual human to exactly reproduce the real one. The creation and animation of such a virtual human constitutes an advance in the state of the art in two respects: first, the incorporation into a single human model of different elements at a high level of detail, e.g. hands down to the individual fingers or facial expressions including cheeks and forehead; second, the challenge of animating such a model in real time as precisely as possible and without markers or obstructive elements. The generated stream of the virtual human is combined with the original broadcast stream and can be transmitted to viewers as a separate stream, to be watched on demand on HbbTV 1.5 and 2.x compliant devices.
Another goal reached by the project is the creation of a repository of annotated data based on broadcast-quality video. The repository is intended to be open to other research groups; within CONTENT4ALL it will be used to train Artificial Intelligence models for experimenting with automatic translation into sign language, with the output rendered by the improved 3D virtual human. The design, implementation and testing of these algorithms constitute an advance in the state of the art, as they are among the first published in the scientific literature.
Finally, in the medium to long term, automated sign interpretation technologies are explored and tested. If this goal can be fulfilled, the sign language community in Europe will be able to overcome its current state of media poverty, closing that gap and opening avenues toward a better society with equal opportunities.