Community Research and Development Information Service - CORDIS


ECOMODE Report Summary

Project ID: 644096
Funded under: H2020-EU.

Periodic Reporting for period 1 - ECOMODE (Event-Driven Compressive Vision for Multimodal Interaction with Mobile Devices)

Reporting period: 2015-01-01 to 2015-12-31

Summary of the context and overall objectives of the project

Summary of the action context

The visually impaired and the elderly, who often suffer from mild speech and/or motor disabilities, experience a significant and growing barrier in accessing ICT technology and services. Yet, in order to participate in a modern, interconnected society that relies on ICT for handling everyday issues, these user groups also need access to ICT, in particular to mobile platforms such as tablet computers or smartphones. Smartphones have become indispensable everyday devices, used by many almost around the clock.
However, most of these products are designed and marketed for the young, tech-savvy and multimedia-oriented. For the visually and motor impaired, handling these mobile devices can be overwhelming, confusing, and unnecessarily difficult. Concepts such as pinching, swiping and double-clicking are both alien and physically challenging. The project aims to define how to improve the human-machine interface experience on mobile platforms (tablets, smartphones). More natural ways of human-computer interaction are non-touchscreen-based ("in the air") hand and finger gestures and spoken commands.

Air gesture recognition for mobile devices is still in its infancy and lacks robustness and user-friendliness. The speed and (minimum) duration of gestures are limited, both for vision-based gesture recognition and for the most advanced technologies using structured infrared light. But the most restrictive aspect of all state-of-the-art gesture recognition is its range of acceptable environmental operating conditions. Neither video nor infrared technologies work reliably outdoors, yet this is the main theatre of operation for mobile devices. When the user groups of the visually and motor impaired are included, these restrictions become even more severe, as, for example, the quality of gestures becomes increasingly variable and the exact position of the hands with respect to the device is less controlled. For speech recognition and speech-to-text input, noisy outdoor scenarios are very challenging. Again, including users with mild speech or motor impairments and the elderly worsens the situation for the speech recognizer, as the temporal dynamics of their speech are different and more variable than the standard speech most recognition engines are trained on.
ECOMODE tackles those limitations by introducing a novel disruptive technology we call EDC (event‐driven compressive), a biology‐inspired approach to sensing and processing visual and auditory information, to realize a robust indoor and outdoor low‐power touch‐less human‐computer interface technology, particularly supporting an enlarged number of user group profiles including the visually and motor impaired.
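The event-driven sensing principle behind EDC can be illustrated with a toy model: instead of sampling full frames at a fixed rate, each pixel independently emits a timestamped event whenever its log-intensity changes by more than a threshold, so static scenes produce no data at all. The sketch below is an illustrative simplification; the `Event` type, the threshold value and the sampling step are our own assumptions, not project specifications.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """A single address-event: pixel location, timestamp, polarity."""
    x: int
    y: int
    t: int         # timestamp in microseconds
    polarity: int  # +1 = brightness increase, -1 = decrease

def pixel_events(log_intensity, x, y, threshold=0.15, t_us_per_sample=1000):
    """Emit events whenever the log-intensity at one pixel drifts by more
    than `threshold` from the last event's reference level, mimicking the
    per-pixel change detection of an event-driven sensor."""
    events = []
    ref = log_intensity[0]
    for i, v in enumerate(log_intensity[1:], start=1):
        # Multiple events may fire for one large change.
        while v - ref >= threshold:
            ref += threshold
            events.append(Event(x, y, i * t_us_per_sample, +1))
        while ref - v >= threshold:
            ref -= threshold
            events.append(Event(x, y, i * t_us_per_sample, -1))
    return events
```

Because only changes are transmitted, the output data rate scales with scene dynamics rather than with frame rate, which is the source of the bandwidth and power advantages claimed for EDC.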

The project aims at developing and exploiting the recently matured and quickly advancing biologically‐inspired technique of event‐driven compressive sensing (EDC) of audio‐visual information, for enabling more natural and efficient ways of human interaction with ICT technology, especially focusing on mobile computational devices such as tablet computers and smart phones. We will demonstrate that the proposed technology step is particularly suited to support the user groups of the visually and motor impaired in their interaction with modern ICT devices, and hence improve their potential to better participate in a modern, interconnected society. Operation of the proposed devices will be more independent of the environment, particularly offering unrestricted use of the mobile device under uncontrolled lighting and background noise conditions such as present e.g. in inner‐city outdoor scenarios. In the ECOMODE project, we propose to take advantage of the new EDC sensing and processing paradigm, of which partners in the consortium are leading pioneers, to realize a new generation of low‐power multi‐modal human‐computer interface for mobile portable devices by combining audition and gesture based control, especially focusing on disabled people who have limited capabilities to interact with touch‐controlled devices.

The project is based on two main technology pillars: (A) an air gesture control set, and (B) a vision-assisted speech recognition set. In (A) the consortium will research exploiting EDC vision sensor technology (in either single-camera or multiple-camera setups) for low- and high-level EDC visual processing for hand and finger gesture recognition and subsequent command execution. In (B) the goal is to combine visual cues from lip and chin motion and temporal dynamics, acquired using EDC vision sensors, with auditory sensor input to improve the robustness and background-noise immunity of spoken command recognition and speech-to-text input. In contrast to state-of-the-art technologies used in this context (infrared, ultrasound, video), both proposed human-computer communication channels are expected to work reliably under uncontrolled outdoor conditions. ECOMODE will consequently apply end-user-centered co-design. Through assessment loops in living labs, the consortium will learn how users use and want to use the system, and iteratively adapt the product to their needs. This user-centered co-design will be reflected at the software and hardware levels, where the algorithms will unobtrusively learn the user's context of use (space, time, noise, movement, ...) and behavior.
ECOMODE will greatly benefit from the extensive involvement of user associations in the project.

Summary of Objectives:
• Define target user groups, research their needs and preferences, and define demonstration use cases in line with the objectives of the project.
• Develop and adapt EDC technology hardware in terms of the usability, interfacing and integration capabilities required to design and assemble the final demonstrator systems to be validated in selected use cases.
• Further develop mathematical foundations, algorithms and models for EDC event‐based, data‐driven computation and processing for motion detection and analysis in the context of the project.
• Develop demonstrator/prototype EDC‐based efficient and robust multi‐mode, touch‐less human computer interfaces for integration into a portable, battery‐powered device. The interface will be based on gesture control and speech‐to‐text input.
• Iteratively evaluate and validate the developed technology with individuals of the target user groups in selected use cases, applying user‐centered design and validation strategies.
• Pave the way for a future easy industrialization of a commercial product by demonstrating availability of the required hardware and software components and their integration into a portable computational device such as a tablet computer.
• Research the respective markets and define the appropriate exploitation strategies. New business models and exploitation strategies will be assessed on the grounds of the value chain actors represented within the consortium.

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

The work performed during the first reporting period (month 1 - month 12) of the ECOMODE project has been carried out as planned in Annex 1, according to the planned schedule of work packages and tasks. The strong collaboration and intensive communication between technology providers, industrial partners and user-experience-focused partners have enabled the consortium to complete deliverables and milestones on time. No delays are foreseen for the next reporting periods. An overview of the progress can be highlighted as follows:
At an early stage of the project, the partners involved in EDC hardware development defined the initial specifications for the hardware developments (specifications for the new EDC sensor, hardware platform specifications and inter-EDC hardware interfaces) and made them available to the consortium members (deliverable completed). A new sensor design with ultra-small pixels is almost concluded (tape-out on the 15th of March). One major result is the ECOCAM-1 camera, available with a plug-on mechanism for portable Android devices and running elementary features on the Android platform. Preliminary EDC prototypes have been provided to a subset of partners (milestone achieved). The VHDL/Verilog code for the EDC preprocessing platform has been developed and prototyped on an optional backup FPGA-based PCB.

In the first months of the project, a small, simple and intuitive hand-gesture vocabulary was defined, together with a corresponding database recorded with the existing EDC vision sensors under various conditions, including outdoor environments. This first database, acquired with an existing event-based camera, has been completed. It will initially allow the implementation and evaluation of the first-stage algorithms, and will be progressively increased and refined according to the work done in WP6 on use cases and user surveys. Eventually, a final, larger database using the adapted EDC vision sensor will be collected during year 3 and used for quantitative performance evaluations of all algorithmic modules developed in this WP. Eight coarse gestures for recognition of the global motion of the hand have been defined, a motion-based feature for event-based vision has been provided, and a set of fine gestures has been defined. A local version of the feature (to detect, recognize and track interest points) and a method to focus attention on the moving hand in the visual scene (a "positive" probability map) have been implemented. A more user-friendly simulator has been provided to ease the adaptation of the ConvNets' architecture to the context-recognition problem.
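The idea of focusing attention on the moving hand via a probability map can be sketched in simplified form: events are accumulated into an exponentially decayed activity map and normalized so that the most recently active region dominates. The function names, decay constant and map resolution below are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def activity_map(events, shape=(64, 64), tau_us=50_000, t_now=None):
    """Accumulate (x, y, t) events into an exponentially decayed count map:
    recent events contribute most, so the moving hand lights up."""
    if t_now is None:
        t_now = max(t for _, _, t in events)
    acc = np.zeros(shape)
    for x, y, t in events:
        acc[y, x] += np.exp(-(t_now - t) / tau_us)
    return acc

def positive_probability_map(events, shape=(64, 64)):
    """Normalize the activity map to sum to 1, giving a probability map that
    later processing can use to focus on the most active region."""
    acc = activity_map(events, shape)
    total = acc.sum()
    return acc / total if total > 0 else acc
```

A gesture recognizer could then restrict its feature extraction to the high-probability region instead of processing the whole visual scene.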
Speech production-related visual events from EDC sensors have been combined with automatically extracted acoustic landmarks for the segmentation of speech into variable-length phonetically meaningful segments.

An analysis of the effects of speech impairments on automatic speech segmentation, based on a comparison between normal and impaired speech, has been carried out. Strategies to reduce the expected gap in segmentation accuracy between normal and impaired speech have been developed. Speech segmentation has been evaluated in both speaker-dependent and speaker-independent settings. The cross-modal segmentation will be tested by comparing the automatically identified segment boundaries to the actual phone boundaries of test utterances.
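A common way to carry out this kind of boundary comparison is to match each automatically detected boundary to a reference phone boundary within a small tolerance window and report precision and recall. The sketch below illustrates the idea; the 20 ms tolerance and the function name are assumptions, not figures from the project report.

```python
def boundary_scores(hypothesis, reference, tolerance=0.02):
    """Match detected segment boundaries (in seconds) against reference
    phone boundaries within a +/- `tolerance` window (20 ms here, a common
    choice in segmentation evaluation). Each reference boundary may be
    matched at most once. Returns (precision, recall)."""
    matched = set()
    hits = 0
    for h in hypothesis:
        for i, r in enumerate(reference):
            if i not in matched and abs(h - r) <= tolerance:
                matched.add(i)
                hits += 1
                break
    precision = hits / len(hypothesis) if hypothesis else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

For example, `boundary_scores([0.10, 0.31, 0.55], [0.11, 0.30, 0.70])` matches two of the three detected boundaries, so precision and recall are both 2/3.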
Preliminary guidelines to drive the design and evaluation of the ECOMODE technology were provided at an early stage of the project. Existing multimodal technologies have been analysed and assessed. An ECOMODE usability checklist and evaluation grid have been provided. The first application scenario and the corresponding dictionary of gestures and vocal commands have been implemented.

A first market analysis providing a comprehensive perspective on the targeted end users has been delivered (a new iteration is planned for 2016). The market analysis quantifies the potential market for the potential product outcomes of the ECOMODE project. It also illustrates the main market trends regarding hardware, software and system developers, as well as tablet/smartphone manufacturers, service providers and external entities (associations, organisations, healthcare institutions, etc.). The study also gives an overview of the competitive environment and identifies possible business models for the commercial realization of the results of ECOMODE. Work has also started to publicize the ongoing work in ECOMODE and to find channels for disseminating its activities.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

ECOMODE has the goal of improving the usability and acceptability of mobile ICT devices such as tablets and smart phones for the visually impaired and for people with mild speech and motor impairments, a high proportion of whom are elderly.
According to the World Health Organization, over 285 million people in the world are visually impaired, of whom 39 million are blind and 246 million have moderate to severe visual impairment. It is predicted that without extra interventions, these numbers will rise to 75 million blind and 200 million visually impaired by the year 2020. For Europe alone, an estimated 30 million individuals are blind or visually impaired. This higher figure takes into account the prevalence of sight‐loss amongst an increasing population of elderly people in Europe which is extremely difficult to accurately quantify, and also the fact that there exists a number of people who suffer from varying degrees of sight loss but who either ignore this or decide for personal reasons not to declare their condition.
The European Disability Strategy 2010‐2020 states that “Persons with disabilities have the right to participate fully and equally in society and economy. Denial of the right to participate fully and equally is a breach of human rights”. By its very nature, visual impairment has a significant impact on individuals’ quality of life, including their ability to work and to carry out daily life tasks.

Around 80% of blind people never leave their home. If ECOMODE is successful and provides reliable and robust outdoor interaction with a smart device:
• Navigational aids and related services become available, improving the confidence of visually impaired individuals to leave home and navigate the outdoor environment.
• Internet-based services, such as information on public transport, become accessible and support the mobility of users.
• Simply making a phone call becomes much easier, again boosting visually impaired individuals' confidence to leave their home and participate in society.

The current proportion of people aged 50 and over in the European Union is 35.2% (Gallagher and Petrie, 2013), and that figure is expected to increase. As reported in (Caprani et al., 2012), one person in four is projected to be over the age of 65 by the year 2030. Recent research in the United Kingdom (Office for National Statistics, 2012) shows that 7.82 million people have never used the Internet. Among those, 3.91 million are people with some disability. In particular, 71% of those aged over 75 years, 40% of those aged 65–74 years, and 18.7% of those aged 55–64 years have never used the Internet (Gallagher and Petrie, 2013). Moreover, in the United Kingdom it is estimated that 1 in 2 people aged over 65 have some degree of disability (Papworth Trust, 2012). In 2012, the overall percentage of people with disability aged 65 to 74 in the US was 25%, whereas it was 50% for people aged 75 years and older.
Among the six types of disabilities affecting those people, the highest prevalence rate was for motor performance deficit: 16% for people aged 65 to 75 and 33% for people aged 75 and older (Disability Status Report - US, 2014), including coordination difficulty, increased variability of movement, slowing of movement, and difficulties with balance and gait in comparison to young adults (Seidler et al., 2002; Contreras-Vidal et al., 1998; Diggles-Buckles, 1993; Tang and Woollacott, 1996). Elderly people are not a homogeneous group; they exhibit different levels of age-related impairments (such as minor sensory, cognitive and motor disabilities) and may be simultaneously affected by several minor disabilities (e.g. reduced sight, hearing, speech, mobility), with overall effects exceeding the sum of each single disability (Gallagher and Petrie, 2013).
The use of ICT devices has the potential to improve the quality of life of the (disabled) elderly. Compared to traditional desktop devices, touch-based interfaces such as smartphones and tablets have demonstrated a number of potential benefits for supporting older adults' activities (Saffer, 2008). Such devices have also been shown to have a positive impact on older adults' acceptance of technology. Especially with novice elderly users, a touch-panel interface may turn computers into something more attractive and friendly, because it seems able to encourage play and exploration while requiring less learning time (Peek et al., 2014).
However, smartphones and other mobile devices remain poorly adopted by older adults. Despite the growing adoption of smartphones by the U.S. population, older adults (> 65) continue to exhibit relatively low adoption levels (18% of Americans aged 65 and older now own a smartphone) (Smith, 2014). Several factors come into play to explain the non-usage of such technologies. Attitudinal, cognitive and age-related changes hinder older adults in adopting technologies (Charness, 2009). The reduced visual acuity and constrained visual field largely affecting older adults make it difficult to recognize fine details of icons or pointers used in graphical user interfaces (Fraser, 2000). Difficulties with input devices may relate to muscle strength and power (Metter et al., 1997), as well as to reduced range of motion, slower movement and greater difficulty in performing fine motor tasks (Taveira, 2009). Tapping a touch pad, for instance, is challenging because of the complexity of the motor skills involved (Wood, 2005; Caprani, 2012).

Alternative inputs, such as the speech and gestures proposed by ECOMODE, may help older adults and facilitate the accessibility and usability of technologies, as they constitute a more natural way of communicating (Turk, 2014). Speech-enabled interfaces are a promising way to improve older adults' accessibility to ICT:
Users with perceptual and motor impairments can benefit from voice input (Basson, 2007). Furthermore, studies on speech recognition targeting older adults demonstrated good acceptance of speech input, as well as a perception of ease of use due to the little training required to control a system by speech (Basson, 2007).
Multimodal interfaces may support older adults in accessing digital content thanks to the possibility of choosing the combination of input/output modalities that best fits their capabilities (Naumann, 2010). This possibility becomes crucial when the target population experiences decreasing capabilities such as coordination, physical strength, fine motor skills and vision. The combination of more than one input mode can increase recognition rates and hence reduce errors and improve the flexibility of systems. Moreover, multimodal interfaces are expected to be easier to learn and use. They may be used by a broader spectrum of everyday people and accommodate adverse usage conditions (Oviatt, 2002).
Smart mobile devices reach their full potential outdoors, where people often have to move around in complex environments. But outdoor scenarios are still challenging. For example, applications based on touch may be penalized by adverse light conditions, while applications based on speech control may perform poorly due to noise in outdoor environments (Rousan & Assaleh, 2011). Although the advantages of devices based on multimodal interaction are widely recognized (Turk, 2014), further work is needed to improve interaction accuracy and make technologies more accessible to people with disabilities or minor age-related impairments. In particular, even if touch screens are better than other input devices because they promote directness and good hand-eye coordination (Taveira, 2009), further development is needed to make them suitable for extensive data entry and to accommodate the visual and motor control capabilities of older adults. Similarly, devices with speech-based input should be improved to reduce both the technical difficulties in processing the speech of older adults and the cognitive load when older adults interact with a system through speech (Basson, 2007). These challenges are fully taken into account by the objectives, methodology and approach of ECOMODE, which directly target the expected impact of the work programme.
Designing and developing multimodal technology able to meet the needs of elderly and disabled people will have a tangible impact on society. Thanks to the user-centric design and the enabling EDC technology, the final ECOMODE prototype will improve the acceptability and usability of mobile ICT tools for the elderly and disabled.

Smart phones and tablet applications that can be easily used and controlled have the potential to improve independent living of disabled and elderly, for instance by
• improving social networking and inclusion, decreasing isolation and the often consequent depression
• managing domotic environments and using the Internet for shopping and information
• managing reminders for medicaments, dietary rules, appointments, etc.
• making it possible for these groups of people to use ICT tools for interacting with the public administration
• improving outdoor navigation (see the points above for visually impaired people)
The number of potential applications is exploding with the development of targeted apps and add-ons. The effect is a disruptive impact on the quality of life and independence of the elderly, disabled and visually impaired, which also extends to their caregivers and to the societal costs of home assistance, hospitalization, etc.
The expected impact of ECOMODE-related technology includes the opportunity to turn a negative spiral of isolation, dependence and low motivation into a positive one of inclusion, independence and active participation in society.

Impacts on European competitiveness
The growth in the mobile market is currently generating fierce competition worldwide, from which European actors are mostly excluded. The EU must take part in the leadership of this next generation of industrialization; only by doing so can it hope to shape the impact that increasingly capable mobile platforms will have on industry and the economy.
The population initially targeted in ECOMODE is a particular group of handicapped individuals; however, the technology is equally applicable to the healthy population, thus potentially serving a wide consumer market.
EDC drastically changes the way visual and auditory information is sensed and processed, and the benefits are significant. EDC outperforms conventional technologies in several respects, including bandwidth and computational requirements, power consumption, dynamic range, speed and robustness of operation, but requires a paradigm shift in the way sensing is performed. The targeted application has the potential to establish a first consumer-market product using this technology. Companies such as Chronocam, which provide and develop the EDC technology, will benefit from the project results and the IP generated in the project, and may establish EDC technology as a leader in this domain. During the project, Chronocam and Innovati will prepare an exploitation plan.
They will carefully evaluate the market, find and involve stakeholders, and conduct market and patent research in the domains of speech and gesture recognition, tablets and related fields, in order to devise a strategic roadmap and a possible business plan for the path from the ECOMODE prototype to the market.

Environmental and social impacts
ECOMODE targets a key problem affecting the ageing population typical of all developed, industrial societies. ECOMODE will improve interaction with ICT devices by developing and improving the robustness of natural communication strategies such as (non-touch-based) gestures and voice. Its device-independent approach aims to support independence in the home and mobility outside the home, whatever the device.

This strategy will not only impact on the quality of life of disabled and elderly, their caregivers and on the costs related to public assistance, loss of productivity and related issues, but also on many other users and domains. Being a general‐purpose interface, it can be applied to tablets and smart phones (as targeted by the ECOMODE project), but it has the potential to be transferred to the domain of smart home environments, automotive, entertainment, smart cities, etc.


Related information

Record Number: 191361 / Last updated on: 2016-11-16