TAPAS: Training network on Automatic Processing of PAthological Speech (#766287) EU H2020 Marie Skłodowska-Curie Innovative Training Networks European Training Networks (MSCA-ITN-ETN:ENG) Runtime: 4 years Role: Principal Investigator, Co-Author Proposal Partners: IDIAP, Université Paul Sabatier Toulouse III, Universitair Ziekenhuis Antwerpen, FAU Erlangen-Nürnberg, Stichting Katholieke Universiteit, INESC ID, LMU Munich, Interuniversitair Micro-Electronicacentrum IMEC, Stichting Het Nederlands Kanker Instituut-Antoni van Leeuwenhoek Ziekenhuis, University of Passau, University of Sheffield
There is an increasing number of people across Europe with debilitating speech pathologies (e.g., due to stroke or Parkinson’s). These groups face communication problems that can lead to social exclusion. They are now being further marginalised by a new wave of speech technology that is increasingly woven into everyday life but is not robust to atypical speech. TAPAS proposes a programme of pathological speech research that aims to transform the well-being of these people. The TAPAS work programme targets three key research problems: (a) Detection: We will develop speech processing techniques for early detection of conditions that impact speech production. The outcomes will be cheap and non-invasive diagnostic tools that provide early warning of the onset of progressive conditions such as Alzheimer’s and Parkinson’s. (b) Therapy: We will use newly emerging speech processing techniques to produce automated speech therapy tools. These tools will make therapy more accessible and more individually targeted. Better therapy can increase the chances of recovering intelligible speech after traumatic events such as a stroke or oral surgery. (c) Assisted Living: We will redesign current speech technology so that it works well for people with speech impairments and also helps in making informed clinical choices. People with speech impairments often have other co-occurring conditions that make them reliant on carers. Speech-driven tools for assisted living are a way to allow such people to live more independently. TAPAS adopts an interdisciplinary and multi-sectoral approach. The consortium includes clinical practitioners, academic researchers and industrial partners, with expertise spanning speech engineering, linguistics and clinical science. All members have expertise in some element of pathological speech. This rich network will train a new generation of 15 researchers, equipping them with the skills and resources necessary for lasting success.
ACLEW: Analyzing Child Language Experiences Around the World (HJ-253479) – 14 winning projects in total T-AP (Trans-Atlantic Platform for the Social Sciences and Humanities along with Argentina (MINCyT), Canada (SSHRC, NSERC), Finland (AKA), France (ANR), United Kingdom (ESRC/AHRC), United States (NEH)) Digging into Data Challenge 4th round Runtime: 01.06.2017 – 31.05.2020 Role: Principal Investigator, Co-Author Proposal Partners: Duke University, École Normale Supérieure, Aalto University, CONICET, Imperial College London, University of Manitoba, Carnegie Mellon University, University of Toronto
An international collaboration among linguists and speech experts to study child language development across nations and cultures to gain a better understanding of how an infant’s environment affects subsequent language ability.
OPTAPEB: Optimierung der Psychotherapie durch Agentengeleitete Patientenzentrierte Emotionsbewältigung (#V5IKM010) BMBF IKT2020-Grant (Forschungsprogramm zur Mensch-Technik-Interaktion: Technik zum Menschen bringen – Interaktive körpernahe Medizintechnik) Runtime: 01.08.2017 – 31.07.2020 Role: Beneficiary Partners: Universität Regensburg, Fraunhofer IIS, VTplus GmbH, Ambiotex GmbH, NTT GmbH, eHealthLabs, audEERING GmbH
OPTAPEB aims to develop an immersive and interactive virtual reality system that assists users in overcoming phobias. The system will allow users to experience phobia-inducing situations and will log both the emotional experience and the user’s behaviour. Various levels of emotional reaction will be monitored continuously and in real time by the system, which uses sensors based on innovative e-wear technology, speech signals, and other pervasive technologies (e.g. accelerometers). A further goal of the project is the development of a game-like algorithm that controls the user’s experience of anxiety during exposure therapy and automatically adapts the course of the therapy to the user’s needs and the current situation.
Deep Learning Speech Enhancement Industry Cooperation with HUAWEI TECHNOLOGIES Runtime: 12.11.2016 – 11.11.2018 Role: Principal Investigator, Author Proposal Partners: University of Passau, HUAWEI TECHNOLOGIES
The research target of this project is to develop state-of-the-art methods for speech enhancement based on deep learning. The aim is to overcome limitations in challenging scenarios posed by non-stationary noise and distant speech, with a potentially moving device and potentially limited power and memory on the device. The project will also study how deep learning speech enhancement can be applied successfully to multi-channel input signals. A further important aspect is robustness and adaptation to unseen conditions, such as different noise types.
ZAM: Zero-resource keyword recognition for Audio Mass-Data (Zero-Resource Schlagworterkennung bei Audio-Massendaten) Runtime: 01.12.2016 – 31.08.2017 Role: Coauthor Proposal, Beneficiary, Principal Investigator Partners: University of Passau and others
To process mass audio data captured by a range of diverse sensors, the project investigates technical solutions in the field of keyword recognition. It aims to show which approaches simplify, accelerate, and optimise audio analysis as well as manual work processes. The major aim is to significantly reduce human workload through maximal automation, with two focal points: 1) limited to no resources (“zero resource”) for training, and 2) determining how low the audio quality can be while still allowing reasonable, highly automated processing.
RADAR CNS: Remote Assessment of Disease and Relapse in Central Nervous System Disorders (#115902) – 15.8% acceptance rate in the call EU H2020 / EFPIA Innovative Medicines Initiative (IMI) 2 Call 3
Runtime: 01.04.2016 – 31.03.2021 Role: Coauthor Proposal, Beneficiary, Principal Investigator, Workpackage Leader Partners: King’s College London, Provincia Lombardo-Veneta – Ordine Ospedaliero di San Giovanni di Dio – Fatebenefratelli, Lygature, Università Vita-Salute San Raffaele, Fundacio Hospital Universitari Vall D’Hebron, University of Nottingham, Centro de Investigacion Biomedica en Red, Software AG, Region Hovedstaden, Stichting VU-Vumc, University Hospital Freiburg, Stichting IMEC Nederland, Katholieke Universiteit Leuven, Northwestern University, Stockholm Universitet, University of Passau, Università degli Studi di Bergamo, Charité – Universitätsmedizin Berlin, Intel Corporation (UK) Ltd, GABO:mi, Janssen Pharmaceutica NV, H. Lundbeck A/S, UCB Biopharma SPRL, MSD IT Global Innovation Center
The general aim is to develop and test a transformative platform for remote monitoring (RMT) of disease state in three CNS diseases: epilepsy, multiple sclerosis and depression. Other aims are: (i) to build an infrastructure to identify clinically useful RMT-measured biosignatures that assist in the early identification of relapse or deterioration; (ii) to develop a platform to identify these biosignatures; (iii) to anticipate potential barriers to translation by initiating a dialogue with key stakeholders (patients, clinicians, regulators and healthcare providers).
EngageME: Automated Measurement of Engagement Level of Children with Autism Spectrum Conditions during Human-robot Interaction (#701236) – 14.4% acceptance rate in the call EU Horizon 2020 Marie Skłodowska-Curie action Individual Fellowship (MSCA-IF 2015)
Runtime: 01.09.2016 – 31.08.2019 Role: Coauthor Proposal, Coordinator, Beneficiary, Supervisor Partners: University of Passau, Massachusetts Institute of Technology
Engaging children with ASC (Autism Spectrum Conditions) in communication-centred activities during educational therapy is one of the cardinal challenges posed by ASC and contributes to its poor outcome. To this end, therapists recently started using humanoid robots (e.g., NAO) as assistive tools. However, this technology lacks the ability to autonomously engage with children, which is key to improving the therapy and, thus, learning opportunities. Existing approaches typically use machine learning algorithms to estimate the engagement of children with ASC from their head pose or eye gaze inferred from face videos. These approaches are rather limited in modelling atypical behavioural displays of engagement of children with ASC, which can vary considerably across children. The first objective of EngageME is to bring novel machine learning models that can, for the first time, effectively leverage multi-modal behavioural cues, including facial expressions, head pose, vocal and physiological cues, to realise fully automated, context-sensitive estimation of engagement levels of children with ASC. These models build upon dynamic graph models for multi-modal ordinal data, based on state-of-the-art machine learning approaches to sequence classification and domain adaptation, which can adapt to each child while still being able to generalise across children and cultures. To realise this, the second objective of EngageME is to provide the candidate with cutting-edge training aimed at expanding his current expertise in visual processing with expertise in wearable/physiological and audio technologies, from leading experts in these fields. EngageME is expected to bring novel technology and models for endowing assistive robots with the ability to accurately ‘sense’ engagement levels of children with ASC during robot-assisted therapy, while providing the candidate with the set of skills needed to become a frontrunner in the emerging field of affect-sensitive assistive technology.
DE-ENIGMA: Multi-Modal Human-Robot Interaction for Teaching and Expanding Social Imagination in Autistic Children (#688835) – 6.9% acceptance rate in the call
EU Horizon 2020 Research & Innovation Action (RIA) Runtime: 01.02.2016 – 31.07.2020 Role: Coauthor Proposal, Beneficiary, Principal Investigator, WP Leader Partners: University of Twente, Savez udruzenja Srbije za pomoc osobama sa autizmom, Autism-Europe, IDMIND, University College London, University of Passau, Institute of Mathematics Simion Stoilow of the Romanian Academy, Imperial College London
Autism Spectrum Conditions (ASC, frequently referred to as ASD, Autism Spectrum Disorders) are neurodevelopmental conditions characterised by social communication difficulties and restricted and repetitive behaviour patterns. There are over 5 million people with autism in Europe – around 1 in every 100 people – affecting the lives of over 20 million people each day. Alongside their difficulties, individuals with ASC tend to have intact and sometimes superior abilities to comprehend and manipulate closed, rule-based, predictable systems, such as robot-based technology. Over the last couple of years, this has led to several attempts to teach emotion recognition and expression to individuals with ASC using humanoid robots. This has been shown to be very effective as an integral part of psychoeducational therapy for children with ASC. The main reason for this is that humanoid robots are perceived by children with autism as being more predictable, less complicated, less threatening, and more comfortable to communicate with than humans, with all their complex and frightening subtleties and nuances. The proposed project aims to create such a robot-based technology, aimed at children with ASC, and to evaluate its effectiveness. This technology will enable robust, context-sensitive (such as user- and culture-specific), multimodal (including facial, bodily, vocal and verbal cues) and naturalistic human-robot interaction (HRI) aimed at enhancing the social imagination skills of children with autism. The proposed work will include the design of effective and user-adaptable robot behaviours for the target user group, leading to more personalised and effective therapies than previously realised. Carers will be offered their own supportive environment, including professional information, reports on the child’s progress and use of the system, and forums for parents and therapists.
U-STAR: Universal Speech Translation Advanced Research
Runtime: since 01.01.2016 Role: Consortium Partner Partners: University of Passau and 36 further partners – cf. homepage
The Universal Speech Translation Advanced Research Consortium (U-STAR) is an international research collaboration formed to develop a network-based speech-to-speech translation (S2ST) system, with the aim of breaking language barriers around the world and enabling spoken communication between different languages.
EmotAsS: Emotionsensitive Assistance System (#16SV7213)
BMBF IKT2020-Grant (Sozial- und emotionssensitive Systeme für eine optimierte Mensch-Technik-Interaktion) Runtime: 01.06.2015 – 31.05.2018 Role: Coauthor Proposal, Beneficiary, Principal Investigator Partners: University of Bremen, University of Passau, vacances Mobiler Sozial- und Pflegedienst GmbH, Martinshof (Werkstatt Bremen), Meier und Schütte GmbH und Co. KG.
The aim of the project is to develop and investigate emotion recognition and its use in interaction processes in sheltered workshops for people with disabilities. To this end, a system will be developed that reliably recognizes the emotions of people with disabilities during their everyday work routine and responds to them appropriately. The findings are to be transferred to further fields of application and tested in particular in communication with dementia patients.
(Original German description: Emotionen und deren Erkennung in der gesprochenen Sprache sind für die erfolgreiche Mensch-Technik-Interaktion wichtig, insbesondere bei Menschen mit Erkrankungen oder Behinderungen. Ziel des Projekts ist es, Emotionserkennung und deren Nutzung für Interaktionsprozesse in Werkstätten für behinderte Menschen zu entwickeln und zu untersuchen. Es soll daher ein System entwickelt werden, das sicher Emotionen bei Menschen mit Behinderungen in der Sprache erkennt und angemessen und unterstützend auf diese reagiert. Die Erkenntnisse sollen auf ein weiteres Anwendungsgebiet übertragen und in der Kommunikation mit Demenzerkrankten erprobt werden.)
Promoting Early Diagnosis of Rett Syndrome through Speech-Language Pathology
(Akustische Parameter als diagnostische Marker zur Früherkennung von Rett-Syndrom) (#16430) Österreichische Nationalbank (OeNB) Jubiläumsfonds Runtime: 01.11.2015 – 31.10.2019 Role: Main Cooperation Partner Partners: Medical University of Graz, Karolinska Institutet, Boston Children’s Hospital and Harvard Medical School, University of Passau, Imperial College London, Victoria University of Wellington
iHEARu: Intelligent systems’ Holistic Evolving Analysis of Real-life Universal speaker characteristics
FP7 ERC Starting Grant (StG) – 8.6% acceptance rate in the call (7% in Computer Science) Runtime: 01.01.2014 – 31.12.2018 Role: Author Proposal, Principal Investigator and Grant Holder Partners: University of Passau, TUM
Recently, automatic speech and speaker recognition has matured to the degree that it has entered the daily lives of thousands of Europe’s citizens, e.g., on their smartphones or in call services. Over the coming years, speech processing technology will move to a new level of social awareness to make interaction more intuitive, speech retrieval more efficient, and lend additional competence to computer-mediated communication and speech-analysis services in the commercial, health, security, and further sectors. To reach this goal, rich speaker traits and states such as age, height, personality and physical and mental state, as carried by the tone of the voice and the spoken words, must be reliably identified by machines. In the iHEARu project, ground-breaking methodology including novel techniques for multi-task and semi-supervised learning will deliver, for the first time, intelligent holistic and evolving analysis in real-life conditions of universal speaker characteristics which have been considered only in isolation so far. Today’s sparseness of annotated realistic speech data will be overcome by large-scale speech and meta-data mining from public sources such as social media, crowd-sourcing for labelling and quality control, and shared semi-automatic annotation. All stages, from pre-processing and feature extraction to statistical modelling, will evolve in “life-long learning” according to new data, by utilising feedback, deep, and evolutionary learning methods. Human-in-the-loop system validation and novel perception studies will analyse the self-organising systems and the relation of automatic signal processing to human interpretation in a previously unseen variety of speaker classification tasks. The project’s work plan gives the unique opportunity to transfer current world-leading expertise in this field into a new de facto standard of speaker characterisation methods and open-source tools ready for tomorrow’s challenge of socially aware speech analysis.
SEWA: Automatic Sentiment Estimation in the Wild (#645094)
EU Horizon 2020 Innovation Action (IA) – 9.3% acceptance rate in the call Runtime: 01.02.2015 – 31.07.2018 Role: Principal Investigator, Coauthor Proposal, Project Steering Board Member, Workpackage Leader Partners: Imperial College London, University of Passau, PlayGen Ltd, RealEyes
The main aim of SEWA is to deploy and capitalise on existing state-of-the-art methodologies, models and algorithms for machine analysis of facial, vocal and verbal behaviour, and then adjust and combine them to realise naturalistic human-centric human-computer interaction (HCI) and computer-mediated face-to-face interaction (FF-HCI). This will involve development of computer vision, speech processing and machine learning tools for automated understanding of human interactive behaviour in naturalistic contexts. The envisioned technology will be based on findings in cognitive sciences and it will represent a set of audio and visual spatiotemporal methods for automatic analysis of human spontaneous (as opposed to posed and exaggerated) patterns of behavioural cues including continuous and discrete analysis of sentiment, liking and empathy.
ARIA-VALUSPA: Artificial Retrieval of Information Assistants – Virtual Agents with Linguistic Understanding, Social skills, and Personalised Aspects (#645378)
EU Horizon 2020 Research & Innovation Action (RIA) – 9.3% acceptance rate in the call Runtime: 01.01.2015 – 31.12.2017 Role: Principal Investigator, Coauthor Proposal, Project Steering Board Member, Workpackage Leader Partners: University of Nottingham, Imperial College London, CNRS, University of Augsburg, University of Twente, Cereproc Ltd, La Cantoche Production
The ARIA-VALUSPA project will create a ground-breaking new framework that will allow easy creation of Artificial Retrieval of Information Assistants (ARIAs) that are capable of holding multi-modal social interactions in challenging and unexpected situations. The system can generate search queries and return the information requested by interacting with humans through virtual characters. These virtual humans will be able to sustain an interaction with a user for some time, and react appropriately to the user’s verbal and non-verbal behaviour when presenting the requested information and refining search results. Using audio and video signals as input, both verbal and non-verbal components of human communication are captured. Together with a rich and realistic emotive personality model, a sophisticated dialogue management system decides how to respond to a user’s input, be it a spoken sentence, a head nod, or a smile. The ARIA uses special speech synthesisers to create emotionally coloured speech and a fully expressive 3D face to create the chosen response. Back-channelling, indicating that the ARIA understood what the user meant, or returning a smile are but a few of the many ways in which it can employ emotionally coloured social signals to improve communication. As part of the project, the consortium will develop two specific implementations of ARIAs for two different industrial applications. A ‘speaking book’ application will create an ARIA with a rich personality capturing the essence of a novel, whom users can ask novel-related questions. An ‘artificial travel agent’ web-based ARIA will be developed to help users find their perfect holiday – something that is difficult to do with existing web interfaces such as those created by booking.com or tripadvisor.
Automatic General Audio Signal Classification China Scholarship Council Runtime: 01.09.2014 – 31.08.2018 Role: Supervisor Partners: TUM