2013

Title: Enhancing the features extraction process for automatic speech recognition with fractal dimensions

Authors: Aitzol Ezeiza, Karmele López de Ipiña, Carmen Hernandez, Nora Barroso

Keywords: Non-linear speech recognition, automatic speech recognition, Mel frequency cepstral coefficients, fractal dimensions

Field: Computational Intelligence, Language Technology

Information: Journal || Cognitive Computation || Vol.: 5(4) pp: 545-550 ISSN: 1866-9964

Impact Factor: JCR (2012): 0,867

Abstract: Mel frequency cepstral coefficients (MFCCs) are a standard tool for automatic speech recognition (ASR), but they fail to capture part of the dynamics of speech. The nonlinear nature of speech suggests that extra information provided by some nonlinear features could be especially useful when training data are scarce or when the ASR task is very complex. In this paper, the Fractal Dimension of the observed time series is combined with the traditional MFCCs in the feature vector in order to enhance the performance of two different ASR systems. The first is a simple system of digit recognition in Chinese, with very few training examples, and the second is a large vocabulary ASR system for Broadcast News in Spanish.

Link: SpringerLink


Title: On the selection of non-invasive methods based on speech analysis oriented to automatic Alzheimer disease diagnosis

Authors: Karmele López de Ipiña, Jesus-Bernardino Alonso, Carlos Manuel Travieso, Jordi Solé-Casals, Harkaitz Egiraun, Marcos Faundez-Zanuy, Aitzol Ezeiza, Nora Barroso, Miriam Ecay-Torres, Pablo Martinez-Lage, Unai Martinez de Lizardui

Keywords: Alzheimer’s disease diagnosis, spontaneous speech, emotion recognition, machine learning, non-invasive diagnostic techniques, dementia

Field: Biometry, Emotions Processing, Language Technology

Information: Journal || Sensors || Vol.: 13(5) pp: 6730-6745 ISSN: 1424-8220

Impact Factor: JCR (2012): 1,953

Abstract: The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer disease (AD) detection and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine in a pilot study the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to contribute to the improvement of diagnosis of AD and its degree of severity. In this sense, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human issues have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non-invasive, low cost and without any side effects. The experimental results obtained were very satisfactory and promising for early diagnosis and classification of AD patients.

Link: SENSORS


Title: On automatic diagnosis of Alzheimer’s disease based on spontaneous speech analysis and emotional temperature

Authors: Karmele López de Ipiña, Jesus-Bernardino Alonso, Jordi Solé-Casals, Nora Barroso, P. Henriquez, Marcos Faundez-Zanuy, Carlos Manuel Travieso, Miriam Ecay-Torres, Pablo Martinez-Lage, Harkaitz Egiraun

Keywords: Alzheimer’s disease diagnosis, Spontaneous speech, Emotion recognition

Field: Biomedicine, Emotions Processing, Language Technology, Computer Cognitive

Information: Journal || Cognitive Computation || ISSN: 1866-9956

Impact Factor: JCR (2012): 0,867

Abstract: Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia; it has a high socioeconomic impact in Western countries. Therefore, it is one of the most active research areas today. Alzheimer’s disease is sometimes diagnosed by excluding other dementias, and definitive confirmation is only obtained through a postmortem study of the brain tissue of the patient. The work presented here is part of a larger study that aims to identify novel technologies and biomarkers for early AD detection, and it focuses on evaluating the suitability of a new approach for early diagnosis of AD by noninvasive methods. The purpose is to examine, in a pilot study, the potential of applying machine learning algorithms to speech features obtained from suspected Alzheimer’s disease sufferers in order to help diagnose this disease and determine its degree of severity. Two human capabilities relevant in communication have been analyzed for feature selection: spontaneous speech and emotional response. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of AD patients.

Link: SpringerLink


Title: Language identification for Internet security in the Basque context: A cross-lingual approach

Authors: Karmele López de Ipiña, Nora Barroso, Aitzol Ezeiza, Carmen Hernandez

Keywords: Automatic speech recognition, Hidden Markov models, Natural language processing, Security, Terminology

Field: Language Technology, Computational Intelligence

Information: Journal || IEEE Aerospace and Electronic Systems Magazine || Vol.: 28(8) pp: 24-31 ISSN: 0885-8985

Impact Factor: JCR (2012): 0,343

Abstract: The present work describes the development of an LID system suited for handling security tasks on the Internet. The development context was the Infozazpi Internet digital radio, and the task presented substantial complexity due to the trilingual environment and the scarcity of language resources for Basque. In order to overcome these difficulties, we propose a hybrid system based on the selection of subword units by SVMs, MLP classifiers, and discriminant analysis improved with robust regularized covariance matrix estimation methods and stochastic methods for ASR tasks (SC-HMM and n-grams). Our new subword unit proposals and the use of triphones and cross-lingual approaches considerably improve the system performance, achieving an optimal and stable LID recognition rate despite the complexity of the problem.

Link: IEEE Xplore Digital Library


2012

Title: Experiments for the selection of sub-word units in the Basque context for semantic tasks

Authors: Nora Barroso, Karmele López de Ipiña, Carmen Hernandez, Manuel Graña

Keywords: Under-resourced languages, Sub-word units, Multilingual automatic speech recognition, Discriminant analysis, Matrix covariance estimation methods

Field: Language Technology, Computational Intelligence

Information: Journal || International Journal of Speech Technology || Vol.: 15(1) pp: 49-56 ISSN: 1381-2416

Impact Factor: SNIP (2012): 1,291

Abstract: The long-term goal of our project is the development of robust ASR systems in the Basque context, where French, Spanish and Basque (a minority language) coexist. The development of ASR systems involves dealing with issues such as Acoustic Phonetic Decoding (APD), Language Modelling (LM) or the development of appropriate Language Resources (LR). Thus, these applications are generally very language-dependent and require very large resources. This work focuses on the selection of appropriate sub-word units in under-resourced and noisy conditions. Currently, the work is oriented to Basque Broadcast News (BN), given the interest of digital mass media such as the trilingual Infozazpi radio (situated in the French Basque Country). Thus, in order to reduce the negative impact that the lack of resources has on this task, we apply several data optimization methodologies based on Matrix Covariance Estimation and Ontology-based approaches.

Link: SpringerLink


2011

Title: Semantic speech recognition in the Basque context Part I: cross-lingual approaches

Authors: Nora Barroso, Karmele López de Ipiña, Odei Barroso, Aitzol Ezeiza, Carmen Hernandez, Manuel Graña

Keywords: Cross-lingual approach, Under-resourced languages, Graphemes, Data optimization, Semantic tasks

Field: Language Technology, Computational Intelligence

Information: Journal || International Journal of Speech Technology || Vol.: 15(1) pp: 33-40 ISSN: 1381-2416

Impact Factor: SNIP (2012): 1,291

Abstract: This work, divided into Parts I and II, describes the development of GorUP, a Semantic Speech Recognition system in the Basque context. Part I analyses cross-lingual approaches oriented to under-resourced languages, and Part II covers the development of the Language Identification system. During the development, data optimization methods and Soft Computing methodologies oriented to complex environments are used in order to overcome the lack of resources. Moreover, three languages coexist in this context: French, Spanish and Basque. Our main goal is the development of robust Automatic Speech Recognition (ASR) systems for Basque, but all language variability has to be analyzed. In this regard, Basque speakers mix not only sounds but also words of the three languages during speech, which results in a strong presence of cross-lingual elements. Besides, Basque is an agglutinative language with a special morpho-syntactic structure inside the words that may lead to intractable vocabularies. Currently, our work is oriented to Information Retrieval and mainly to small internet mass media. In these cases the available resources for Basque in general, and for this task in particular, are very few and complex to process because of the noisy environment. Thus, the methods employed in this development (ontology-based approaches or cross-lingual methodologies oriented to profit from more powerful languages) could suit the requirements of many under-resourced languages.

Link: SpringerLink


Title: Semantic speech recognition in the Basque context Part II: language identification for under-resourced languages

Authors: Nora Barroso, Karmele López de Ipiña, Carmen Hernandez, Aitzol Ezeiza, Manuel Graña

Keywords: Language identification, Under resourced languages, Discriminant analysis, Covariance matrix estimation methods, Semantic speech recognition

Field: Language Technology, Computational Intelligence

Information: Journal || International Journal of Speech Technology || Vol.: 15(1) pp: 41-47 ISSN: 1381-2416

Impact Factor: SNIP (2012): 1,291

Abstract: This paper describes the development of a Language Identification (LID) system oriented to robust Multilingual Speech Recognition in the Basque context, where the LID system is integrated in GorUP, a Semantic Speech Recognition system for complex industrial environments described in Part I. The work presents hybrid strategies for LID, based on the selection of system elements by several classifiers (Support Vector Machines and Multilayer Perceptron) and Discriminant Analysis improved with robust regularized covariance matrix estimation methods oriented to under-resourced languages and stochastic methods for speech recognition tasks (Hidden Markov Models and n-grams). The LID tool manages the main elements of the Automatic Speech Recognition system (Acoustic Phonetic Decoder, Language Model and Lexicons).

Link: SpringerLink