2/2017 - 11
Speech Rate Control for Improving Elderly Speech Recognition of Smart Devices
SON, G., KWON, S., LIM, Y.
Author keywords
automatic speech recognition, human computer interaction, speech analysis, man machine systems, human factor
References keywords
speech(15), time(4), communication(4), aging(4)
About this article
Date of Publication: 2017-05-31
Volume 17, Issue 2, Year 2017, On page(s): 79 - 84
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2017.02011
Web of Science Accession Number: 000405378100011
SCOPUS ID: 85020117598
Abstract
Although smart devices have become a widely adopted tool for communication in modern society, they still present a steep learning curve for the elderly. A voice-based interface built on voice recognition technology can make smart devices more user-friendly and useful to the elderly. However, the voice recognition technology used in current devices is attuned to the voice patterns of the young, so speech recognition falters when an elderly user speaks into the device. This paper identifies the improper per-syllable speech rate of elderly speakers as one cause of failure in the voice recognition system. After the speech rate of each syllable was modified, the voice recognition rate increased by 12.3%. The result demonstrates that modifying the per-syllable speech rate alone, which is only one of the factors that cause errors in voice recognition, can substantially increase the recognition rate. Such improvements can make it easier for the elderly to operate smart devices, allowing them to be more socially connected in a mobile world and to access information at their fingertips. They may also help bridge the communication divide between generations.
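As a rough illustration of what adjusting the speech rate of each syllable could involve, the sketch below computes one time-scale factor per syllable from a list of syllable boundary times. It is not the authors' method: the function names, the idea of normalising every syllable toward a single target duration, and the clipping range are all assumptions made for this example, and the abstract does not say how the boundaries or target rates are obtained.

```python
from typing import List


def syllable_scale_factors(
    boundaries: List[float],
    target_duration: float,
    min_factor: float = 0.5,
    max_factor: float = 2.0,
) -> List[float]:
    """Return one time-scale factor per syllable (hypothetical helper).

    boundaries: ascending syllable boundary times in seconds
                (n + 1 values describe n syllables).
    target_duration: assumed desired duration of each syllable in seconds.
    A factor above 1 lengthens the syllable, below 1 shortens it.
    Factors are clipped to [min_factor, max_factor], an assumed safety range.
    """
    factors = []
    for start, end in zip(boundaries, boundaries[1:]):
        duration = end - start
        if duration <= 0:
            raise ValueError("boundaries must be strictly increasing")
        factor = target_duration / duration
        factors.append(min(max_factor, max(min_factor, factor)))
    return factors


def rescaled_boundaries(boundaries: List[float], factors: List[float]) -> List[float]:
    """Return the boundary times after each syllable is scaled by its factor."""
    out = [boundaries[0]]
    for (start, end), factor in zip(zip(boundaries, boundaries[1:]), factors):
        out.append(out[-1] + (end - start) * factor)
    return out


if __name__ == "__main__":
    # Three syllables: one slow, one very fast, one moderately slow.
    bounds = [0.0, 0.5, 0.6, 1.0]
    factors = syllable_scale_factors(bounds, target_duration=0.25)
    print(factors)
    print(rescaled_boundaries(bounds, factors))
```

The factors would then be handed to a time-scale modification routine that changes duration without shifting pitch; that step is omitted here because the abstract does not specify which technique is applied.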
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.