Exploring the Impact of Data Augmentation Techniques on Automatic Speech Recognition System Development: A Comparative Study

doi:10.4316/AECE.2023.03001

PUBLISHER

Stefan cel Mare
University of Suceava

Faculty of Electrical Engineering and
Computer Science

13, Universitatii Street
Suceava - 720229
ROMANIA

Print ISSN: 1582-7445
Online ISSN: 1844-7600
WorldCat: 643243560
doi: 10.4316/AECE

GALIC, J. , GROZDIC, D.

Author keywords
artificial neural networks, audio databases, automatic speech recognition, hidden Markov models, support vector machines

References keywords
speech(22), recognition(15), data(13), augmentation(12), processing(7), audio(7), interspeech(6), signal(5), science(5), whispered(4)

About this article
Date of Publication: 2023-08-31
Volume 23, Issue 3, Year 2023, On page(s): 3 - 12
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2023.03001
Web of Science Accession Number: 001062641900001
SCOPUS ID: 85172345871

Abstract

Automatic Speech Recognition (ASR) systems are known to perform poorly in adverse conditions: they are highly sensitive to such conditions and lack robustness. Because building extensive speech databases is costly and time-consuming, improving robustness through the synthetic generation of speech data from existing natural speech has become a prominent area of research. This paper examines the impact of standard data augmentation techniques (pitch shift, time stretch, volume control, and their combination) on the accuracy of isolated-word ASR systems. The performance of three machine learning models, namely Hidden Markov Models (HMM), Support Vector Machines (SVM), and Convolutional Neural Networks (CNN), is analyzed on two Serbian corpora of isolated words. The Whi-Spe speech database in neutral phonation is used for augmentation and training, and a purpose-built Python software tool performs the augmentation. The experiments demonstrate a statistically significant reduction in the Word Error Rate (WER) for the CNN-based recognizer on both test datasets, achieved with a single augmentation technique based on pitch shifting.
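To make the augmentation idea concrete, the following is a minimal sketch and not the authors' tool, which this page does not describe. It uses only the Python standard library and assumes audio is a list of float samples in [-1, 1]. The function names `apply_gain` and `change_speed` and the test tone are hypothetical. Note that `change_speed` is a naive resampling that alters duration and pitch together. The pitch shift and time stretch named in the abstract are normally done with methods that change one property while preserving the other (for example, a phase vocoder), which are not reproduced here.

```python
import math


def apply_gain(samples, gain_db):
    """Volume control: scale amplitudes by a gain in decibels, clipping to [-1, 1]."""
    factor = 10.0 ** (gain_db / 20.0)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


def change_speed(samples, rate):
    """Naive speed change by linear-interpolation resampling.

    rate > 1 shortens the signal and raises its pitch; rate < 1 does the opposite.
    This is an illustrative stand-in, NOT a pitch-preserving time stretch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / rate)))
    out = []
    for i in range(n_out):
        pos = i * rate          # fractional read position in the input
        lo = int(pos)
        if lo >= len(samples) - 1:
            out.append(samples[-1])
        else:
            frac = pos - lo
            out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


# Hypothetical input: 0.1 s of a 220 Hz tone sampled at 16 kHz.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(1600)]

louder = apply_gain(tone, 6.0)      # about twice the amplitude
faster = change_speed(tone, 1.25)   # 25% shorter, pitch raised as a side effect
```

Each augmented copy would be added to the training set alongside the original recording, which is how augmentation enlarges a corpus without new recordings.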

Cited By

Web of Science® Times Cited: 0
SCOPUS® Times Cited: 2

[1] Whispered Speech Recognition Based on Audio Data Augmentation and Inverse Filtering, Galić, Jovan, Marković, Branko, Grozdić, Đorđe, Popović, Branislav, Šajić, Slavko, Applied Sciences, ISSN 2076-3417, Volume 14, Issue 18, 2024.
Digital Object Identifier: 10.3390/app14188223

[2] An Analysis of Speech Emotion Recognition Based on Hybrid DNN-HMM Framework, Babić, Nebojša, Galić, Jovan, 2023 31st Telecommunications Forum (TELFOR), ISBN 979-8-3503-0313-1, 2023.
Digital Object Identifier: 10.1109/TELFOR59449.2023.10372735

Copyright ©2001-2024
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania

All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.
