The graphical method of pauses detection in English speech signals

Keywords: English, graphical method, language, speech.



This paper addresses the problem of pause detection in English speech signals. The aim of the study is to create a new method of speech pause detection that is free of the drawbacks other algorithms suffer from. Analysis of the method suggests that it can be used in real-time applications. The article offers a new view of the pause detection problem and a new solution to it. The result of the study, the graphical method, may be applied to real-time signal processing and text-to-speech synthesis, or used to extend knowledge of these problems.



Author biographies

E. V MARTYNOVA, Kazan Federal University

Ekaterina Vladimirovna Martynova received her higher education at KFU, FIA, in 2006-2012, qualifying as a teacher of a foreign language with an additional specialty in a second foreign language. She holds the position of Senior Lecturer, BS, at KFU, Institute of International Relations, Higher School of Foreign Languages and Translation, Department of Foreign Languages (main employee). She is fluent in English and Spanish.

G. R EREMEEVA, Kazan Federal University

Guzel Rinatovna Eremeeva was born on 07/30/1980. She holds the position of Associate Professor, Head University, at the Institute of International Relations, History and Oriental Studies, Higher School of Foreign Languages and Translation, Department of Foreign Languages (main). Academic title: Associate Professor (04/01/2019). Her languages are Kazakh (basic speaker), English (proficient speaker), and Tatar (independent speaker).

G. F VALIEVA, Kazan Federal University

Gulnara Firdusovna Valieva is a senior teacher in the Department of Foreign Languages of the Institute of International Relations at Kazan Federal University. She has devoted more than 8 years to working with future physicists, mathematicians and IT specialists. She is the author of the book "English for Information Security" and of many ELRs, and a certified EduScrum teacher. She gives master classes in educational centers every year and attends workshops and webinars.




How to cite
MARTYNOVA, E. V., EREMEEVA, G. R., & VALIEVA, G. F. (2019). The graphical method of pauses detection in English speech signals. Utopía Y Praxis Latinoamericana, 24(1), 26-31. Retrieved from