

Method of a voice source acoustic analysis in real time
https://doi.org/10.32446/0368-1025it.2025-4-64-73
Abstract
This paper addresses the non-invasive study of the vocal function of the speech apparatus from a speaker's speech signal. A new method for acoustic analysis of a pulse-type voice source is developed, based on a two-stage measurement procedure. The first stage extracts the voice excitation signal of the vocal tract by filtering; the second converts this signal into a final pulse sequence synchronous with the fundamental tone (pitch) of the speech signal. An example software implementation of the method is described, and estimates of its computational complexity and speed are given. The method is shown to be usable in soft real time, that is, with a delay of hundredths of a second. A full-scale experiment was set up and conducted using the authors' software. It shows that, over limited intervals of voiced speech, the method keeps the repetition rate and shape of the excitation pulses stable, which matters for the accuracy of measuring all the main parameters of the voice source, from the fundamental frequency to the amplitude perturbations (shimmer) of the source pulses. The results will be useful for developing new and upgrading existing algorithms and technologies for speech synthesis and for digital speech transmission over low-bit-rate communication channels, as well as for medical diagnostics and voice therapy systems.
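The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general two-stage idea under stated assumptions. Stage 1 is approximated here by classical linear-prediction inverse filtering, and stage 2 by picking one residual peak per pitch period; this is a stand-in, not the authors' procedure. All function names, the model order, the pitch search range of 60 to 400 Hz, and the synthetic test signal are hypothetical.

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients a[0..order] (a[0] = 1), autocorrelation method."""
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx] + 1e-9 * r[0] * np.eye(order)  # light regularization (assumption)
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:])))

def excitation(x, order=10):
    """Stage 1 (assumed form): inverse-filter the frame to approximate the excitation."""
    return np.convolve(x, lpc(x, order))[:len(x)]

def pitch_period(e, fs, f_lo=60.0, f_hi=400.0):
    """Pitch period in samples: autocorrelation maximum within the search range."""
    lags = np.arange(int(fs / f_hi), int(fs / f_lo) + 1)
    ac = [np.dot(e[:-k], e[k:]) for k in lags]
    return int(lags[int(np.argmax(ac))])

def pulse_positions(e, fs):
    """Stage 2 (assumed form): keep one pulse per pitch period, at the peak of |e|."""
    t0 = pitch_period(e, fs)
    mag = np.abs(e)
    pulses = [int(np.argmax(mag[:t0]))]
    while True:
        lo = pulses[-1] + int(0.7 * t0)
        hi = pulses[-1] + int(1.3 * t0) + 1
        if hi > len(mag):  # stop when a full search window no longer fits
            break
        pulses.append(lo + int(np.argmax(mag[lo:hi])))
    return np.array(pulses)

def source_parameters(e, pulses, fs):
    """Fundamental frequency (Hz) and relative shimmer from the pulse sequence."""
    amps = np.abs(e[pulses])
    f0 = fs / np.diff(pulses).mean()
    shimmer = np.abs(np.diff(amps)).mean() / amps.mean()
    return f0, shimmer

# Synthetic check (hypothetical data): a 100 Hz pulse train through two resonances.
fs = 8000
u = np.zeros(1600)
u[40::80] = 1.0
a_true = np.convolve([1.0, -2 * 0.95 * np.cos(2 * np.pi * 500 / fs), 0.95 ** 2],
                     [1.0, -2 * 0.95 * np.cos(2 * np.pi * 1500 / fs), 0.95 ** 2])
x = np.zeros_like(u)
for n in range(len(u)):
    x[n] = u[n] - sum(a_true[k] * x[n - k] for k in range(1, 5) if n - k >= 0)

e = excitation(x, order=4)
pulses = pulse_positions(e, fs)
f0, shimmer = source_parameters(e, pulses, fs)
print(f0, shimmer)
```

On real speech the frame would be a short voiced segment, and the model order and search range would need tuning to the sampling rate and the speaker. Because the pulses are tied to individual pitch periods, period-to-period measures such as shimmer can be read directly from the pulse amplitudes.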
About the Authors
V. V. Savchenko
Russian Federation
Vladimir V. Savchenko
Nizhny Novgorod
L. V. Savchenko
Russian Federation
Lyudmila V. Savchenko
Nizhny Novgorod
For citations:
Savchenko V.V., Savchenko L.V. Method of a voice source acoustic analysis in real time. Izmeritel'naya Tekhnika. 2025;74(4):64-73. (In Russ.) https://doi.org/10.32446/0368-1025it.2025-4-64-73