
Izmeritel`naya Tekhnika

Method for measuring the intensity of speech vowel sounds flow for audiovisual dialogue information systems

https://doi.org/10.32446/0368-1025it.2022-3-65-72

Abstract

In this paper, we consider audiovisual data processing and the interaction of two modalities for predicting a user's emotional state in dialogue information systems. The audio modality is used for real-time detection of emotional speech segments. As an indicator of the level of speech emotionality, we propose estimating the intensity of the flow of vowels in the user's speech signal at the input of the information system. A novel method for measuring this indicator is proposed, based on the empirical probability of the appearance of vowels in the speech signal. An example of its practical implementation in soft real time is provided. Using the authors' software, a full-scale experiment was set up and carried out. The advantages of the method, namely its high speed and its sensitivity to the level of emotionality in users' speech, are shown. The results are intended for developers of modern information systems with an audiovisual user interface.
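
The abstract does not spell out the estimator. One simple reading of "the empirical probability of the appearance of vowels" is the fraction of recent analysis frames that were labelled as vowels. The Python sketch below illustrates only that reading. It assumes a hypothetical upstream vowel detector that emits one boolean per frame; the function name vowel_flow_intensity and the window parameter are illustrative and are not taken from the paper.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def vowel_flow_intensity(frame_is_vowel: Iterable[bool], window: int) -> Iterator[float]:
    """Yield, after each frame, the share of vowel frames in the last `window` frames.

    `frame_is_vowel` is assumed to come from some external vowel detector
    (hypothetical here); this function only does the counting.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent: Deque[bool] = deque(maxlen=window)
    vowel_count = 0
    for flag in frame_is_vowel:
        is_vowel = bool(flag)
        # When the window is full, the oldest frame is evicted by the append below.
        if len(recent) == window and recent[0]:
            vowel_count -= 1
        recent.append(is_vowel)
        vowel_count += int(is_vowel)
        # Empirical probability: vowel frames divided by frames seen in the window.
        yield vowel_count / len(recent)


if __name__ == "__main__":
    # Toy decisions for six consecutive frames (made-up data, not from the paper).
    flags = [True, False, True, True, False, False]
    for index, value in enumerate(vowel_flow_intensity(flags, window=4)):
        print(f"frame {index}: intensity estimate {value:.2f}")
```

Keeping a running count makes each update constant-time, which suits the soft real-time setting the abstract mentions. The window length trades responsiveness against the variance of the estimate.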

About the Authors

A. V. Savchenko
National Research University Higher School of Economics
Russian Federation

Andrey V. Savchenko

Nizhniy Novgorod



V. V. Savchenko
Nizhny Novgorod State Linguistic University
Russian Federation

Vladimir V. Savchenko

Nizhniy Novgorod



For citations:


Savchenko A.V., Savchenko V.V. Method for measuring the intensity of speech vowel sounds flow for audiovisual dialogue information systems. Izmeritel`naya Tekhnika. 2022;(3):65-72. (In Russ.) https://doi.org/10.32446/0368-1025it.2022-3-65-72

ISSN 0368-1025 (Print)
ISSN 2949-5237 (Online)