Vocal Source Contribution to Speaker Recognition
- Authors: Sorokin V.N.
- Affiliations: Institute for Information Transmission Problems
- Issue: Vol 28, No 3 (2018)
- Pages: 546-556
- Section: Applied Problems
- URL: https://bakhtiniada.ru/1054-6618/article/view/195440
- DOI: https://doi.org/10.1134/S1054661818030197
- ID: 195440
Abstract
The vocal source and the pulse shape of the glottal flow are determined from the regularized ratio of the speech signal spectra over the open-glottis and closed-glottis intervals within each fundamental period. Three databases were used: Russian numerals spoken by 216 men and 177 women, a database obtained by passing the Russian database through a 9.2 kbps codec, and the TIMIT database. The pitch period together with 7 principal-component coefficients of the glottal flow gives an average recognition error below 8% for male speakers on a sequence of 6 vowels. The minimum average recognition error is about 15% for female speakers in the original Russian numerals database, about 15% for male speakers in the codec database, and about 44% for male speakers in TIMIT. In the space of the 7 principal-component coefficients, the minimum average recognition error for male speakers in the Russian database is about 26%, but about 27% of the speakers have an average error below 10%.
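To make the idea of a regularized spectral ratio concrete, here is a minimal sketch in Python. It is not the author's algorithm: the abstract does not give the regularization form, the FFT length, or how the open and closed intervals are located. The function name `regularized_spectral_ratio`, the parameters `n_fft` and `alpha`, and the Tikhonov-style denominator are all assumptions chosen for illustration.

```python
import numpy as np


def regularized_spectral_ratio(open_seg, closed_seg, n_fft=256, alpha=1e-2):
    """Illustrative Tikhonov-style regularized ratio of two spectra.

    open_seg, closed_seg: samples from the open- and closed-glottis
    intervals of one pitch period (segmentation is assumed to be done
    elsewhere; it is not described in the abstract).
    alpha: hypothetical regularization weight, relative to the peak
    power of the closed-interval spectrum.
    """
    s_open = np.fft.rfft(open_seg, n=n_fft)
    s_closed = np.fft.rfft(closed_seg, n=n_fft)
    power = np.abs(s_closed) ** 2
    # The added term keeps the division stable where the
    # closed-interval spectrum is close to zero.
    eps = alpha * power.max() + np.finfo(float).tiny
    return s_open * np.conj(s_closed) / (power + eps)


# Toy check with synthetic data: a unit impulse has a flat spectrum,
# so the ratio of a doubled impulse to the impulse is 2 / (1 + alpha).
impulse = np.zeros(64)
impulse[0] = 1.0
ratio = regularized_spectral_ratio(2 * impulse, impulse, n_fft=64, alpha=0.01)
```

With `alpha` set to zero the expression reduces to a plain spectral division, which becomes unstable near spectral zeros; the regularization term trades a small bias for that stability.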
About the authors
V. N. Sorokin
Institute for Information Transmission Problems
Author for correspondence.
Email: vns@iitp.ru
Russian Federation, Moscow, 127051
