26.1.12.2 Speech Analysis, other than Recognition

Speech. Analysis of aspects other than what is said.

Howard, Jr., J.H.[James H.],
Feature selection in human auditory perception,
PR(15), No. 5, 1982, pp. 397-403.
Elsevier DOI 0309
BibRef

Thomason, M.G., Granum, E., Blake, R.E.,
Experiments in dynamic programming inference of Markov networks with strings representing speech data,
PR(19), No. 5, 1986, pp. 343-352.
Elsevier DOI 0309
BibRef

Hochberg, J., Mniszewski, S.M., Calleja, T., Papcun, G.J.,
A default hierarchy for pronouncing English,
PAMI(13), No. 9, September 1991, pp. 957-964.
IEEE DOI 0401
BibRef

Carlson, B.A., Clements, M.A.,
A computationally compact divergence measure for speech processing,
PAMI(13), No. 12, December 1991, pp. 1255-1260.
IEEE DOI 0401
BibRef

Tacer, B.[Berkant], Loughlin, P.J.[Patrick J.],
Non-stationary signal classification using the joint moments of time-frequency distributions,
PR(31), No. 11, November 1998, pp. 1635-1641.
Elsevier DOI 0401
BibRef

Li, M., McAllister, H.G., Black, N.D., de Perez, T.A.,
Wavelet-based nonlinear AGC method for hearing aid loudness compensation,
VISP(147), No. 6, December 2000, pp. 502-507. 0101
BibRef

Gray, P., Hollier, M.P., Massara, R.E.,
Non-intrusive speech-quality assessment using vocal-tract models,
VISP(147), No. 6, December 2000, pp. 493-501. 0101
BibRef

Sarkar, S., Poor, H.V.,
Multirate signal processing on finite fields,
VISP(148), No. 4, August 2001, pp. 254-262. 0201
BibRef

Mumolo, E.[Enzo],
Spectral domain texture analysis for speech enhancement,
PR(35), No. 10, October 2002, pp. 2181-2191.
Elsevier DOI 0206
BibRef

Ding, Z.O., McLoughlin, I.V., Tan, E.C.,
Extension of proposal of standards for intelligibility tests of Chinese speech: CDRT-tone,
VISP(150), No. 1, February 2003, pp. 1-5.
IEEE Top Reference. 0304
BibRef

de Lamare, R.C., Alcaim, A.,
Strategies to improve the performance of very low bit rate speech coders and application to a variable rate 1.2 kb/s codec,
VISP(152), No. 1, February 2005, pp. 74-86.
IEEE Abstract. 0501
BibRef

Vera-Candeas, P., Ruiz-Reyes, N., Rosa-Zurera, M., Lopez-Ferreras, F., Curpian-Alonso, J.,
New matching pursuit based sinusoidal modelling method for audio coding,
VISP(151), No. 1, February 2004, pp. 21-28.
IEEE Abstract. 0403
BibRef

Vera-Candeas, P.[Pedro], Ruiz-Reyes, N.[Nicolás], Rosa-Zurera, M.[Manuel], Cuevas-Martinez, J.C.[Juan C.], López-Ferreras, F.[Francisco],
Adaptive Signal Models for Wide-Band Speech and Audio Compression,
IbPRIA05(II:571).
Springer DOI 0509
BibRef

Li, C., Li, S., Zhang, D., Chen, G.,
Cryptanalysis of a data security protection scheme for VoIP,
VISP(153), No. 1, February 2006, pp. 1-10.
DOI Link 0602
BibRef

Sandler, M., Black, D.,
Scalable audio coding for compression and loss resilient streaming,
VISP(153), No. 3, June 2006, pp. 331-339.
DOI Link 0608
BibRef

Guido, R.C.[Rodrigo Capobianco], Pereira, J.C.[Jose Carlos], Slaets, J.F.W.[Jan Frans Willem],
Introduction to the Special Issue: Advances on pattern recognition for speech and audio processing,
PRL(28), No. 11, 1 August 2007, pp. 1283-1284.
Elsevier DOI 0706
BibRef

Chang, J.H.[Joon-Hyuk], Gazor, S.[Saeed], Kim, N.S.[Nam Soo], Mitra, S.K.[Sanjit K.],
Multiple statistical models for soft decision in noisy speech enhancement,
PR(40), No. 3, March 2007, pp. 1123-1134.
Elsevier DOI 0611
Speech enhancement; DCT; Multiple statistical model; Gaussian; Laplacian; Gamma; GOF; PSFM; SAP; PESQ BibRef

Frankel, J.[Joe], King, S.[Simon],
Factoring Gaussian precision matrices for linear dynamic models,
PRL(28), No. 16, December 2007, pp. 2264-2272.
Elsevier DOI 0711
Linear dynamic model; Error distribution; Precision matrix Speech. BibRef

Arias-Londono, J.D.[Julian D.], Godino-Llorente, J.I.[Juan I.], Saenz-Lechon, N.[Nicolas], Osma-Ruiz, V.[Victor], Castellanos-Dominguez, C.G.[Cesar German],
An improved method for voice pathology detection by means of a HMM-based feature space transformation,
PR(43), No. 9, September 2010, pp. 3100-3112.
Elsevier DOI 1006
Pathological voice; Hidden Markov models; Minimum classification error; Dynamic feature space transformation BibRef

Mahdi, A.E.[Abdulhussain E.], Picovici, D.[Dorel],
New single-ended objective measure for non-intrusive speech quality evaluation,
SIViP(4), No. 1, March 2010, pp. xx-yy.
Springer DOI 1003
BibRef

Guijarrubia, V.G.[Víctor G.], Torres, M.I.[M. Inés],
Text- and speech-based phonotactic models for spoken language identification of Basque and Spanish,
PRL(31), No. 6, 15 April 2010, pp. 523-532.
Elsevier DOI 1004
BibRef
Earlier:
Comparative Study of Several Phonotactic-Based Approaches to Spanish-Basque Language Identification,
CIARP08(128-135).
Springer DOI 0809
BibRef
Earlier:
Phone-Segments Based Language Identification for Spanish, Basque and English,
CIARP07(106-114).
Springer DOI 0711
BibRef
And:
Language Identification Based on Phone Decoding for Basque and Spanish,
IbPRIA07(I: 233-240).
Springer DOI 0706
Language identification; Phone decoding; PPRLM BibRef

Shafiee, S.[Soheil], Almasganj, F.[Farshad], Vazirnezhad, B.[Bahram], Jafari, A.[Ayyoob],
A two-stage speech activity detection system considering fractal aspects of prosody,
PRL(31), No. 9, 1 July 2010, pp. 936-948.
Elsevier DOI 1004
Speech activity detection; Prosody; Fractal dimension BibRef

Yoon, J.Y.[Jae-Yul], Park, H.[Hochong],
Improving the Speech Quality of VoIP by Packet Prioritization,
SPLetters(18), No. 12, December 2011, pp. 725-728.
IEEE DOI 1112
BibRef

Dennis, J., Tran, H.D., Li, H.,
Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions,
SPLetters(18), No. 2, February 2011, pp. 130-133.
IEEE DOI 1101
BibRef

Liang, Y.[Yuan], Liu, X.L.[Xiang-Long], Lou, Y.H.[Yi-Hua], Shan, B.S.[Bao-Song],
An improved noise-robust voice activity detector based on hidden semi-Markov models,
PRL(32), No. 7, 1 May 2011, pp. 1044-1053.
Elsevier DOI 1101
Voice activity detection; State duration; Observation distribution; Hidden semi-Markov model; Likelihood ratio test; Forward variable BibRef

Liu, X.L.[Xiang-Long], Liang, Y.[Yuan], Lou, Y.H.[Yi-Hua], Li, H.[He], Shan, B.S.[Bao-Song],
Noise-Robust Voice Activity Detector Based on Hidden Semi-Markov Models,
ICPR10(81-84).
IEEE DOI 1008
BibRef

Mohanty, M.N.[Mihir Narayan], Jena, B.[Bhagyalaxmi],
Analysis of stressed human speech,
IJCVR(2), No. 2, 2011, pp. 180-187.
DOI Link 1109
BibRef

Lopez-Moreno, I., Ramos, D., Gonzalez-Dominguez, J., Gonzalez-Rodriguez, J.,
Von Mises-Fisher Models in the Total Variability Subspace for Language Recognition,
SPLetters(18), No. 12, December 2011, pp. 705-708.
IEEE DOI 1112
BibRef

Jelassi, S.[Sofiene], Rubino, G.[Gerardo],
A study of artificial speech quality assessors of VoIP calls subject to limited bursty packet losses,
JIVP(2011), No. 1, 2011, pp. xx-yy.
DOI Link 1203
BibRef

Ben Aicha, A.[Anis], Ben Jebara, S.[Sofia],
Reduction of musical residual noise using perceptual tools with classic speech denoising techniques,
SIViP(6), No. 1, March 2012, pp. 85-97.
WWW Link. 1203
BibRef

Pulakka, H., Laaksonen, L., Myllyla, V., Yrttiaho, Y., Alku, P.,
Conversational Evaluation of Speech Bandwidth Extension Using a Mobile Handset,
SPLetters(19), No. 4, April 2012, pp. 203-206.
IEEE DOI 1203
BibRef

Liang, S.[Shan], Liu, W.J.[Wen-Ju], Jiang, W.[Wei],
Integrating Binary Mask Estimation With MRF Priors of Cochleagram for Speech Separation,
SPLetters(19), No. 10, October 2012, pp. 627-630.
IEEE DOI 1209
BibRef

Esch, T., Rungeler, M., Heese, F., Vary, P.,
Estimation of Rapidly Time-Varying Harmonic Noise for Speech Enhancement,
SPLetters(19), No. 10, October 2012, pp. 659-662.
IEEE DOI 1209
BibRef

Safavi, S., Hanani, A., Russell, M., Jancovic, P., Carey, M.J.,
Contrasting the Effects of Different Frequency Bands on Speaker and Accent Identification,
SPLetters(19), No. 12, December 2012, pp. 829-832.
IEEE DOI 1212
BibRef

Safavi, S., Khan, U.A.,
Revisiting Finite-Time Distributed Algorithms via Successive Nulling of Eigenvalues,
SPLetters(22), No. 1, January 2015, pp. 54-57.
IEEE DOI 1410
directed graphs BibRef

Wu, Z.Z.[Zhi-Zheng], Kinnunen, T., Chng, E.S.[Eng Siong], Li, H.Z.[Hai-Zhou],
Mixture of Factor Analyzers Using Priors From Non-Parallel Speech for Voice Conversion,
SPLetters(19), No. 12, December 2012, pp. 914-917.
IEEE DOI 1212
BibRef

Valero, X., Alias, F.,
Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification,
MultMed(14), No. 6, 2012, pp. 1684-1689.
IEEE DOI 1212
BibRef

Weninger, F.[Felix], Krajewski, J.[Jarek], Batliner, A.[Anton], Schuller, B.[Björn],
The Voice of Leadership: Models and Performances of Automatic Analysis in Online Speeches,
AffCom(3), No. 4, 2012, pp. 496-508.
IEEE DOI 1302
BibRef

Gerkmann, T., Krawczyk, M.,
MMSE-Optimal Spectral Amplitude Estimation Given the STFT-Phase,
SPLetters(20), No. 2, February 2013, pp. 129-132.
IEEE DOI 1302
BibRef

Kim, H.G.[Han-Gyu], Jang, G.J.[Gil-Jin], Park, J.S.[Jeong-Sik], Kim, J.H.[Ji-Hwan], Oh, Y.H.[Yung-Hwan],
Particle filtering based pitch sequence correction for monaural speech segregation,
IJIST(23), No. 1, March 2013, pp. 64-70.
DOI Link 1303
BibRef

Dessein, A., Cont, A.,
An Information-Geometric Approach to Real-Time Audio Segmentation,
SPLetters(20), No. 4, April 2013, pp. 331-334.
IEEE DOI 1303
BibRef

Drugman, T.,
Residual Excitation Skewness for Automatic Speech Polarity Detection,
SPLetters(20), No. 4, April 2013, pp. 387-390.
IEEE DOI 1303
BibRef

Yadav, J., Rao, K.S.,
Detection of Vowel Offset Point From Speech Signal,
SPLetters(20), No. 4, April 2013, pp. 299-302.
IEEE DOI 1303
BibRef

Mohammadiha, N., Martin, R., Leijon, A.,
Spectral Domain Speech Enhancement Using HMM State-Dependent Super-Gaussian Priors,
SPLetters(20), No. 3, March 2013, pp. 253-256.
IEEE DOI 1303
BibRef

Taal, C.H., Jensen, J., Leijon, A.,
On Optimal Linear Filtering of Speech for Near-End Listening Enhancement,
SPLetters(20), No. 3, March 2013, pp. 225-228.
IEEE DOI 1303
BibRef

Teng, P., Jia, Y.,
Voice Activity Detection Via Noise Reducing Using Non-Negative Sparse Coding,
SPLetters(20), No. 5, May 2013, pp. 475-478.
IEEE DOI 1304
BibRef

Romoli, L., Cecchi, S., Piazza, F.,
A Combined Approach for Channel Decorrelation in Stereo Acoustic Echo Cancellation Exploiting Time-Varying Frequency Shifting,
SPLetters(20), No. 7, 2013, pp. 717-720.
IEEE DOI 1307
BibRef

Szurley, J., Bertrand, A., Moonen, M.,
On the Use of Time-Domain Widely Linear Filtering for Binaural Speech Enhancement,
SPLetters(20), No. 7, 2013, pp. 649-652.
IEEE DOI 1307
speech enhancement BibRef

Sarria-Paja, M., Falk, T.H.,
Whispered Speech Detection in Noise Using Auditory-Inspired Modulation Spectrum Features,
SPLetters(20), No. 8, 2013, pp. 783-786.
IEEE DOI 1307
Gaussian processes BibRef

Ramirez, M.A.,
Intra-Predictive Switched Split Vector Quantization of Speech Spectra,
SPLetters(20), No. 8, 2013, pp. 791-794.
IEEE DOI 1307
Gaussian processes BibRef

Ying, D., Yan, Y.,
Robust and Fast Localization of Single Speech Source Using a Planar Array,
SPLetters(20), No. 9, 2013, pp. 909-912.
IEEE DOI 1308
Concave cost function BibRef

Moller, S., Heusdens, R.,
Objective Estimation of Speech Quality for Communication Systems,
PIEEE(101), No. 9, 2013, pp. 1955-1967.
IEEE DOI 1309
Prediction models BibRef

Mowlaee, P., Saeidi, R.,
Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement,
SPLetters(20), No. 12, 2013, pp. 1235-1239.
IEEE DOI 1311
Delays BibRef

Kulmer, J., Mowlaee, P.,
Phase Estimation in Single Channel Speech Enhancement Using Phase Decomposition,
SPLetters(22), No. 5, May 2015, pp. 598-602.
IEEE DOI 1411
Harmonic analysis BibRef

Ganapathy, S., Pelecanos, J.,
Enhancing Frequency Shifted Speech Signals in Single Side-Band Communication,
SPLetters(20), No. 12, 2013, pp. 1231-1234.
IEEE DOI 1311
radio receivers BibRef

Traa, J., Smaragdis, P.,
A Wrapped Kalman Filter for Azimuthal Speaker Tracking,
SPLetters(20), No. 12, 2013, pp. 1257-1260.
IEEE DOI 1311
Approximation methods BibRef

Hu, P.F.[Peng-Fei], Liu, W.J.[Wen-Ju], Jiang, W.[Wei], Yang, Z.L.[Zhan-Lei],
Latent topic model for audio retrieval,
PR(47), No. 3, 2014, pp. 1138-1143.
Elsevier DOI 1312
Topic model BibRef

Drugman, T.,
Maximum Phase Modeling for Sparse Linear Prediction of Speech,
SPLetters(21), No. 2, February 2014, pp. 185-189.
IEEE DOI 1402
filtering theory BibRef

Xu, Y.[Yong], Du, J.[Jun], Dai, L.R.[Li-Rong], Lee, C.H.[Chin-Hui],
An Experimental Study on Speech Enhancement Based on Deep Neural Networks,
SPLetters(21), No. 1, January 2014, pp. 65-68.
IEEE DOI 1402
BibRef

Jin, Y.G.[Yu Gwang], Shin, J.W.[Jong Won], Kim, N.S.[Nam Soo],
Spectro-Temporal Filtering for Multichannel Speech Enhancement in Short-Time Fourier Transform Domain,
SPLetters(21), No. 3, March 2014, pp. 352-355.
IEEE DOI 1403
Fourier transforms BibRef

Kwon, K.[Kisoo], Shin, J.W.[Jong Won], Kim, N.S.[Nam Soo],
NMF-Based Speech Enhancement Using Bases Update,
SPLetters(22), No. 4, April 2015, pp. 450-454.
IEEE DOI 1411
matrix decomposition BibRef

Arsikere, H., Lulich, S.M., Alwan, A.,
Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs,
SPLetters(21), No. 2, February 2014, pp. 159-162.
IEEE DOI 1402
Gaussian processes BibRef

He, L., Zhang, J., Liu, Q., Yin, H., Lech, M.,
Automatic Evaluation of Hypernasality and Consonant Misarticulation in Cleft Palate Speech,
SPLetters(21), No. 10, October 2014, pp. 1298-1301.
IEEE DOI 1407
Accuracy BibRef

Nathwani, K., Pandit, P., Hegde, R.M.,
Group Delay Based Methods for Speaker Segregation and its Application in Multimedia Information Retrieval,
MultMed(15), No. 6, 2013, pp. 1326-1339.
IEEE DOI 1309
Correlation BibRef

Xie, D.[Danhui], Zhang, W.B.[Wei-Bin],
Estimating Speech Spectral Amplitude Based on the Nakagami Approximation,
SPLetters(21), No. 11, November 2014, pp. 1375-1379.
IEEE DOI 1408
Gaussian distribution BibRef

Drugman, T., Stylianou, Y.,
Fast Inter-Harmonic Reconstruction for Spectral Envelope Estimation in High-Pitched Voices,
SPLetters(21), No. 11, November 2014, pp. 1418-1422.
IEEE DOI 1408
harmonic analysis BibRef

Drugman, T., Stylianou, Y., Kida, Y., Akamine, M.,
Voice Activity Detection: Merging Source and Filter-based Information,
SPLetters(23), No. 2, February 2016, pp. 252-256.
IEEE DOI 1602
filtering theory BibRef

Zheng, C.S.[Cheng-Shi], Peng, R.H.[Ren-Hua], Li, J.[Jian], Li, X.D.[Xiao-Dong],
A Constrained MMSE LP Residual Estimator for Speech Dereverberation in Noisy Environments,
SPLetters(21), No. 12, December 2014, pp. 1462-1466.
IEEE DOI 1410
least mean squares methods BibRef

Sarma, B.D., Prasanna, S.R.M.,
Analysis of Vocal Tract Constrictions using Zero Frequency Filtering,
SPLetters(21), No. 12, December 2014, pp. 1481-1485.
IEEE DOI 1410
filtering theory BibRef

Kim, M., Smaragdis, P.,
Mixtures of Local Dictionaries for Unsupervised Speech Enhancement,
SPLetters(22), No. 3, March 2015, pp. 293-297.
IEEE DOI 1410
Dictionaries BibRef

Kleijn, W.B., Hendriks, R.C.,
A Simple Model of Speech Communication and its Application to Intelligibility Enhancement,
SPLetters(22), No. 3, March 2015, pp. 303-307.
IEEE DOI 1410
Auditory system BibRef

Ko, Y.J.[Young-Joong],
New feature weighting approaches for speech-act classification,
PRL(51), No. 1, 2015, pp. 107-111.
Elsevier DOI 1412
Natural language processing BibRef

Degottex, G.,
A Time Regularization Technique for Discrete Spectral Envelopes Through Frequency Derivative,
SPLetters(22), No. 7, July 2015, pp. 978-982.
IEEE DOI 1412
Cepstral analysis BibRef

Mysore, G.J.,
Can we Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech?: A Dataset, Insights, and Challenges,
SPLetters(22), No. 8, August 2015, pp. 1006-1010.
IEEE DOI 1502
audio recording BibRef

Doclo, S., Kellermann, W., Makino, S., Nordholm, S.E.,
Multichannel Signal Enhancement Algorithms for Assisted Listening Devices: Exploiting spatial diversity using multiple microphones,
SPMag(32), No. 2, March 2015, pp. 18-30.
IEEE DOI 1503
audio signal processing BibRef

Kowalczyk, K., Thiergart, O., Taseska, M., del Galdo, G., Pulkki, V., Habets, E.A.P.,
Parametric Spatial Sound Processing: A flexible and efficient solution to sound scene acquisition, modification, and reproduction,
SPMag(32), No. 2, March 2015, pp. 31-42.
IEEE DOI 1503
audio signal processing BibRef

Kleijn, W.B., Crespo, J.B., Hendriks, R.C., Petkov, P., Sauert, B., Vary, P.,
Optimizing Speech Intelligibility in a Noisy Environment: A unified view,
SPMag(32), No. 2, March 2015, pp. 43-54.
IEEE DOI 1503
speech enhancement BibRef

Gerkmann, T., Krawczyk-Becker, M., Le Roux, J.,
Phase Processing for Single-Channel Speech Enhancement: History and recent advances,
SPMag(32), No. 2, March 2015, pp. 55-66.
IEEE DOI 1503
array signal processing BibRef

Wouters, J., McDermott, H.J., Francart, T.,
Sound Coding in Cochlear Implants: From electric pulses to hearing,
SPMag(32), No. 2, March 2015, pp. 67-80.
IEEE DOI 1503
acoustic signal processing BibRef

Betlehem, T., Zhang, W.[Wen], Poletti, M.A., Abhayapala, T.D.,
Personal Sound Zones: Delivering interface-free audio to multiple listeners,
SPMag(32), No. 2, March 2015, pp. 81-91.
IEEE DOI 1503
audio signal processing BibRef

Valimaki, V., Franck, A., Ramo, J., Gamper, H., Savioja, L.,
Assisted Listening Using a Headset: Enhancing audio perception in real, augmented, and virtual environments,
SPMag(32), No. 2, March 2015, pp. 92-99.
IEEE DOI 1503
audio signal processing BibRef

Sunder, K., He, J.J.[Jian-Jun], Tan, E.L.[Ee Leng], Gan, W.S.[Woon-Seng],
Natural Sound Rendering for Headphones: Integration of signal processing techniques,
SPMag(32), No. 2, March 2015, pp. 100-113.
IEEE DOI 1503
audio signal processing BibRef

Falk, T.H., Parsa, V., Santos, J.F., Arehart, K., Hazrati, O., Huber, R., Kates, J.M., Scollie, S.,
Objective Quality and Intelligibility Prediction for Users of Assistive Listening Devices: Advantages and limitations of existing tools,
SPMag(32), No. 2, March 2015, pp. 114-124.
IEEE DOI 1503
hearing aids BibRef

Saeedi, J.[Jamal], Ahadi, S.M.[Seyed Mohammad], Faez, K.[Karim],
Robust voice activity detection directed by noise classification,
SIViP(9), No. 3, March 2015, pp. 561-572.
WWW Link. 1503
BibRef

Ozawa, K.[Kenji], Tsukahara, S.[Shota], Kinoshita, Y.[Yuichiro], Morise, M.[Masanori],
Instantaneous Evaluation of the Sense of Presence in Audio-Visual Content,
IEICE(E98-D), No. 1, January 2015, pp. 49-57.
WWW Link. 1503
BibRef

Ozawa, K.[Kenji], Tsukahara, S.[Shota], Kinoshita, Y.[Yuichiro], Morise, M.[Masanori],
Development of an Estimation Model for Instantaneous Presence in Audio-Visual Content,
IEICE(E99-D), No. 1, January 2016, pp. 120-127.
WWW Link. 1601
BibRef

Yao, X., Jitsuhiro, T., Miyajima, C., Kitaoka, N., Takeda, K.,
Modeling of Physical Characteristics of Speech under Stress,
SPLetters(22), No. 10, October 2015, pp. 1801-1805.
IEEE DOI 1506
Atmospheric modeling BibRef

Adiga, N., Prasanna, S.R.M.,
Detection of Glottal Activity Using Different Attributes of Source Information,
SPLetters(22), No. 11, November 2015, pp. 2107-2111.
IEEE DOI 1509
feature extraction BibRef

Tong, R.J.[Ren-Jie], Bao, G.Z.[Guang-Zhao], Ye, Z.F.[Zhong-Fu],
A Higher Order Subspace Algorithm for Multichannel Speech Enhancement,
SPLetters(22), No. 11, November 2015, pp. 2004-2008.
IEEE DOI 1509
AWGN BibRef

Tong, R.J.[Ren-Jie], Ye, Z.F.[Zhong-Fu],
Supplementations to the Higher Order Subspace Algorithm for Suppression of Spatially Colored Noise,
SPLetters(24), No. 5, May 2017, pp. 668-672.
IEEE DOI 1704
Colored noise BibRef

Meenakshi, G.N., Ghosh, P.K.,
Robust Whisper Activity Detection Using Long-Term Log Energy Variation of Sub-Band Signal,
SPLetters(22), No. 11, November 2015, pp. 1859-1863.
IEEE DOI 1509
signal detection BibRef

Hsu, C.C.[Chung-Chien], Cheong, K.M.[Kah-Meng], Chi, T.S.[Tai-Shih], Tsao, Y.[Yu],
Robust Voice Activity Detection Algorithm Based on Feature of Frequency Modulation of Harmonics and Its DSP Implementation,
IEICE(E98-D), No. 10, October 2015, pp. 1808-1817.
WWW Link. 1511
BibRef

Tavares, R., Coelho, R.,
Speech Enhancement with Nonstationary Acoustic Noise Detection in Time Domain,
SPLetters(23), No. 1, January 2016, pp. 6-10.
IEEE DOI 1601
speech enhancement BibRef

Lachachi, N.E.[Nour-Eddine], Adla, A.[Abdelkader],
Two approaches-based L2-SVMs reduced to MEB problems for dialect identification,
IJCVR(6), No. 1-2, 2016, pp. 1-18.
DOI Link 1601
BibRef

Gholami-Boroujeny, S.[Shiva], Fallatah, A.[Anwar], Heffernan, B.P.[Brian P.], Dajani, H.R.[Hilmi R.],
Neural network-based adaptive noise cancellation for enhancement of speech auditory brainstem responses,
SIViP(10), No. 1, February 2016, pp. 389-395.
Springer DOI 1601
BibRef

Luo, Y.[You], Bao, G.Z.[Guang-Zhao], Xu, Y.F.[Yang-Fei], Ye, Z.F.[Zhong-Fu],
Supervised Monaural Speech Enhancement Using Complementary Joint Sparse Representations,
SPLetters(23), No. 2, February 2016, pp. 237-241.
IEEE DOI 1602
BibRef

Braun, S., Habets, E.A.P.,
Online Dereverberation for Dynamic Scenarios Using a Kalman Filter With an Autoregressive Model,
SPLetters(23), No. 12, December 2016, pp. 1741-1745.
IEEE DOI 1612
Fourier transforms BibRef

Chakrabarty, S., Habets, E.A.P.,
On the Numerical Instability of an LCMV Beamformer for a Uniform Linear Array,
SPLetters(23), No. 2, February 2016, pp. 272-276.
IEEE DOI 1602
Fourier transforms BibRef

Cherkassky, D., Gannot, S.,
New Insights into the Kalman Filter Beamformer: Applications to Speech and Robustness,
SPLetters(23), No. 3, March 2016, pp. 376-380.
IEEE DOI 1603
Kalman filters BibRef

Chung, H., Plourde, E., Champagne, B.,
Discriminative Training of NMF Model Based on Class Probabilities for Speech Enhancement,
SPLetters(23), No. 4, April 2016, pp. 502-506.
IEEE DOI 1604
Convergence BibRef

Helmrich, C.R., Edler, B.,
Audio Coding Using Overlap and Kernel Adaptation,
SPLetters(23), No. 5, May 2016, pp. 590-594.
IEEE DOI 1604
audio coding BibRef

Eyben, F., Scherer, K.R., Schuller, B.W., Sundberg, J., André, E., Busso, C., Devillers, L.Y., Epps, J., Laukka, P., Narayanan, S.S., Truong, K.P.,
The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing,
AffCom(7), No. 2, April 2016, pp. 190-202.
IEEE DOI 1606
Frequency measurement BibRef

Wang, J., Shang, Y., Jiang, S., Gowda, D., Lv, K.,
Whispered Speech Detection Using Fusion of Group-Delay-Based Subband Modulation Spectrum and Correntropy Features,
SPLetters(23), No. 8, August 2016, pp. 1042-1046.
IEEE DOI 1608
entropy BibRef

Wang, S.S., Chern, A., Tsao, Y., Hung, J.W., Lu, X., Lai, Y.H., Su, B.,
Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization,
SPLetters(23), No. 8, August 2016, pp. 1101-1105.
IEEE DOI 1608
Fourier transforms BibRef

López-Oller, D., Gomez, A.M., Pérez-Córdoba, J.L., Sánchez, V.,
An Error Mitigation Technique for Erasure Channels Based on a Wavelet Representation of the Speech Excitation Signal,
MultMed(18), No. 7, July 2016, pp. 1245-1256.
IEEE DOI 1608
Haar transforms BibRef

Strasser, F., Puder, H.,
Correlation Detection for Adaptive Feedback Cancellation in Hearing Aids,
SPLetters(23), No. 7, July 2016, pp. 979-983.
IEEE DOI 1608
Acoustics BibRef

Park, J., Jin, Y.G., Hwang, S., Shin, J.W.,
Dual Microphone Voice Activity Detection Exploiting Interchannel Time and Level Differences,
SPLetters(23), No. 10, October 2016, pp. 1335-1339.
IEEE DOI 1610
acoustic signal detection BibRef

Petkov, P.N., Stylianou, Y.,
Adaptive Gain Control for Enhanced Speech Intelligibility Under Reverberation,
SPLetters(23), No. 10, October 2016, pp. 1434-1438.
IEEE DOI 1610
adaptive control BibRef

Kobayashi, K.[Kazuhiro], Toda, T.[Tomoki], Nakano, T.[Tomoyasu], Goto, M.[Masataka], Nakamura, S.[Satoshi],
Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion,
IEICE(E99-D), No. 11, November 2016, pp. 2767-2777.
WWW Link. 1611
BibRef

Wang, Y., Zhao, S., Li, J., Kuang, J.,
Speech Bandwidth Extension Using Recurrent Temporal Restricted Boltzmann Machines,
SPLetters(23), No. 12, December 2016, pp. 1877-1881.
IEEE DOI 1612
Boltzmann machines BibRef

Prathosh, A.P., Sujith, P., Ramakrishnan, A.G., Kumar Ghosh, P.,
Cumulative Impulse Strength for Epoch Extraction,
SPLetters(23), No. 4, April 2016, pp. 424-428.
IEEE DOI 1604
speech processing BibRef

Vignolo, L.D.[Leandro D.], Prasanna, S.R.M.[S.R. Mahadeva], Dandapat, S.[Samarendra], Rufiner, H.L.[H. Leonardo], Milone, D.H.[Diego H.],
Feature optimisation for stress recognition in speech,
PRL(84), No. 1, 2016, pp. 1-7.
Elsevier DOI 1612
Evolutionary algorithms BibRef

Sun, P., Qin, J.,
Low-Rank and Sparsity Analysis Applied to Speech Enhancement Via Online Estimated Dictionary,
SPLetters(23), No. 12, December 2016, pp. 1862-1866.
IEEE DOI 1612
expectation-maximisation algorithm BibRef

Jukic, A., van Waterschoot, T., Doclo, S.,
Adaptive Speech Dereverberation Using Constrained Sparse Multichannel Linear Prediction,
SPLetters(24), No. 1, January 2017, pp. 101-105.
IEEE DOI 1702
minimisation BibRef

Jiao, Y., Berisha, V., Liss, J., Hsu, S.C., Levy, E., McAuliffe, M.,
Articulation Entropy: An Unsupervised Measure of Articulatory Precision,
SPLetters(24), No. 4, April 2017, pp. 485-489.
IEEE DOI 1704
Acoustic measurements BibRef

Airaksinen, M., Bollepalli, B., Pohjalainen, J., Alku, P.,
Glottal Vocoding With Frequency-Warped Time-Weighted Linear Prediction,
SPLetters(24), No. 4, April 2017, pp. 446-450.
IEEE DOI 1704
speech coding BibRef

Chetupalli, S.R., Sreenivas, T.V.,
Joint Bayesian Estimation of Time-Varying LP Parameters and Excitation for Speech,
SPLetters(24), No. 4, April 2017, pp. 357-361.
IEEE DOI 1704
Gaussian processes BibRef

Chollet, M., Scherer, S.,
Assessing Public Speaking Ability from Thin Slices of Behavior,
FG17(310-316).
IEEE DOI 1707
Feature extraction, Interviews, Public speaking, Speech, Training, Videos, Visualization BibRef

de-la-Calle-Silos, F., Stern, R.M.,
Synchrony-Based Feature Extraction for Robust Automatic Speech Recognition,
SPLetters(24), No. 8, August 2017, pp. 1158-1162.
IEEE DOI 1708
feature extraction, speech recognition, auditory-nerve activity, feature extraction schemes, generalized synchrony detector, robust automatic speech recognition BibRef

Zhang, Q., Chen, Z., Yin, F.,
Speaker Tracking Based on Distributed Particle Filter in Distributed Microphone Networks,
SMCS(47), No. 9, September 2017, pp. 2433-2443.
IEEE DOI 1708
Bayes methods, Cybernetics, Estimation, Kalman filters, Microphones, Particle filters, Reverberation, Average consensus filter, distributed microphone networks, distributed particle filter (DPF), multiple-hypothesis model, speaker tracking. BibRef

Ávila, F.R., Tcheou, M.P., Biscainho, L.W.P.,
Audio Soft Declipping Based on Constrained Weighted Least Squares,
SPLetters(24), No. 9, September 2017, pp. 1348-1352.
IEEE DOI 1708
Cost function, Discrete cosine transforms, Frequency-domain analysis, Nonlinear distortion, Predistortion, Speech, Audio declipping, nonlinear signal processing, sparsity, weighted least squares (WLS) BibRef

Huang, Z.[Zhen], Siniscalchi, S.M.[Sabato Marco], Lee, C.H.[Chin-Hui],
Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation,
PRL(98), No. 1, 2017, pp. 1-7.
Elsevier DOI 1710
System combination BibRef

Reddy, C.K.A.[C. Karadagur Ananda], Shankar, N., Bhat, G.S.[G. Shreedhar], Charan, R., Panahi, I.,
An Individualized Super-Gaussian Single Microphone Speech Enhancement for Hearing Aid Users With Smartphone as an Assistive Device,
SPLetters(24), No. 11, November 2017, pp. 1601-1605.
IEEE DOI 1710
hearing aids, maximum likelihood estimation, signal denoising BibRef

Nishimura, R.[Ryouichi], Enomoto, S.[Seigo], Kato, H.[Hiroaki],
Speech Privacy for Sound Surveillance Using Super-Resolution Based on Maximum Likelihood and Bayesian Linear Regression,
IEICE(E101-D), No. 1, January 2018, pp. 53-63.
WWW Link. 1801
BibRef

van Kuyk, S., Kleijn, W.B., Hendriks, R.C.,
An Instrumental Intelligibility Metric Based on Information Theory,
SPLetters(25), No. 1, January 2018, pp. 115-119.
IEEE DOI 1801
information theory, speech enhancement, speech intelligibility, SIIB, information theoretic intelligibility metrics, mutual information BibRef

Chee, K.Y.[Kong-Yik], Jin, Z.[Zhe], Cai, D.[Danwei], Li, M.[Ming], Yap, W.S.[Wun-She], Lai, Y.L.[Yen-Lung], Goi, B.M.[Bok-Min],
Cancellable speech template via random binary orthogonal matrices projection hashing,
PR(76), No. 1, 2018, pp. 273-287.
Elsevier DOI 1801
Cancellable biometrics BibRef

Zão, L., Coelho, R.,
On the Estimation of Fundamental Frequency From Nonstationary Noisy Speech Signals Based on the Hilbert-Huang Transform,
SPLetters(25), No. 2, February 2018, pp. 248-252.
IEEE DOI 1802
Hilbert transforms, speech enhancement, HHT-Amp, Hilbert-Huang transform, decomposition modes, nonstationary acoustic noises BibRef

Bernardini, A., Antonacci, F., Sarti, A.,
Wave Digital Implementation of Robust First-Order Differential Microphone Arrays,
SPLetters(25), No. 2, February 2018, pp. 253-257.
IEEE DOI 1802
acoustic signal processing, array signal processing, delays, microphone arrays, multiplying circuits, time-domain analysis, wave digital filters (WDFs) BibRef

Liu, Q., Wang, W., de Campos, T.E., Jackson, P.J.B., Hilton, A.,
Multiple Speaker Tracking in Spatial Audio via PHD Filtering and Depth-Audio Fusion,
MultMed(20), No. 7, July 2018, pp. 1767-1780.
IEEE DOI 1806
Azimuth, Clutter, Metadata, Microphones, Target tracking, Trajectory, Multi-person tracking, spatial audio BibRef

Lee, J., Skoglund, J., Shabestary, T., Kang, H.,
Phase-Sensitive Joint Learning Algorithms for Deep Learning-Based Speech Enhancement,
SPLetters(25), No. 8, August 2018, pp. 1276-1280.
IEEE DOI 1808
learning (artificial intelligence), speech enhancement, time-frequency analysis, single-channel speech enhancement BibRef

Lu, R., Duan, Z., Zhang, C.,
Listen and Look: Audio-Visual Matching Assisted Speech Source Separation,
SPLetters(25), No. 9, September 2018, pp. 1315-1319.
IEEE DOI 1809
image matching, source separation, speech processing, speaker-independent speech source separation BibRef

Martín-Doñas, J.M., Gomez, A.M., Gonzalez, J.A., Peinado, A.M.,
A Deep Learning Loss Function Based on the Perceptual Evaluation of the Speech Quality,
SPLetters(25), No. 11, November 2018, pp. 1680-1684.
IEEE DOI 1811
distortion, learning (artificial intelligence), mean square error methods, neural nets, speech enhancement, DNN BibRef

Wu, K.B.[Ke-Bin], Zhang, D.[David], Lu, G.M.[Guang-Ming], Guo, Z.H.[Zhen-Hua],
Joint learning for voice based disease detection,
PR(87), 2019, pp. 130-139.
Elsevier DOI 1812
Joint learning, Ridge regression, Low-rank regression, ε-dragging technique, Voice based pathology detection BibRef

Kumar, R.K.[R. Kishore], Birla, L.[Lokendra], Rao, K.S.[K. Sreenivasa],
A robust unsupervised pattern discovery and clustering of speech signals,
PRL(116), 2018, pp. 254-261.
Elsevier DOI 1812
Speech processing, Unsupervised pattern discovery, Clustering of speech utterances BibRef

Kim, G., Lee, H., Kim, B., Oh, S., Lee, S.,
Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition,
SPLetters(26), No. 1, January 2019, pp. 159-163.
IEEE DOI 1901
learning (artificial intelligence), natural language processing, neural nets, signal classification, generative adversarial network BibRef

Lei, P.[Peng], Chen, M.L.[Mei-Ling], Wang, J.[Jun],
Speech enhancement for in-vehicle voice control systems using wavelet analysis and blind source separation,
IET-ITS(13), No. 4, April 2019, pp. 693-702.
DOI Link 1903
BibRef

Ram, R.[Rashmirekha], Mohanty, M.N.[Mihir Narayan],
Use of radial basis function network with discrete wavelet transform for speech enhancement,
IJCVR(9), No. 2, 2019, pp. 207-223.
DOI Link 1904
BibRef

Gong, C.[Chen], Yi, X.W.[Xiao-Wei], Zhao, X.F.[Xian-Feng],
Pitch Delay Based Adaptive Steganography for AMR Speech Stream,
IWDW18(275-289).
Springer DOI 1905
BibRef

Kim, J., Hahn, M.,
Speech Enhancement Using a Two-Stage Network for an Efficient Boosting Strategy,
SPLetters(26), No. 5, May 2019, pp. 770-774.
IEEE DOI 1905
adaptive filters, computational complexity, learning (artificial intelligence), neural net architecture, two-stage network BibRef

Skovranek, T., Despotovic, V., Peric, Z.,
Optimal Fractional Linear Prediction With Restricted Memory,
SPLetters(26), No. 5, May 2019, pp. 760-764.
IEEE DOI 1905
approximation theory, frequency-domain analysis, least squares approximations, optimisation, prediction theory, speech processing BibRef

Zhang, J., Koutrouvelis, A.I., Heusdens, R., Hendriks, R.C.,
Distributed Rate-Constrained LCMV Beamforming,
SPLetters(26), No. 5, May 2019, pp. 675-679.
IEEE DOI 1905
acoustic communication (telecommunication), array signal processing, correlation methods, acoustic sensor networks BibRef

Nakatani, T., Kinoshita, K.,
A Unified Convolutional Beamformer for Simultaneous Denoising and Dereverberation,
SPLetters(26), No. 6, June 2019, pp. 903-907.
IEEE DOI 1906
array signal processing, signal denoising, speech enhancement, speech recognition, simultaneous denoising, robust speech recognition BibRef

Li, X., Leglaive, S., Girin, L., Horaud, R.,
Audio-Noise Power Spectral Density Estimation Using Long Short-Term Memory,
SPLetters(26), No. 6, June 2019, pp. 918-922.
IEEE DOI 1906
audio signal processing, Fourier transforms, learning (artificial intelligence), speech enhancement BibRef

Keerthana, Y.M., Reddy, M.K., Rao, K.S.,
CWT-Based Approach for Epoch Extraction From Telephone Quality Speech,
SPLetters(26), No. 8, August 2019, pp. 1107-1111.
IEEE DOI 1908
speech processing, telephone sets, wavelet transforms, vocal tract system, clean speech signals, Hilbert transform BibRef

Gurugubelli, K., Vuppala, A.K.,
Stable Implementation of Zero Frequency Filtering of Speech Signals for Efficient Epoch Extraction,
SPLetters(26), No. 9, September 2019, pp. 1310-1314.
IEEE DOI 1909
feature extraction, filtering theory, resonator filters, speech processing, identification accuracy, false alarm rate, zero phase BibRef

Deb, S., Dandapat, S.,
Emotion Classification Using Segmentation of Vowel-Like and Non-Vowel-Like Regions,
AffCom(10), No. 3, July 2019, pp. 360-373.
IEEE DOI 1909
Speech, Feature extraction, Switches, Speech recognition, Mel frequency cepstral coefficient, Speech processing, binary-cascade multi-class classification BibRef

Kotropoulos, C.L.[Constantine L.],
Source phone identification using sketches of features,
IET-Bio(3), No. 2, June 2014, pp. 75-83.
DOI Link 1407
Speech based. BibRef

Rajan, V., Brutti, A., Cavallaro, A.,
ConflictNET: End-to-End Learning for Speech-Based Conflict Intensity Estimation,
SPLetters(26), No. 11, November 2019, pp. 1668-1672.
IEEE DOI 1911
Estimation, Feature extraction, Convolution, Metadata, Support vector machines, convolutional-recurrent network BibRef

Lotfian, R., Busso, C.,
Building Naturalistic Emotionally Balanced Speech Corpus by Retrieving Emotional Speech from Existing Podcast Recordings,
AffCom(10), No. 4, October 2019, pp. 471-483.
IEEE DOI 1912
Information retrieval, Speech recognition, Digital audio broadcasting, Speech processing, emotion ranking BibRef

Lee, Y., Min, J., Han, D.K., Ko, H.,
Spectro-Temporal Attention-Based Voice Activity Detection,
SPLetters(27), 2020, pp. 131-135.
IEEE DOI 2001
Deep neural networks, attention mechanism, voice activity detection, speech activity detection, speech detection BibRef

Fu, S., Liao, C., Tsao, Y.,
Learning With Learned Loss Function: Speech Enhancement With Quality-Net to Improve Perceptual Evaluation of Speech Quality,
SPLetters(27), 2020, pp. 26-30.
IEEE DOI 2001
Perception optimization, PESQ, speech enhancement, speech quality assessment BibRef

Wu, J., Yu, C., Fu, S., Liu, C., Chien, S., Tsao, Y.,
Increasing Compactness of Deep Learning Based Speech Enhancement Models With Parameter Pruning and Quantization Techniques,
SPLetters(26), No. 12, December 2019, pp. 1887-1891.
IEEE DOI 2001
distributed processing, learning (artificial intelligence), neural nets, quantisation (signal), signal denoising, Low Computational Cost BibRef

Lim, H., Kim, Y., Goo, J., Kim, H.,
Interlayer Selective Attention Network for Robust Personalized Wake-Up Word Detection,
SPLetters(27), 2020, pp. 126-130.
IEEE DOI 2001
Interlayer selective attention network (ISAN), acoustic word embedding BibRef

Yang, H., Yang, Z., Bao, Y., Liu, S., Huang, Y.,
Fast Steganalysis Method for VoIP Streams,
SPLetters(27), 2020, pp. 286-290.
IEEE DOI 2003
Speech steganography, speech steganalysis, code-word correlation BibRef

Zhang, L.W.[Li-Wen], Shi, Z.Q.[Zi-Qiang], Han, J.Q.[Ji-Qing], Shi, A.[Anyan], Ma, D.[Ding],
FurcaNeXt: End-to-end Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks,
MMMod20(I:653-665).
Springer DOI 2003
BibRef

Sun, Z.B.[Zhong-Bo], Wang, Y.N.[Yan-Nan], Cao, L.[Li],
An Attention Based Speaker-independent Audio-visual Deep Learning Model for Speech Enhancement,
MMMod20(II:722-728).
Springer DOI 2003
BibRef

Lin, X., Zhu, J., Chen, D.,
Subband Aware CNN for Cell-Phone Recognition,
SPLetters(27), 2020, pp. 605-609.
IEEE DOI 2005
Microphones, Training, Spectrogram, Audio recording, Task analysis, Fingerprint recognition, Noise measurement, attention mechanism BibRef

Tagliasacchi, M., Gfeller, B., Quitry, F.d.C., Roblek, D.,
Pre-Training Audio Representations With Self-Supervision,
SPLetters(27), 2020, pp. 600-604.
IEEE DOI 2005
Task analysis, Decoding, Training, Spectrogram, Predictive models, Time-frequency analysis, audio processing BibRef

Yatabe, K.,
Consistent ICA: Determined BSS Meets Spectrogram Consistency,
SPLetters(27), 2020, pp. 870-874.
IEEE DOI 2006
Spectrogram, Time-frequency analysis, Time-domain analysis, Matrix converters, Blind source separation, Smoothing methods, short-time Fourier transform BibRef

Yu, C., Hung, K., Wang, S., Tsao, Y., Hung, J.,
Time-Domain Multi-Modal Bone/Air Conducted Speech Enhancement,
SPLetters(27), 2020, pp. 1035-1039.
IEEE DOI 2007
Noise measurement, Speech enhancement, Task analysis, Training, Time-domain analysis, Convolution, Multi-modal, fusion strategy BibRef

Muralishankar, R., Ghosh, D., Gurugopinath, S.,
A Novel Modified Mel-DCT Filter Bank Structure With Application to Voice Activity Detection,
SPLetters(27), 2020, pp. 1240-1244.
IEEE DOI 2007
Frequency domain long-term differential entropy, Mel-DCT, Mel-frequency, modified Mel-DCT, voice activity detection BibRef

Jiang, F., Duan, Z.,
Speaker Attractor Network: Generalizing Speech Separation to Unseen Numbers of Sources,
SPLetters(27), 2020, pp. 1859-1863.
IEEE DOI 2011
Training, Decoding, Convolution, Spectrogram, Estimation, Speech processing, Testing, Speech separation, speaker attractor BibRef

Kim, J., Lee, Y., Kim, E.,
Accelerating RNN Transducer Inference via Adaptive Expansion Search,
SPLetters(27), 2020, pp. 2019-2023.
IEEE DOI 2012
Decoding, Speech recognition, Acoustic beams, Acceleration, Acoustics, Speech processing, Indexes, Beam search, RNN transducer BibRef

Hsieh, T.A., Wang, H.M., Lu, X., Tsao, Y.,
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement,
SPLetters(27), 2020, pp. 2149-2153.
IEEE DOI 2012
Speech enhancement, Feature extraction, Task analysis, Noise reduction, Convolution, Noise measurement, Training, simple recurrent unit BibRef

Janbakhshi, P., Kodrasi, I., Bourlard, H.,
Subspace-Based Learning for Automatic Dysarthric Speech Detection,
SPLetters(28), 2021, pp. 96-100.
IEEE DOI 2101
Voice activity detection, Feature extraction, Manifolds, Databases, Acoustics, Pathology, Kernel, Spectral subspace, temporal subspace, SVD BibRef

Siniscalchi, S.M.,
Vector-to-Vector Regression via Distributional Loss for Speech Enhancement,
SPLetters(28), 2021, pp. 254-258.
IEEE DOI 2102
Noise measurement, Speech enhancement, Predictive models, Data models, Training, Histograms, Linear programming BibRef

Cui, Z.[Zihao], Bao, C.C.[Chang-Chun],
Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement,
SPLetters(28), 2021, pp. 618-622.
IEEE DOI 2104
Speech enhancement, Noise measurement, Linear programming, Training, Databases, Indexes, Time-frequency analysis BibRef

Witkowski, M.[Marcin], Kowalczyk, K.[Konrad],
Split Bregman Approach to Linear Prediction Based Dereverberation With Enforced Speech Sparsity,
SPLetters(28), 2021, pp. 942-946.
IEEE DOI 2106
Microphones, Reverberation, Cost function, Time-frequency analysis, Standards, Speech enhancement, Mathematical model, speech sparsity BibRef

Gimeno, P.[Pablo], Mingote, V.[Victoria], Ortega, A.[Alfonso], Miguel, A.[Antonio], Lleida, E.[Eduardo],
Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data,
SPLetters(28), 2021, pp. 1135-1139.
IEEE DOI 2106
Measurement, Training, Task analysis, Optimization, Training data, Multiple signal classification, Deep learning, multiclass AUC optimisation BibRef

Queiroz, A., Coelho, R.,
F0-Based Gammatone Filtering for Intelligibility Gain of Acoustic Noisy Signals,
SPLetters(28), 2021, pp. 1225-1229.
IEEE DOI 2106
Noise measurement, Harmonic analysis, Estimation, Speech processing, Power harmonic filters, Signal to noise ratio, intelligibility improvement BibRef

Pan, N.N.[Ning-Ning], Wang, Y.Z.[Yu-Zhu], Chen, J.D.[Jing-Dong], Benesty, J.[Jacob],
A Single-Input/Binaural-Output Antiphasic Speech Enhancement Method for Speech Intelligibility Improvement,
SPLetters(28), 2021, pp. 1445-1449.
IEEE DOI 2108
Convolution, Rendering (computer graphics), Ear, Speech enhancement, Training, Noise measurement, Decoding, intelligibility BibRef

Vrbík, D.[Daniel], Lábus, V.[Václav],
Crowdsourcing of Popular Toponyms: How to Collect and Preserve Toponyms in Spoken Use,
IJGI(10), No. 5, 2021, pp. xx-yy.
DOI Link 2106
BibRef

Xiang, X.X.[Xiao-Xiao], Zhang, X.J.[Xiao-Juan], Chen, H.Z.[Hao-Zhe],
A Convolutional Network With Multi-Scale and Attention Mechanisms for End-to-End Single-Channel Speech Enhancement,
SPLetters(28), 2021, pp. 1455-1459.
IEEE DOI 2108
Convolution, Speech enhancement, Noise measurement, Decoding, Training, Feature extraction, Time-domain analysis, dense connectivity BibRef

Ikeshita, R.[Rintaro], Kinoshita, K.[Keisuke], Kamo, N.[Naoyuki], Nakatani, T.[Tomohiro],
Online Speech Dereverberation Using Mixture of Multichannel Linear Prediction Models,
SPLetters(28), 2021, pp. 1580-1584.
IEEE DOI 2108
Switches, Time-frequency analysis, Reverberation, Signal processing algorithms, Optimization, Additive noise, sparsity BibRef

Jiang, Y.[Yuechi], Leung, F.H.F.[Frank H. F.],
Vector-Based Feature Representations for Speech Signals: From Supervector to Latent Vector,
MultMed(23), 2021, pp. 2641-2655.
IEEE DOI 2109
Acoustics, Probabilistic logic, Computational modeling, Adaptation models, Computational efficiency, Task analysis, vector-based feature representation BibRef

Xiang, X.X.[Xiao-Xiao], Zhang, X.J.[Xiao-Juan], Chen, H.Z.[Hao-Zhe],
Two-Stage Learning and Fusion Network With Noise Aware for Time-Domain Monaural Speech Enhancement,
SPLetters(28), 2021, pp. 1754-1758.
IEEE DOI 2109
Convolution, Decoding, Logic gates, Speech enhancement, Training, Noise measurement, Signal to noise ratio, dilated dense block BibRef

Esmaeilpour, M.[Mohammad], Cardinal, P.[Patrick], Koerich, A.L.[Alessandro Lameiras],
Cyclic Defense GAN Against Speech Adversarial Attacks,
SPLetters(28), 2021, pp. 1769-1773.
IEEE DOI 2109
Spectrogram, Discrete wavelet transforms, Generative adversarial networks, Generators, adversarial defense BibRef

Li, G.[Gang], Wang, X.C.[Xiao-Chen], Hu, R.M.[Rui-Min], Zhang, H.Y.[Hu-Yin], Ke, S.F.[Shan-Fa],
Intelligibility Enhancement Via Normal-to-Lombard Speech Conversion With Long Short-Term Memory Network and Bayesian Gaussian Mixture Model,
MultMed(23), 2021, pp. 3035-3047.
IEEE DOI 2109
Vocoders, Speech enhancement, Working environment noise, Real-time systems, Delays, speech conversion BibRef

Kodrasi, I.[Ina],
Temporal Envelope and Fine Structure Cues for Dysarthric Speech Detection Using CNNs,
SPLetters(28), 2021, pp. 1853-1857.
IEEE DOI 2109
Voice activity detection, Indexes, Convolutional neural networks, Phonetics, Databases, Band-pass filters, convolutional neural network BibRef

Ikeshita, R., Kamo, N., Nakatani, T.,
Blind Signal Dereverberation Based on Mixture of Weighted Prediction Error Models,
SPLetters(28), 2021, pp. 399-403.
IEEE DOI 2103
Reverberation, Finite impulse response filters, Switches, Time-frequency analysis, Speech recognition, Estimation, microphone array BibRef

Liu, Z.T.[Zhen-Tao], Rehman, A.[Abdul], Wu, M.[Min], Cao, W.H.[Wei-Hua], Hao, M.[Man],
Speech Personality Recognition Based on Annotation Classification Using Log-Likelihood Distance and Extraction of Essential Audio Features,
MultMed(23), 2021, pp. 3414-3426.
IEEE DOI 2109
Feature extraction, Speech recognition, Reliability, Training, Emotion recognition, Human computer interaction, Task analysis, annotation clustering BibRef

Cheng, L.[Longbiao], Li, J.F.[Jun-Feng], Yan, Y.H.[Yong-Hong],
FSCNet: Feature-Specific Convolution Neural Network for Real-Time Speech Enhancement,
SPLetters(28), 2021, pp. 1958-1962.
IEEE DOI 2110
Convolution, Speech enhancement, Power capacitors, Kernel, Feature extraction, Time-frequency analysis, Decoding BibRef

Tai, W.X.[Wen-Xin], Lan, T.[Tian], Wang, Q.H.[Qian-Hui], Liu, Q.[Qiao],
IDANet: An Information Distillation and Aggregation Network for Speech Enhancement,
SPLetters(28), 2021, pp. 1998-2002.
IEEE DOI 2110
Convolution, Feature extraction, Speech enhancement, Noise measurement, Decoding, Encoding, Training, deformable convolution BibRef

Wang, Z.Q.[Zhong-Qiu], Wichern, G.[Gordon], Le Roux, J.[Jonathan],
On the Compensation Between Magnitude and Phase in Speech Separation,
SPLetters(28), 2021, pp. 2018-2022.
IEEE DOI 2110
Time-domain analysis, Measurement, Training, Speech enhancement, Spectrogram, Signal to noise ratio, Task analysis, deep learning BibRef

Kim, H.Y.[Hyung Yong], Yoon, J.W.[Ji Won], Cho, W.I.[Won Ik], Kim, N.S.[Nam Soo],
Neurally Optimized Decoder for Low Bitrate Speech Codec,
SPLetters(29), 2022, pp. 244-248.
IEEE DOI 2202
Decoding, Speech coding, Speech codecs, Bit rate, Encoding, Convolution, Knowledge engineering, attention mechanism BibRef

Xiang, X.X.[Xiao-Xiao], Zhang, X.J.[Xiao-Juan], Chen, H.Z.[Hao-Zhe],
A Nested U-Net With Self-Attention and Dense Connectivity for Monaural Speech Enhancement,
SPLetters(29), 2022, pp. 105-109.
IEEE DOI 2202
Convolution, Speech enhancement, Feature extraction, Decoding, Time-domain analysis, Signal to noise ratio, Sensors, time-domain BibRef

Cohen, E.[Eyal], Kreuk, F.[Felix], Keshet, J.[Joseph],
Speech Time-Scale Modification With GANs,
SPLetters(29), 2022, pp. 1067-1071.
IEEE DOI 2205
Spectrogram, Generators, Signal processing algorithms, Decoding, Training, Vocoders, Time-domain analysis, Deep neural networks, time-scale modification BibRef

Wang, Z.Q.[Zhong-Qiu], Watanabe, S.[Shinji],
Improving Frame-Online Neural Speech Enhancement With Overlapped-Frame Prediction,
SPLetters(29), 2022, pp. 1422-1426.
IEEE DOI 2207
Prediction algorithms, Speech enhancement, Discrete Fourier transforms, Spectrogram, Predictive models, online speech enhancement BibRef

Choi, J.[Jeonghwan], Chang, J.H.[Joon-Hyuk],
Supervised Learning Approach for Explicit Spatial Filtering of Speech,
SPLetters(29), 2022, pp. 1412-1416.
IEEE DOI 2207
Microphones, Reflection, Gain, Convolution, Filtering, Direction-of-arrival estimation, Training data, sound source localization BibRef

Fu, M.J.[Mei-Jun], Wang, X.M.[Xiao-Min], Wang, J.[Jun],
Polynomial-Decomposition-Based LPC for Formant Estimation,
SPLetters(29), 2022, pp. 1392-1396.
IEEE DOI 2207
LPC: linear prediction coding. Corporate acquisitions, Estimation, Signal processing algorithms, Prediction algorithms, Statistical analysis, division algorithm for polynomial BibRef

Kim, M.S.[Min Sik], Kim, H.S.[Hyung Soon],
Attentive Pooling-Based Weighted Sum of Spectral Decay Rates for Blind Estimation of Reverberation Time,
SPLetters(29), 2022, pp. 1639-1643.
IEEE DOI 2208
Reverberation, Estimation, Feature extraction, Speech processing, Training data, Training, Signal to noise ratio, reverberation time BibRef

Reddy, M.K.[Mittapalle Kiran], Keerthana, Y.M.[Yagnavajjula Madhu], Alku, P.[Paavo],
End-to-End Pathological Speech Detection Using Wavelet Scattering Network,
SPLetters(29), 2022, pp. 1863-1867.
IEEE DOI 2209
Wireless sensor networks, Scattering, Pathology, Feature extraction, Task analysis, Convolutional neural networks, MP3 compression BibRef

Kim, H.[Hansol], Kang, K.[Kyeongmuk], Shin, J.W.[Jong Won],
Factorized MVDR Deep Beamforming for Multi-Channel Speech Enhancement,
SPLetters(29), 2022, pp. 1898-1902.
IEEE DOI 2209
Speech enhancement, Estimation, Artificial neural networks, MISO communication, Array signal processing, Deep learning, factorized MVDR beamformer BibRef

Fras, M.[Mieszko], Kowalczyk, K.[Konrad],
Convolutional Weighted Parametric Multichannel Wiener Filter for Reverberant Source Separation,
SPLetters(29), 2022, pp. 1928-1932.
IEEE DOI 2209
Wiener filters, Convolution, Reverberation, Microphones, Distortion, Speech enhancement, Transfer functions, Source separation, Wiener filter BibRef

Hwang, S.[Soojoong], Lee, E.[Eunkyun], Jang, I.[Inseon], Shin, J.W.[Jong Won],
Alias-and-Separate: Wideband Speech Coding Using Sub-Nyquist Sampling and Speech Separation,
SPLetters(29), 2022, pp. 2003-2007.
IEEE DOI 2210
Speech coding, Bit rate, Wideband, Encoding, Narrowband, Decoding, Speech enhancement, Frequency aliasing, speech codec, audio codec, coded signal enhancement BibRef

Yadav, S.K.[Shekhar Kumar], George, N.V.[Nithin V.],
Sparse Distortionless Modal Beamforming for Spherical Microphone Arrays,
SPLetters(29), 2022, pp. 2068-2072.
IEEE DOI 2211
Array signal processing, Microphone arrays, Harmonic analysis, Acoustic distortion, Speech enhancement, Power harmonic filters, sparse priors BibRef

Lee, J.Y.[Jin-Young], Kang, H.G.[Hong-Goo],
Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement,
SPLetters(29), 2022, pp. 2188-2192.
IEEE DOI 2212
Convolution, Speech enhancement, Noise measurement, Estimation, Training, Time-frequency analysis, Kernel, two-stage network BibRef

Yu, R.X.[Run-Xiang], Zhao, Z.W.[Zi-Wei], Ye, Z.F.[Zhong-Fu],
PFRNet: Dual-Branch Progressive Fusion Rectification Network for Monaural Speech Enhancement,
SPLetters(29), 2022, pp. 2358-2362.
IEEE DOI 2212
Feature extraction, Transformers, Speech enhancement, Tensors, Convolution, Decoding, Time-frequency analysis, monaural speech enhancement BibRef

Karamatli, E.[Ertug], Kirbiz, S.[Serap],
MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training,
SPLetters(29), 2022, pp. 2637-2641.
IEEE DOI 2301
Training, Recording, Source separation, Time-domain analysis, Task analysis, Optimized production technology, unsupervised learning BibRef

McKinney, A.F.[Alex F.], Cauchi, B.[Benjamin],
Non-Intrusive Binaural Speech Intelligibility Prediction From Discrete Latent Representations,
SPLetters(29), 2022, pp. 987-991.
IEEE DOI 2205
Feature extraction, Training, Indexes, Speech processing, Speech coding, Speech recognition, Predictive models, self-supervised representation learning BibRef

Rosenbaum, T.[Tomer], Cohen, I.[Israel], Winebrand, E.[Emil], Gabso, O.[Ofri],
Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement,
PRL(166), 2023, pp. 159-163.
Elsevier DOI 2302
Speech enhancement, Mean opinion score, Speech quality assessment, Speech naturalness assessment BibRef

de Lacerda-Pataca, C.[Caluã], Costa, P.D.P.[Paula Dornhofer Paro],
Hidden Bawls, Whispers, and Yelps: Can Text Convey the Sound of Speech, Beyond Words?,
AffCom(14), No. 1, January 2023, pp. 6-16.
IEEE DOI 2303
Visualization, Acoustics, Speech recognition, Linguistics, Auditory system, Automobiles, Modulation, Affective computing, speech analysis BibRef

Lee, D.[Dongheon], Choi, J.W.[Jung-Woo],
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement,
SPLetters(30), 2023, pp. 155-159.
IEEE DOI 2303
Speech enhancement, Transformers, Noise measurement, Convolution, Time-frequency analysis, Time-domain analysis, transformer BibRef

Wang, T.T.[Ting-Ting], Pan, Z.[Zexu], Ge, M.[Meng], Yang, Z.[Zhen], Li, H.Z.[Hai-Zhou],
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary,
SPLetters(30), 2023, pp. 110-114.
IEEE DOI 2303
Time-domain analysis, Encoding, Speech recognition, Convolution, Speech enhancement, Signal to noise ratio, Transforms, graph neural networks BibRef

Chen, G.[Gang], Li, X.G.[Xiang-Ge], Xiao, S.Y.[Shuai-Yong], Zhang, C.H.[Cheng-Hong], Lu, X.H.[Xiang-Hua],
RACL: A robust adaptive contrastive learning method for conversational satisfaction prediction,
PR(138), 2023, pp. 109386.
Elsevier DOI 2303
BibRef

Cheng, J.M.[Jia-Ming], Liang, R.[Ruiyu], Zhao, L.[Li], Huang, C.W.[Cheng-Wei], Schuller, B.W.[Björn W.],
Speech Denoising and Compensation for Hearing Aids Using an FTCRN-Based Metric GAN,
SPLetters(30), 2023, pp. 374-378.
IEEE DOI 2305
Auditory system, Measurement, Generators, Noise reduction, Noise measurement, Training, Hearing aids, Hearing aid, metric generative adversarial network BibRef

Shu, Y.C.[Yu-Chun], Luo, H.N.[Hao-Neng], Zhang, S.L.[Shi-Liang], Wang, L.B.[Long-Biao], Dang, J.[Jianwu],
A CIF-Based Speech Segmentation Method for Streaming E2E ASR,
SPLetters(30), 2023, pp. 344-348.
IEEE DOI 2305
Acoustics, Decoding, Training, Semantics, Earth Observing System, Real-time systems, Convolution, Continuous integrate-and-fire, two-pass ASR BibRef

Zhou, Y.[Yi], Wu, Z.Z.[Zhi-Zheng], Zhang, M.Y.[Ming-Yang], Tian, X.H.[Xiao-Hai], Li, H.Z.[Hai-Zhou],
TTS-Guided Training for Accent Conversion Without Parallel Data,
SPLetters(30), 2023, pp. 533-537.
IEEE DOI 2305
Acoustics, Training, Decoding, Feature extraction, Data models, Phonetics, Error analysis, Accent conversion (AC), text-to-speech (TTS) BibRef

Koepke, A.S.[A. Sophia], Oncescu, A.M.[Andreea-Maria], Henriques, J.F.[João F.], Akata, Z.[Zeynep], Albanie, S.[Samuel],
Audio Retrieval With Natural Language Queries: A Benchmark Study,
MultMed(25), 2023, pp. 2675-2685.
IEEE DOI 2307
Task analysis, Benchmark testing, Natural languages, Visualization, Metadata, Grounding, Visual databases, Audio retrieval, datasets BibRef

Duan, Y.[Yicun], Ren, J.F.[Jian-Feng], Yu, H.[Heng], Jiang, X.D.[Xu-Dong],
GAN-in-GAN for Monaural Speech Enhancement,
SPLetters(30), 2023, pp. 853-857.
IEEE DOI 2307
Spectrogram, Generative adversarial networks, Training, Noise measurement, Generators, Noise reduction, Decoding, speech enhancement BibRef

Park, D.[Dongkeon], Yu, Y.[Yechan], Katabi, D.[Dina], Kim, H.K.[Hong Kook],
Adversarial Continual Learning to Transfer Self-Supervised Speech Representations for Voice Pathology Detection,
SPLetters(30), 2023, pp. 932-936.
IEEE DOI 2308
Task analysis, Pathology, Adaptation models, Feature extraction, Context modeling, Data models, Support vector machines, wav2vec 2.0 BibRef

Kim, H.[Hyeonseung], Shin, J.W.[Jong Won],
On Training Speech Separation Models With Various Numbers of Speakers,
SPLetters(30), 2023, pp. 1202-1206.
IEEE DOI 2310
BibRef

Ai, Y.[Yang], Lu, Y.X.[Ye-Xin], Ling, Z.H.[Zhen-Hua],
Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation,
SPLetters(30), 2023, pp. 1097-1101.
IEEE DOI 2310
BibRef

Xiong, J.W.[Jun-Wen], Zhou, Y.[Yu], Zhang, P.[Peng], Xie, L.[Lei], Huang, W.[Wei], Zha, Y.F.[Yu-Fei],
Look&listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement,
MultMed(25), 2023, pp. 5800-5812.
IEEE DOI 2311
BibRef

Joglekar, A.[Aditya], Hansen, J.H.L.[John H. L.],
DeepComboSAD: Spectro-Temporal Correlation Based Speech Activity Detection for Naturalistic Audio Streams,
SPLetters(30), 2023, pp. 1472-1476.
IEEE DOI 2311
BibRef

Cai, Y.Q.[Yun-Qi], Li, L.[Lantian], Abel, A.[Andrew], Zhu, X.Y.[Xiao-Yan], Wang, D.[Dong],
Maximum Gaussianality training for deep speaker vector normalization,
PR(145), 2024, pp. 109977.
Elsevier DOI 2311
Speaker embedding, Normalization flow, Gaussianality training BibRef

Liang, X.W.[Xing-Wei], Zhang, L.[Lu], Wu, Z.Y.[Zhi-Yong], Xu, R.F.[Rui-Feng],
Lite-RTSE: Exploring a Cost-Effective Lite DNN Model for Real-Time Speech Enhancement in RTC Scenarios,
SPLetters(30), 2023, pp. 1697-1701.
IEEE DOI 2312
BibRef

Raman, C.[Chirag], Prabhu, N.R.[Navin Raj], Hung, H.[Hayley],
Perceived Conversation Quality in Spontaneous Interactions,
AffCom(14), No. 4, October 2023, pp. 2901-2912.
IEEE DOI 2312
BibRef

Atito, S.[Sara], Awais, M.[Muhammed], Alex, T.[Tony], Kittler, J.V.[Josef V.],
Group Masked Model Learning for General Audio Representation,
ICIP23(2600-2604)
IEEE DOI 1806
BibRef

Yechuri, S.[Sivaramakrishna], Vanabathina, S.D.[Sunny Dayal],
Genetic Algorithm-Based Adaptive Wiener Gain for Speech Enhancement Using an Iterative Posterior NMF,
IJIG(23), No. 6, 2023, pp. 2350054.
DOI Link 2312
BibRef

Lee, H.[Harlin], Saeed, A.[Aaqib],
Distilled non-semantic speech embeddings with binary neural networks for low-resource devices,
PRL(177), 2024, pp. 15-19.
Elsevier DOI 2401
Speech representations, Knowledge distillation, Paralinguistic tasks, Binary neural networks, Digital health, Internet-of-things BibRef

Ye, L.X.[Ling-Xuan], Gao, C.F.[Chang-Feng], Cheng, G.F.[Gao-Feng], Luo, L.P.[Liu-Ping], Zhao, Q.W.[Qing-Wei],
ASQ: An Ultra-Low Bit Rate ASR-Oriented Speech Quantization Method,
SPLetters(31), 2024, pp. 221-225.
IEEE DOI 2401
BibRef

Li, C.T.[Chang-Tao], Yang, F.[Feiran], Yang, J.[Jun],
Restoration of Bone-Conducted Speech With U-Net-Like Model and Energy Distance Loss,
SPLetters(31), 2024, pp. 166-170.
IEEE DOI 2401
BibRef

O'Shaughnessy, D.[Douglas],
Speech Enhancement: A Review of Modern Methods,
HMS(54), No. 1, February 2024, pp. 110-120.
IEEE DOI 2402
Survey, Speech Enhancement. Acoustic distortion, Acoustics, Speech enhancement, Speech coding, Reverberation, Noise measurement, Microphones. BibRef

Xu, X.[Xinmeng],
Improving Monaural Speech Enhancement by Mapping to Fixed Simulation Space With Knowledge Distillation,
SPLetters(31), 2024, pp. 386-390.
IEEE DOI 2402
Feature extraction, Speech enhancement, Spectrogram, Noise measurement, Training, Recording, Convolution, knowledge distillation BibRef

Xiang, B.[Bajian], Mao, W.Y.[Wen-Yu], Tan, K.J.[Kai-Jun], Lu, H.X.[Hua-Xiang],
CAT-DUnet: Enhancing Speech Dereverberation via Feature Fusion and Structural Similarity Loss,
SPLetters(31), 2024, pp. 456-460.
IEEE DOI 2402
Spectrogram, Convolution, Time-frequency analysis, Measurement, Training, Feature extraction, Deep learning, Attention mechanism, speech dereverberation BibRef

Rababaah, A.R.[Aaron Rasheed],
Intelligent classification model for holy Quran recitation Maqams,
IJCVR(14), No. 2, 2024, pp. 170-190.
DOI Link 2403
BibRef


Wani, T.M.[Taiba Majid], Amerini, I.[Irene],
Deepfakes Audio Detection Leveraging Audio Spectrogram and Convolutional Neural Networks,
CIAP23(II:156-167).
Springer DOI 2312
BibRef

Choi, S.[Sunmook], Oh, S.[Seungsang], Yang, J.[Jonghoon], Lee, Y.[Yerin], Kwak, I.Y.[Il-Youp],
Light-weight Frequency Information Aware Neural Network Architecture for Voice Spoofing Detection,
ICPR22(477-483)
IEEE DOI 2212
Loudspeakers, Convolution, Error analysis, Virtual assistants, Neural networks, Feature extraction, Complexity theory BibRef

Li, X.[Xiao], Hu, X.[Xiao], Chen, X.[Xiao], Pan, H.[Hang], Niu, K.[Kun],
Deep Speaker Embedding Using Hybrid Network of Multi-Feature Aggregation and Multi-Loss Fusion for TI-SV,
ICPR22(506-512)
IEEE DOI 2212
Training, Adaptive systems, Fuses, Frequency-domain analysis, Aggregates, Feature extraction BibRef

Zhang, B.[Bowen], Sim, T.[Terence],
Localizing Fake Segments in Speech,
ICPR22(3224-3230)
IEEE DOI 2212
Location awareness, Cloning, Speech recognition, Detectors, Telephony, Benchmark testing, Feature extraction BibRef

Li, X.S.[Xin-Shu], Tan, Z.H.[Zhen-Hua], Xia, Z.C.[Zhen-Che], Wu, D.[Danke], Zhang, B.[Bin],
Single-Channel Speech Separation Focusing on Attention DE,
ICPR22(3204-3209)
IEEE DOI 2212
Training, Convolution, Particle separators, Focusing, Speech recognition, Speech enhancement, Feature extraction, SepFormer Block BibRef

Xu, X.M.[Xin-Meng], Hao, J.J.[Jian-Jun],
U-Former: Improving Monaural Speech Enhancement with Multi-head Self and Cross Attention,
ICPR22(663-669)
IEEE DOI 2212
Training, Time-frequency analysis, Target tracking, Neural networks, Speech recognition, Speech enhancement, multi-head cross-attention BibRef

Teng, Z.[Zhongwei], Fu, Q.[Quchen], White, J.[Jules], Powell, M.E.[Maria E.], Schmidt, D.C.[Douglas C.],
ARawNet: A Lightweight Solution for Leveraging Raw Waveforms in Spoof Speech Detection,
ICPR22(692-698)
IEEE DOI 2212
Voice activity detection, Representation learning, Backpropagation, Computational modeling, Speech recognition, spoof speech detection BibRef

Li, D.S.[Deng-Shi], Zhao, L.X.[Lan-Xin], Xiao, J.[Jing], Liu, J.Q.[Jia-Qi], Guan, D.Z.[Duan-Zheng], Wang, Q.R.[Qian-Rui],
Adaptive Speech Intelligibility Enhancement for Far-and-Near-end Noise Environments Based on Self-attention StarGAN,
MMMod22(II:205-217).
Springer DOI 2203
BibRef

Xiao, J.[Jing], Liu, J.Q.[Jia-Qi], Li, D.S.[Deng-Shi], Zhao, L.X.[Lan-Xin], Wang, Q.R.[Qian-Rui],
Speech Intelligibility Enhancement By Non-Parallel Speech Style Conversion Using CWT and iMetricGAN Based CycleGAN,
MMMod22(I:544-556).
Springer DOI 2203
BibRef

Hegde, S.B.[Sindhu B.], Prajwal, K.R., Mukhopadhyay, R.[Rudrabha], Namboodiri, V.[Vinay], Jawahar, C.V.,
Visual Speech Enhancement Without A Real Visual Stream,
WACV21(1925-1934)
IEEE DOI 2106
Visualization, Lips, Speech enhancement, Streaming media, Filtering algorithms, Information filters, Noise measurement BibRef

Stefanov, K.[Kalin], Adiban, M.[Mohammad], Salvi, G.[Giampiero],
Spatial Bias in Vision-Based Voice Activity Detection,
ICPR21(10433-10440)
IEEE DOI 2105
Voice activity detection, Performance evaluation, Visualization, Analytical models, Magnetic heads, Spatial databases, Data models, spatial bias BibRef

Wang, Y.,
Research Progress in Speech Enhancement Technology,
CVIDL20(222-226)
IEEE DOI 2102
neural nets, speech enhancement, abstract original speech signals, pure speech signals, deep learning BibRef

Dendani, B.[Bilal], Bahi, H.[Halima], Sari, T.[Toufik],
Speech Enhancement Based on Deep Autoencoder for Remote Arabic Speech Recognition,
ICISP20(221-229).
Springer DOI 2009
BibRef

Barros, F.[Fábio], Conde, Â.[Ângelo], Soares, S.C.[Sandra C.], Neves, A.J.R.[António J. R.], Silva, S.[Samuel],
Understanding Public Speakers' Performance: First Contributions to Support a Computational Approach,
ICIAR20(I:343-355).
Springer DOI 2007
BibRef

Coto-Jiménez, M.[Marvin],
Experimental Study on Transfer Learning in Denoising Autoencoders for Speech Enhancement,
MCPR20(307-317).
Springer DOI 2007
BibRef

Bílková, Z.[Zuzana], Novozámský, A.[Adam], Domínec, A.[Adam], Greško, Š.[Šimon], Zitová, B.[Barbara], Paroubková, M.[Markéta],
Automatic Evaluation of Speech Therapy Exercises Based on Image Data,
ICIAR19(I:397-404).
Springer DOI 1909
BibRef

Zhang, R.[Rui], Hu, R.[Ruimin], Li, G.[Gang], Wang, X.C.[Xiao-Chen],
Spectral Tilt Estimation for Speech Intelligibility Enhancement Using RNN Based on All-Pole Model,
MMMod19(II:144-156).
Springer DOI 1901
BibRef

Dai, J.J.[Jia-Jie], Dixon, S.[Simon],
Understanding Intonation Trajectories and Patterns of Vocal Notes,
MMMod19(II:243-253).
Springer DOI 1901
BibRef

Zheng, S., Wang, J., Xiao, J., Hsu, W., Glass, J.,
A Noise-Robust Self-Adaptive Multitarget Speaker Detection System,
ICPR18(1068-1072)
IEEE DOI 1812
Blacklisting, Feature extraction, Noise measurement, Detectors, Acoustics, Data models BibRef

Athanasopoulos, G., Hagihara, K., Cierro, A., Guérit, R., Chatelain, J., Lucas, C., Macq, B.,
3D immersive karaoke for the learning of foreign language pronunciation,
IC3D17(1-8)
IEEE DOI 1804
computer based training, data visualisation, linguistics, natural language processing, virtual reality, pronunciation training BibRef

Samui, S.[Suman], Chakrabarti, I.[Indrajit], Ghosh, S.K.[Soumya K.],
Improving the Performance of Deep Learning Based Speech Enhancement System Using Fuzzy Restricted Boltzmann Machine,
PReMI17(534-542).
Springer DOI 1711
BibRef

Serras, M.[Manex], Torres, M.I.[María Inés], del Pozo, A.[Arantza],
Online Learning of Attributed Bi-Automata for Dialogue Management in Spoken Dialogue Systems,
IbPRIA17(22-31).
Springer DOI 1706
BibRef

Nagpal, A.[Ankit], Patil, H.A.[Hemant A.],
Novel Gammatone Filterbank Based Spectro-Temporal Features for Robust Phoneme Recognition,
PReMI17(342-350).
Springer DOI 1711
BibRef

Grachev, A.M.[Artem M.], Ignatov, D.I.[Dmitry I.], Savchenko, A.V.[Andrey V.],
Neural Networks Compression for Language Modeling,
PReMI17(351-357).
Springer DOI 1711
BibRef

Zhang, L., Chen, J.X.[Jia-Xu], Luo, Y.[You], Fu, J.F.[Jia-Fei], Ye, Z.F.[Zhong-Fu],
Supervised single-channel speech dereverberation and denoising using a two-stage processing,
ICIVC17(818-822)
IEEE DOI 1708
Adaptive filters, Noise measurement, Speech, non-negative matrix factorization, room impulse response, speech dereverberation and denoising, two-stage processing BibRef

Bedoui, A., Ben Jebara, S.,
On the use of opening phase slopes of the glottal signal to characterize unilateral vocal folds paralysis,
ISIVC16(41-46)
IEEE DOI 1704
Estimation BibRef

Ben Ali, F., Djaziri-Larbi, S.,
A very low bit rate codec for wide band speech based on a long-term perceptual harmonic plus noise model,
ISIVC16(71-76)
IEEE DOI 1704
Bit rate BibRef

Ferreira, A.,
Implantation of voicing on whispered speech using frequency-domain parametric modelling of source and filter information,
ISIVC16(159-166)
IEEE DOI 1704
Estimation BibRef

Pozzebon, A.[Alessandro], Biliotti, F.[Francesca], Calamai, S.[Silvia],
Places Speaking with Their Own Voices. A Case Study from the Gra.fo Archives,
EuroMed16(II: 232-239).
Springer DOI 1611
BibRef

Vlaj, D., Kos, M., Kacic, Z.,
Quick and efficient definition of hangbefore and hangover criteria for voice activity detection,
WSSIP16(1-4)
IEEE DOI 1608
speech processing BibRef

Ballesteros L, D.M.[Dora M.], Renza, D.[Diego], Camacho, S.[Steven],
High Scrambling Degree in Audio Through Imitation of an Unintelligible Signal,
MCPR16(251-259).
Springer DOI 1608
BibRef

Onchis, D.M.[Darian M.], Real, P.[Pedro],
On Homotopy Continuation for Speech Restoration,
CTIC16(152-156).
Springer DOI 1608
BibRef

Dubey, M.L., Shultz, P.F., Kenyon, G.T.,
Learning phase-rich features from streaming auditory images,
Southwest16(73-76)
IEEE DOI 1605
Convolution BibRef

Montalvo, A.[Ana], Costa, Y.M.G.[Yandre M. G.], Calvo, J.R.[José Ramón],
Language Identification Using Spectrogram Texture,
CIARP15(543-550).
Springer DOI 1511
BibRef

Aizezi, Y.[Yasen], Jamal, A.[Anwar], Mamat, D.[Dilxat], Abdurexit, R.[Ruxianguli], Ubul, K.[Kurban],
Analytical Method and Research of Uyghur Language Chunks Based on Digital Forensics,
ISCA15(258-266).
Springer DOI 1511
BibRef

Hammami, N., Bedda, M., Farah, N., Mansouri, S.,
R-Letter disorder diagnosis (R-LDD): Arabic speech database development for automatic diagnosis of childhood speech disorders (Case study),
ISCV15(1-7)
IEEE DOI 1506
acoustic signal processing BibRef

Nakajima, J.[Jiro], Kimura, A.[Akisato], Sugimoto, A.[Akihiro], Kashino, K.[Kunio],
Visual Attention Driven by Auditory Cues,
MMMod15(II: 74-86).
Springer DOI 1501
BibRef

Ishikura, K.[Kazumasa], Uemura, A.[Aiko], Katto, J.[Jiro],
Live Version Identification with Audio Scene Detection,
MMMod15(I: 408-417).
Springer DOI 1501
BibRef

Xie, S.B.[Song-Bo], Yang, Y.H.[Yu-Hong], Hu, R.M.[Rui-Min], Wang, Y.Y.[Yan-Ye], Yu, H.J.[Hong-Jiang], Dong, S.L.[Shao-Long], Gao, L.[Li], Yang, C.[Cheng],
Signal-Aware Parametric Quality Model for Audio and Speech over IP Networks,
MMMod15(I: 487-497).
Springer DOI 1501
BibRef

Xue, L.[Like], Su, F.[Feng],
Auditory Scene Classification with Deep Belief Network,
MMMod15(I: 348-359).
Springer DOI 1501
BibRef

Tu, M.[Ming], Xie, X.[Xiang], Na, X.Y.[Xing-Yu],
Computational Auditory Scene Analysis Based Voice Activity Detection,
ICPR14(797-802)
IEEE DOI 1412
Feature extraction BibRef

Lu, T.[Tong], Weng, Y.B.[Yang-Bing], Wang, G.Y.[Gong-You],
Audiotory Movie Summarization by Detecting Scene Changes and Sound Events,
ICPR14(756-760)
IEEE DOI 1412
Awards activities BibRef

Nguyen-Son, H.Q.[Hoang-Quoc], Hoang, A.T.[Anh-Tu], Tran, M.T.[Minh-Triet], Yoshiura, H.[Hiroshi], Sonehara, N.[Noboru], Echizen, I.[Isao],
Anonymizing Temporal Phrases in Natural Language Text to be Posted on Social Networking Services,
IWDW13(437-451).
Springer DOI 1407
BibRef

Maka, T.[Tomasz], Dziurzanski, P.[Piotr],
Feature contours fusion for determining segment boundaries in audio data,
WSSIP14(111-114) 1406
Educational institutions BibRef

Souza, D.[Danilo], Saturnino, L.[Levi], Maciel, A.M.A.[Alexandre M.A.],
A portability evaluation of Brazilian Portuguese voices produced with MARY TTS,
WSSIP14(95-98) 1406
BibRef

Frid, A.[Alex], Lavner, Y.Z.[Yi-Zhar],
Spectral and textural features for automatic classification of fricatives using SVM,
WSSIP14(99-102) 1406
Auditory system BibRef

Savchenko, A.V.[Andrey V.],
Semi-automated Speaker Adaptation: How to Control the Quality of Adaptation?,
ICISP14(638-646).
Springer DOI 1406
BibRef

Merazka, F.[Fatiha],
Wideband Speech Encryption Based Arnold Cat Map for AMR-WB G.722.2 Codec,
ICISP14(658-664).
Springer DOI 1406
BibRef

Souli, S.[Sameh], Lachiri, Z.[Zied], Kuznietsov, A.[Alexander],
Using Three Reassigned Spectrogram Patches and Log-Gabor Filter for Audio Surveillance Application,
CIARP13(I:527-534).
Springer DOI 1311
BibRef

Joseph, S.M.[Shijo M.], Babu, A.P.[Anto P.],
Continuous speech coding using coiflets wavelet,
ICSIPR13(253-257).
IEEE DOI 1304
BibRef

Nivedita, D.[Deshpande], Kavita, T.[Thakur], Zadgaonkar, A.S.,
First degree heart block determination from speech analysis,
ICSIPR13(103-106).
IEEE DOI 1304
BibRef

Sadjadi, S.O., Hansen, J.H.L.,
Unsupervised Speech Activity Detection Using Voicing Measures and Perceptual Spectral Flux,
SPLetters(20), No. 3, March 2013, pp. 197-200.
IEEE DOI 1303
BibRef

Zhang, L.[Long], Li, H.F.[Hai-Feng], Ma, L.[Lin],
An adaptive unsupervised clustering of pronunciation errors for automatic pronunciation error detection,
ICPR12(1521-1525).
WWW Link. 1302
BibRef

Rosales-Pérez, A.[Alejandro], Reyes-García, C.A.[Carlos A.], Gonzalez, J.A.[Jesus A.], Arch-Tirado, E.[Emilio],
Infant Cry Classification Using Genetic Selection of a Fuzzy Model,
CIARP12(212-219).
Springer DOI 1209
BibRef

González, D.C.[Diana Cristina], Ling, L.L.[Lee Luan], Violaro, F.[Fábio],
Analysis of the Multifractal Nature of Speech Signals,
CIARP12(740-748).
Springer DOI 1209
BibRef

Tanveer, S.[Saad], Muhammad, A.[Aslam], Martinez-Enriquez, A.M., Escalada-Imaz, G.,
Phonetic Unification of Multiple Accents for Spanish and Arabic Languages,
MCPR12(323-333).
Springer DOI 1208
BibRef

Falek, L.[Leila], Teffahi, H.[Hocine], Djeradi, A.[Amar],
Methodology for Acoustic Characterization of a Labial Constraint in Speech Production,
ICISP12(131-141).
Springer DOI 1208
BibRef

Krum, D.M.[David M.], Suma, E.A.[Evan A.], Bolas, M.[Mark],
Spatial misregistration of virtual human audio: Implications of the precedence effect,
3DUI12(147-148).
IEEE DOI 1204
BibRef

Yang, Y.J.[Ying-Jie], Zhang, H.H.[Huan-Huan], Guo, X.[Xiue],
A pitch tracking method mixing ACF and AMDF algorithms based on correlations,
IASP11(553-556).
IEEE DOI 1112
autocorrelation functions; average magnitude difference functions. Speech BibRef

Guo, S.[Shuni], Gao, L.[Lu], Yu, H.Z.[Hong-Zhi],
Research on Lhasa Tibetan prosodic model of journalese based on respiratory signal,
IASP11(26-30).
IEEE DOI 1112
BibRef

Resmi, K., Kumar, S.[Satish], Sardana, H.K., Chhabra, R.[Radhika],
Graphical Speech Training system for hearing impaired,
ICIIP11(1-6).
IEEE DOI 1112
BibRef

Gómez, J.A.[Jon Ander], Calvo, M.[Marcos],
Improvements on Automatic Speech Segmentation at the Phonetic Level,
CIARP11(557-564).
Springer DOI 1111
BibRef

Le, P.N.[Phu Ngoc], Epps, J.[Julien], Choi, E.H.C.[Eric H.C.], Ambikairajah, E.[Eliathamby],
A Study of Voice Source and Vocal Tract Filter Based Features in Cognitive Load Classification,
ICPR10(4516-4519).
IEEE DOI 1008
BibRef

Stark, M.[Michael], Wohlmayr, M.[Michael], Pernkopf, F.[Franz],
Single Channel Speech Separation Using Source-Filter Representation,
ICPR10(826-829).
IEEE DOI 1008
BibRef

Stadelmann, T.[Thilo], Wang, Y.H.[Ying-Hui], Smith, M.[Matthew], Ewerth, R.[Ralph], Freisleben, B.[Bernd],
Rethinking Algorithm Design and Development in Speech Processing,
ICPR10(4476-4479).
IEEE DOI 1008
BibRef

Gonzalez-Caravaca, G.[Guillermo], Toledano, D.T.[Doroteo Torre], Puertas, M.[Maria],
Phone-Conditioned Suboptimal Wiener Filtering,
ICPR10(4480-4483).
IEEE DOI 1008
BibRef

Sepehr, H.[Hamid], Nooralahiyan, A.Y.[Amir Y.], Brennan, P.V.[Paul V.],
Improving Performance of a Noise Reduction Algorithm by Switching the Analysis Filter Bank,
ICISP10(262-271).
Springer DOI 1006
for speech BibRef

Kos, M., Grasic, M., Vlaj, D., Kacic, Z.,
On-Line Speech/Music Segmentation for Broadcast News Domain,
WSSIP09(1-4).
IEEE DOI 0906
BibRef

Grasic, M., Kos, M., Vlaj, D., Kacic, Z.,
The Influence of Speech/Non-Speech Segmentation on On-Line and Off-Line Speaker Segmentation Accuracy,
WSSIP09(1-4).
IEEE DOI 0906
BibRef

Zuta, V.[Vivien],
Voice Pleasantness of Female Voices and the Assessment of Physical Characteristics,
COST08(116-125).
Springer DOI 0810
BibRef

Pignotti, A.[Alessio], Marcozzi, D.[Daniele], Cifani, S.[Simone], Squartini, S.[Stefano], Piazza, F.[Francesco],
A Blind Source Separation Based Approach for Speech Enhancement in Noisy and Reverberant Environment,
COST08(356-367).
Springer DOI 0810
BibRef

Stadelmann, T., Heinzl, S., Unterberger, M., Freisleben, B.,
WebVoice: A Toolkit for Perceptual Insights into Speech Processing,
CISP09(1-5).
IEEE DOI 0910
BibRef

Tang, Y.B.[Yi-Bin], Huang, R.[Rong], Wu, Z.Y.[Zhen-Yang],
A 2.4kbps Multiband Characteristic Waveform Interpolation Speech Coding Algorithm,
CISP09(1-4).
IEEE DOI 0910
BibRef

Zou, X.[Xia], Zhang, X.W.[Xiong-Wei],
A 450bps Speech Coding Algorithm Based on Multi-Mode Matrix Quantization,
CISP09(1-3).
IEEE DOI 0910
BibRef

Kuhnapfel, T.[Thorsten], Tan, T.[Tele], Venkatesh, S.[Svetha], Igel, B.[Burkhard],
Distributed Audio Network for Speech Enhancement in Challenging Noise Backgrounds,
AVSBS09(308-313).
IEEE DOI 0909
BibRef

Kuhnapfel, T.[Thorsten], Tan, T.[Tele], Venkatesh, S.[Svetha], Nordholm, S.E.[Sven Erik], Igel, B.[Burkhard],
Adaptive speech enhancement with varying noise backgrounds,
ICPR08(1-4).
IEEE DOI 0812
BibRef

Li, X.K.[Xiao-Kun], Deng, Y.[Yunbin],
Combining speech energy and edge information for fast and efficient voice activity detection in noisy environments,
ICPR08(1-4).
IEEE DOI 0812
BibRef

Kukharchik, P., Kheidorov, I., Bovbel, E., Ladeev, D.,
Speech Signal Processing Based on Wavelets and SVM for Vocal Tract Pathology Detection,
ICISP08(192-199).
Springer DOI 0807
BibRef

Nagesha, Kumar, G.H.[G. Hemantha],
Signal Resampling Technique Combining Level Crossing and Auditory Features,
PReMI07(447-454).
Springer DOI 0712
BibRef

Ferrer, C.A.[Carlos A.], González, E.[Eduardo], Hernández-Díaz, M.E.[María E.],
Evaluation of Time and Frequency Domain-Based Methods for the Estimation of Harmonics-to-Noise-Ratios in Voice Signals,
CIARP06(406-415).
Springer DOI 0611
BibRef

Li, W.H.[Wei-Hong], Liu, M.[Ming], Zhu, Z.G.[Zhi-Gang], Huang, T.S.[Thomas S.],
LDV Remote Voice Acquisition and Enhancement,
ICPR06(IV: 262-265).
IEEE DOI 0609
BibRef

Xue, W.[Wei], Du, S.[Sidan], Fang, C.Z.[Cheng-Zhi], Ye, Y.X.[Ying-Xian],
Voice Activity Detection Using Wavelet-Based Multiresolution Spectrum and Support Vector Machines and Audio Mixing Algorithm,
CVHCI06(78-88).
Springer DOI 0605
BibRef

García-Perera, L.P.[L. Paola], Nolazco-Flores, J.A.[Juan A.], Mex-Perera, C.[Carlos],
Cryptographic-Speech-Key Generation Architecture Improvements,
IbPRIA05(II:579).
Springer DOI 0509
BibRef

Welk, M.[Martin], Bergmeister, A.[Achim], Weickert, J.[Joachim],
Denoising of Audio Data by Nonlinear Diffusion,
ScaleSpace05(598-609).
Springer DOI 0505
BibRef

Cristani, M., Bicego, M., Murino, V.,
On-line adaptive background modelling for audio surveillance,
ICPR04(II: 399-402).
IEEE DOI 0409
BibRef

Chapter on New Unsorted Entries, and Other Miscellaneous Papers continues in
Speech Synthesis, Synthetic Speech.


Last update: Mar 16, 2024 at 20:36:19