Publications

2013

conference
A. Hurmalainen and T. Virtanen. "Acquiring Variable Length Speech Bases for Factorisation-Based Noise Robust Speech Recognition". Proceedings of the 21st European Signal Processing Conference (EUSIPCO). 2013.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the CHiME Challenge 2013: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". Proceedings of the 2nd CHiME workshop. 2013. pp. 25-30.
conference
J. Gemmeke, T. Virtanen and A. Hurmalainen. "HMM-Regularization for NMF-Based Noise Robust ASR". Proceedings of the 2nd CHiME workshop. 2013. pp. 47-52.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Compact Long Context Spectral Factorisation Models for Noise Robust Recognition of Medium Vocabulary Speech". Proceedings of the 2nd CHiME workshop. 2013. pp. 13-18.
article
P. Pertilä. "Online Blind Speech Separation using Multiple Acoustic Speaker Tracking and Time-Frequency Masking", Computer Speech & Language, Vol. 27, May, 2013, pp. 683–702.
article
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Modelling Non-stationary Noise with Spectral Factorisation in Automatic Speech Recognition", Computer Speech & Language, Vol. 27, May, 2013, pp. 763-779.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the 2nd CHiME Challenge: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 25-30.
conference
J. Gemmeke, A. Hurmalainen and T. Virtanen. "HMM-regularization for NMF-based noise robust ASR". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 47-52.
conference
J. Nurminen, H. Silen, E. Helander and M. Gabbouj. "Evaluation of detailed modeling of the LP residual in statistical speech synthesis". 2013 IEEE International Symposium on Circuits and Systems, May 19-23, 2013, Beijing, China. 2013. pp. 313-316.
conference
J. Nurminen, H. Silen and M. Gabbouj. "Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 388-391.
conference
H. Silen, J. Nurminen, E. Helander and M. Gabbouj. "Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 373-377.
conference
K. Mahkonen et al. "Music Dereverberation by Spectral Linear Prediction in Live Recordings". 16th International Conference on Digital Audio Effects, Ireland, 2-5 September 2013. 2013.
conference
T. Barker and T. Virtanen. "Non-negative Tensor Factorisation of Modulation Spectrograms for Monaural Sound Source Separation". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 827 - 831.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Query-by-example retrieval of sound events using an integrated similarity measure of content and label". 14th International Workshop on Image and Audio Analysis for Multimedia Interactive Services (WIA2MIS). 2013. pp. 1-4.
article
T. Heittola, A. Mesaros, A. Eronen and T. Virtanen. "Context-Dependent Sound Event Detection", EURASIP Journal on Audio, Speech and Music Processing. 2013.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Analysis of acoustic-semantic relationship for diversely annotated real-world audio data". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013. pp. 813-817.
conference
T. Heittola, A. Mesaros, T. Virtanen and M. Gabbouj. "Supervised Model Training for Overlapping Sound Events Based on Unsupervised Source Separation". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013.
inbook
F. Briggs et al. "The 9th Annual MLSP Competition: New Methods For Acoustic Classification Of Multiple Simultaneous Bird Species In a Noisy Environment". Institute of Electrical and Electronics Engineers IEEE. 2013.
conference
P. Pertilä and A. Tinakari. "Time-of-Arrival Estimation for Blind Beamforming". 2013.
conference
A. Diment, R. Padmanabhan, T. Heittola and T. Virtanen. "Modified Group Delay Feature for Musical Instrument Recognition". 10th International Symposium on Computer Music Multidisciplinary Research (CMMR). 2013.
conference
J. Gemmeke, T. Virtanen and K. Demuynck. "Exemplar-based joint channel and noise compensation". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based Voice Conversion using Non-negative Spectrogram Deconvolution". in proc. 8th ISCA Speech Synthesis Workshop. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based unit selection for voice conversion utilizing temporal information". In proc. Interspeech. 2013.
conference
J. Kauppinen, A. Klapuri and T. Virtanen. "Music Self-Similarity Modeling Using Augmented Nonnegative Matrix Factorization of Block and Stripe Patterns". In proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2013.
article
T. Virtanen, J. Gemmeke and B. Raj. "Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21. 2013.

2012

article
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization", Journal of the Audio Engineering Society, Vol. 60, October, 2012, pp. 794-806.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Detection, Separation and Recognition of Speech From Continuous Signals Using Spectral Factorisation". 20th European Signal Processing Conference (EUSIPCO). 2012. pp. 2649-2653.
article
E. Helander, H. Silén, T. Virtanen and M. Gabbouj. "Voice Conversion Using Dynamic Kernel Partial Least Squares Regression", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 3, March, 2012, pp. 806 - 817.
conference
V. Popa, H. Silén, J. Nurminen and M. Gabbouj. "Local Linear Transformation for Voice Conversion". ICASSP. 2012.
article
S. Kiranyaz, T. Mäkinen and M. Gabbouj. "Dynamic and scalable audio classification by collective network of binary classifiers framework: An evolutionary approach", Neural Networks, Vol. 34. 2012, pp. 80-95.
conference
H. Silen, E. Helander, J. Nurminen and M. Gabbouj. "Ways to Implement Global Variance in Statistical Speech Synthesis". Proceedings of 13th Annual Conference of the International Speech Communication Association, Interspeech 2012, September 9 - 13, Portland, Oregon, USA. 2012. pp. 1-4.
conference
J. Nurminen, H. Silen, V. Popa, E. Helander and M. Gabbouj. "Voice Conversion". Speech Enhancement, Modeling and Recognition: Algorithms and Applications. S. Ramakrishnan ed. 2012. pp. 1-27.
article
T. Mäkinen, S. Kiranyaz, J. Raitoharju and M. Gabbouj. "An evolutionary feature synthesis approach for content-based audio retrieval", EURASIP Journal on Audio, Speech, and Music Processing. 2012.
book
T. Virtanen, R. Singh and B. Raj. Techniques for Noise Robustness in Automatic Speech Recognition, John Wiley & Sons, 2012.
conference
J. Nurminen, H. Silén, V. Popa, E. Helander and M. Gabbouj. "Ch. Voice Conversion in Speech Enhancement, Modeling and Recognition - Algorithms and Applications". S. Ramakrishnan ed. 2012.
conference
F. Weninger et al. "Non-Negative Matrix Factorization for Highly Noise-Robust ASR: to Enhance or to Recognize?". In proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
A. Hurmalainen and T. Virtanen. "Modelling spectro-temporal dynamics in factorisation-based noise-robust automatic speech recognition". Proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
F. Mazhar, T. Heittola, T. Virtanen and J. Holm. "Automatic Scoring of Guitar Chords". Proc. AES 45th International Conference. 2012.
conference
T. Mäkinen, S. Kiranyaz, J. Pulkkinen and M. Gabbouj. "Evolutionary Feature Generation for Content-based Audio Classification and Retrieval". 20th European Signal Processing Conference (EUSIPCO). 2012.
article
D. Korpi, T. Heittola, T. Partala, A. Eronen, A. Mesaros and T. Virtanen. "On the human ability to discriminate audio ambiances from similar locations of an urban environment", Personal and Ubiquitous Computing, November 2012.
conference
P. Pertilä, M. Mieskolainen and M. Hämäläinen. "Passive Self-Localization of Microphones Using Ambient Sounds". Proc. 20th European Signal Processing Conference (EUSIPCO-2012). 2012.
conference
R. Saeidi, A. Hurmalainen, T. Virtanen and D. van Leeuwen. "Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification". Proc. Odyssey 2012: The Speaker and Language Recognition Workshop. 2012.
conference
A. B. Rad and T. Virtanen. "Phase spectrum prediction of audio signals". 5th International Symposium on Communications, Control and Signal Processing. 2012.
conference
J. Nikunen, T. Virtanen, P. Pertilä and M. Vilermo. "Permutation Alignment of Frequency-Domain ICA by the Maximization of Intra-Source Envelope Correlations". European Signal Processing Conference (EUSIPCO). 2012.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition". 13th Interspeech. 2012.
conference
F. Rodriguez-Serrano, J. J. Orti, P. Vera-Candeas, T. Virtanen and N. Ruiz-Reyes. "Multiple Instrument Mixtures Source Separation Evaluation Using Instrument-Dependent NMF Models". The 10th International Conference on Latent Variable Analysis and Source Separation. 2012.

2011

article
J. Gemmeke, T. Virtanen and A. Hurmalainen. "Exemplar-based Sparse Representations for Noise Robust Automatic Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, September, 2011, pp. 2067-2080.
conference
A. Hurmalainen, K. Mahkonen, J. Gemmeke and T. Virtanen. "Exemplar-based Recognition of Speech in Highly Variable Noise". Proc. International Workshop on Machine Listening in Multisource Environments (CHiME). 2011. pp. 1-5.
conference
J. Gemmeke, T. Virtanen and A. Hurmalainen. "Exemplar-Based Speech Enhancement and its Application to Noise-Robust Automatic Speech Recognition". Proc. International Workshop on Machine Listening in Multisource Environments (CHiME). 2011. pp. 53-57.
conference
H. Silén, E. Helander and M. Gabbouj. "Prediction of voice aperiodicity based on spectral representations in HMM speech synthesis". Interspeech. 2011. pp. 105 - 108.
conference
J. Gemmeke, A. Hurmalainen, T. Virtanen and S. Yang. "Toward a Practical Implementation of Exemplar-Based Noise Robust ASR". European Signal Processing Conference (EUSIPCO). 2011. pp. 1490-1494.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Non-negative matrix deconvolution in noise robust speech recognition". Proceedings of International Conference on Audio, Speech and Signal Processing. 2011.
conference
J. Gemmeke, A. Hurmalainen, T. Virtanen and Y. Sun. "Toward A Practical Implementation Of Exemplar-Based Noise Robust ASR". EUSIPCO 2011: 19th European Signal Processing Conference, August 29 - September 2, 2011, Barcelona, Spain. 2011. pp. 1490-1494.
conference
K. Mahkonen, A. Hurmalainen, T. Virtanen and J. Gemmeke. "Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition". Speech Science and Technology for Real Life, Conference Proceedings of Interspeech 2011, 27 - 31 August, 2011, Florence, Italy. 2011. pp. 465-468.
conference
A. Hurmalainen, T. Virtanen, J. Gemmeke and K. Mahkonen. "Esimerkkipohjainen meluisan puheen automaattinen tunnistus". Akustiikkapäivät 2011, Tampere, 11.-12.5.2011, Akustinen Seura ry. 2011. pp. 1-5.
conference
T. Heittola, A. Mesaros, T. Virtanen and A. Eronen. "Sound event detection and context recognition". Akustiikkapäivät 2011. 2011. pp. 51-56.
article
V. Popa, J. Nurminen and M. Gabbouj. "A Study of Bilinear Models in Voice Conversion", Journal of Signal and Information Processing, Vol. 2. 2011, pp. 125-139.
conference
T. Mäkinen, S. Kiranyaz and M. Gabbouj. "Content-based Audio Classification using Collective Network of Binary Classifiers". IEEE Workshop on Evolving and Adaptive Intelligent Systems. 2011. pp. 116 - 123.
conference
T. Heittola, A. Mesaros, T. Virtanen and A. Eronen. "Sound Event Detection in Multisource Environments Using Source Separation". CHiME 2011 - Workshop on Machine Listening in Multisource Environments. 2011. pp. 36-40.
conference
A. Mesaros, T. Heittola and A. Klapuri. "Latent Semantic Analysis in Sound Event Detection". European Signal Processing Conference (EUSIPCO-2011). 2011. pp. 1307-1311.
article
J. J. Orti, T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes and F. J. Canadas-Quesada. "Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization", IEEE Journal of Selected Topics in Signal Processing, Vol. 5. 2011.
conference
P. Pertilä, M. Mieskolainen and M. S. Hämäläinen. "Closed-Form Self-Localization of Asynchronous Microphone Arrays". In Proc. The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA'11). 2011.
conference
H. Kallasjoki, U. Remes, J. Gemmeke, T. Virtanen and K. Palomäki. "Uncertainty measures for improving exemplar-based source separation". 12th Annual Conference of the International Speech Communication Association. 2011.
conference
B. Raj, R. Singh and T. Virtanen. "Phoneme-dependent NMF for speech enhancement in monaural mixtures". In proc. 12th Annual Conference of the International Speech Communication Association. 2011.
conference
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel audio upmixing based on non-negative tensor factorization representation". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2011.

2010

article
T. Mäkinen and P. Pertilä. "Shooter localization and bullet trajectory, caliber, and speed estimation based on detected firing sounds", Applied Acoustics, Vol. 10, October, 2010, pp. 902–913.
conference
E. Helander, H. Silén, J. Miguez and M. Gabbouj. "Maximum a posteriori voice conversion using sequential Monte Carlo methods". Interspeech. 2010.
conference
H. Silén, E. Helander, J. Nurminen and M. Gabbouj. "Analysis of Duration Prediction Accuracy in HMM-Based Speech Synthesis". The Fifth International Conference on Speech Prosody. 2010.
article
A. Eronen and A. Klapuri. "Music Tempo Estimation with k-NN regression", IEEE Trans. Audio, Speech and Language Processing, Vol. 18, January, 2010, pp. 50-57.
conference
J. Gemmeke and T. Virtanen. "Artificial and online acquired noise dictionaries for noise robust ASR". Proceedings of the 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010. 2010. pp. 2082-2085.
conference
S. Tervo and T. Korhonen. "Estimation of reflective surfaces from continuous signals". Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP, Dallas, Texas, USA, March 14-19, 2010. 2010. pp. 153-156.
conference
T. Heittola, A. Mesaros, A. Eronen and T. Virtanen. "Audio context recognition using audio event histograms". In Proc. European Signal Processing Conference. 2010.
conference
A. Mesaros, T. Heittola, A. Eronen and T. Virtanen. "Acoustic event detection in real life recordings". In Proc. European Signal Processing Conference. 2010. pp. 1267-1271.
conference
H. Silén, E. Helander, J. Nurminen, K. Koppinen and M. Gabbouj. "Using Robust Viterbi Algorithm and HMM-Modeling in Unit Selection TTS to Replace Units of Poor Quality". Interspeech 2010. 2010.
conference
T. Virtanen, J. Gemmeke and A. Hurmalainen. "State-based labelling for a sparse representation of speech and its application to robust speech recognition". Interspeech 2010. 2010.
article
E. Helander, T. Virtanen, J. Nurminen and M. Gabbouj. "Voice Conversion Using Partial Least Squares Regression", IEEE Transactions on Audio, Speech, and Language Processing. 2010.
conference
A. Klapuri, T. Virtanen and T. Heittola. "Sound source separation in monaural music signals using excitation-filter model and EM algorithm". IEEE International Conference on Acoustics, Speech, and Signal Processing. 2010.
conference
P. Pertilä and M. S. Hämäläinen. "A Track Before Detect Approach for Sequential Bayesian Tracking of Multiple Speech Sources". In Proc. IEEE 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2010.
conference
B. Raj, T. Virtanen, S. Chaudhure and R. Singh. "Non-negative matrix factorization based compensation of music for automatic speech recognition". Interspeech 2010. 2010.
conference
S. Keronen, U. Remes, K. Palomäki, T. Virtanen and M. Kurimo. "Comparison of Noise Robust Methods in Large Vocabulary Speech Recognition". In Proc. European Signal Processing Conference. 2010.
conference
J. Nikunen and T. Virtanen. "Object-Based Audio Coding Using Non-Negative Matrix Factorization for the Spectrogram Representation". in proc. 128th Audio Engineering Society Convention. 2010.
article
A. Klapuri and T. Virtanen. "Representing Musical Sounds with an Interpolating State Model", IEEE Trans. Audio, Speech and Language Processing, Vol. 18. 2010.
article
M. Helén and T. Virtanen. "Audio query by example using similarity measures between probability density functions of features", EURASIP Journal on Audio, Speech and Music Processing, Vol. 2010. 2010.
article
A. Mesaros and T. Virtanen. "Automatic recognition of lyrics in singing", EURASIP Journal on Audio, Speech and Music Processing, Vol. 2010. 2010.
conference
J. Nikunen and T. Virtanen. "Noise-to-Mask Ratio Minimization by Weighted Non-negative Matrix Factorization". IEEE International Conference on Acoustics, Speech, and Signal Processing. 2010.
conference
A. Mesaros and T. Virtanen. "Recognition of phonemes and words in singing". proc. of the 35th International Conference on Acoustics, Speech, and Signal Processing. 2010.
conference
J. Gemmeke and T. Virtanen. "Noise robust exemplar-based connected digit recognition". proc. of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2010.

2009

conference
T. Mäkinen, P. Pertilä and P. Auranen. "Supersonic bullet state estimation using particle filtering". Proceedings of 2009 IEEE International Conference on Signal and Image Processing Applications, ICSIPA. 2009.
conference
J. Paulus and A. Klapuri. "Music structure analysis with a probabilistic fitness function in MIREX2009". Proc. of the Fifth Annual Music Information Retrieval Evaluation eXchange. 2009.
conference
H. Silén, E. Helander, J. Nurminen and M. Gabbouj. "Parameterization of vocal fry in HMM-based speech synthesis". Proceedings of the 10th Annual Conference of the International Speech Communication Association, Interspeech. 2009. pp. 1775-1778.
conference
A. Löytynoja and P. Pertilä. "A real-time talker localization implementation using multi-PHAT and particle filter". Proceedings of the 17th European Signal Processing Conference, Eusipco. 2009. pp. 1418-1422.
article
J. Paulus and A. Klapuri. "Music Structure Analysis Using a Probabilistic Fitness Measure and a Greedy Search Algorithm", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, Aug, 2009, pp. 1159-1170.
conference
D. D. Alves, J. Paulus and J. Fonseca. "Drum transcription from multichannel recordings with non-negative matrix factorization". Proc. of the 17th European Signal Processing Conference. 2009. pp. 894-898.
conference
M. Parviainen. "Robust self-localization solutions for meeting room environments". Proceedings of the 13th IEEE International Symposium on Consumer Electronics, ISCE 2009, Kyoto, Japan, 25-28 May 2009. 2009. pp. 237-240.
conference
V. Popa, J. Nurminen and M. Gabbouj. "A novel technique for voice conversion based on style and content decomposition with bilinear models". Proceedings of the 10th Annual Conference of the International Speech Communication Association, Interspeech 2009, Brighton, UK, 6-10 September 2009. 2009. pp. 2655-2658.
conference
M. Helén, T. Lahti and A. Klapuri. "Tools for automatic audio management". Open Information Management: applications of interconnectivity and collaboration. S. Niiranen ed. 2009. pp. 244-265.