Publications

2015

conference
S. Drgas and T. Virtanen. "Speaker verification using adaptive dictionaries in non-negative spectrogram deconvolution". 12th International Conference on Latent Variable Analysis and Signal Separation. 2015.
conference
T. Barker, T. Virtanen and N. H. Pontoppidan. "Low-Latency Sound-Source-Separation using Non-Negative Matrix Factorisation with Coupled Analysis and Synthesis Dictionaries". ICASSP 2015. 2015.
conference
A. Diment and T. Virtanen. "Archetypal analysis for audio dictionary learning". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2015.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Similarity Induced Group Sparsity for Non-negative Matrix Factorisation". Proceedings of the 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. pp. 4425-4429.

2014

article
J. Nikunen and T. Virtanen. "Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation", IEEE/ACM Transactions on Audio, Speech & Language Processing, Vol. 22, March, 2014, pp. 727-739.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 1371-1375.
conference
T. Barker, T. Virtanen and O. Delhomme. "Ultrasound-Coupled Semi-Supervised Nonnegative Matrix Factorisation for Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014. 2014. pp. 2148-2152.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-Based Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4-9 May 2014, Florence. 2014. pp. 2883 - 2887.
conference
G. Sanchez, H. Silén, J. Nurminen and M. Gabbouj. "Hierarchical modeling of F0 contours for voice conversion". INTERSPEECH 2014, Proceedings of the 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 2318-2321.
conference
O. Gencoglu, T. Virtanen and H. Huttunen. "Recognition of Acoustic Events Using Deep Neural Networks". 2014.
conference
T. Barker and T. Virtanen. "Semi-supervised non-negative tensor factorisation of modulation spectrograms for monaural speech separation". Neural Networks (IJCNN), 2014 International Joint Conference on. 2014. pp. 3556-3561.
article
T. Heittola, A. Mesaros, D. Korpi, A. Eronen and T. Virtanen. "Method for creating location-specific audio textures", EURASIP Journal on Audio, Speech and Music Processing, Vol. 2014. 2014.
conference
T. Virtanen, B. Raj, J. Gemmeke and H. V. Hamme. "Active-set Newton algorithm for non-negative sparse coding of audio". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2014.
article
Z. Wu, T. Virtanen, E. S. Chng and H. Li. "Exemplar-based sparse representation with residual compensation for voice conversion", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22. 2014.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-based Speech Enhancement". International Conference on Acoustics, Speech, and Signal Processing. 2014.
conference
D. Baby, T. Virtanen, J. Gemmeke, T. Barker and H. V. Hamme. "Exemplar-based noise robust automatic speech recognition using modulation spectrogram features". IEEE Spoken Language Technology Workshop. 2014.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH 2014. 2014.
conference
J. Nikunen and T. Virtanen. "Multichannel audio separation by Direction of Arrival Based Spatial Covariance Model and Non-negative Matrix Factorization". Proceedings of 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2014. pp. 6727-6731.
conference
P. Pertilä and J. Nikunen. "Microphone Array Post-Filtering Using Supervised Machine Learning for Speech Enhancement". INTERSPEECH 2014 - 15th Annual Conference of the International Speech Communication Association. 2014.
conference
M. Parviainen, P. Pertilä and M. S. Hämäläinen. "Self-localization of Wireless Acoustic Sensors in Meeting Rooms". 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA). 2014.
incollection
A. Diment, P. Rajan, T. Heittola and T. Virtanen. "Group Delay Function from All-Pole Models for Musical Instrument Recognition". Aramaki et al eds. Springer International Publishing. 2014. pp. 606-618.

2013

conference
A. Hurmalainen and T. Virtanen. "Learning State Labels for Sparse Classification of Speech with Matrix Deconvolution". Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU). 2013.
article
P. Pertilä, M. S. Hämäläinen and M. Mieskolainen. "Passive temporal offset estimation of multichannel recordings of an ad-hoc microphone array", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, Nov., 2013, pp. 2393-2402.
conference
A. Diment, T. Heittola and T. Virtanen. "Semi-supervised Learning for Musical Instrument Recognition". 21st European Signal Processing Conference 2013 (EUSIPCO 2013). 2013.
conference
A. Hurmalainen and T. Virtanen. "Acquiring Variable Length Speech Bases for Factorisation-Based Noise Robust Speech Recognition". Proceedings of the 21st European Signal Processing Conference (EUSIPCO). 2013.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the CHiME Challenge 2013: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". Proceedings of the 2nd CHiME workshop. 2013. pp. 25-30.
conference
J. Gemmeke, T. Virtanen and A. Hurmalainen. "HMM-Regularization for NMF-Based Noise Robust ASR". Proceedings of the 2nd CHiME workshop. 2013. pp. 47-52.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Compact Long Context Spectral Factorisation Models for Noise Robust Recognition of Medium Vocabulary Speech". Proceedings of the 2nd CHiME workshop. 2013. pp. 13-18.
article
P. Pertilä. "Online Blind Speech Separation using Multiple Acoustic Speaker Tracking and Time-Frequency Masking", Computer Speech & Language, Vol. 27, May, 2013, pp. 683–702.
article
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Modelling Non-stationary Noise with Spectral Factorisation in Automatic Speech Recognition", Computer Speech & Language, Vol. 27, May, 2013, pp. 763-779.
conference
J. Geiger et al. "The TUM+TUT+KUL Approach to the 2nd CHiME Challenge: Multi-Stream ASR Exploiting BLSTM Networks and Sparse NMF". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 25-30.
conference
J. Gemmeke, A. Hurmalainen and T. Virtanen. "HMM-regularization for NMF-based noise robust ASR". The 2nd International Workshop on Machine Listening in Multisource Environments CHiME Workshop, 1st June 2013, Vancouver, Canada (in conjunction with ICASSP). 2013. pp. 47-52.
conference
J. Nurminen, H. Silen, E. Helander and M. Gabbouj. "Evaluation of detailed modeling of the LP residual in statistical speech synthesis". 2013 IEEE International Symposium on Circuits and Systems, May 19-23, 2013, Beijing, China. 2013. pp. 313-316.
conference
J. Nurminen, H. Silen and M. Gabbouj. "Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 388-391.
conference
H. Silen, J. Nurminen, E. Helander and M. Gabbouj. "Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 373-377.
conference
K. Mahkonen et al. "Music Dereverberation by Spectral Linear Prediction in Live Recordings". 16th International Conference on Digital Audio Effects, Ireland, 2-5 September 2013. 2013.
conference
T. Barker and T. Virtanen. "Non-negative Tensor Factorisation of Modulation Spectrograms for Monaural Sound Source Separation". Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 25-29 August, Lyon, France. 2013. pp. 827 - 831.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Query-by-example retrieval of sound events using an integrated similarity measure of content and label". 14th International Workshop on Image and Audio Analysis for Multimedia Interactive Services (WIA2MIS). 2013. pp. 1-4.
article
T. Heittola, A. Mesaros, A. Eronen and T. Virtanen. "Context-Dependent Sound Event Detection", EURASIP Journal on Audio, Speech and Music Processing. 2013.
conference
A. Mesaros, T. Heittola and K. Palomäki. "Analysis of acoustic-semantic relationship for diversely annotated real-world audio data". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013. pp. 813-817.
conference
T. Heittola, A. Mesaros, T. Virtanen and M. Gabbouj. "Supervised Model Training for Overlapping Sound Events Based on Unsupervised Source Separation". Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013.
inbook
F. Briggs et al. "The 9th Annual MLSP Competition: New Methods For Acoustic Classification Of Multiple Simultaneous Bird Species In a Noisy Environment". Institute of Electrical and Electronics Engineers IEEE. 2013.
conference
P. Pertilä and A. Tinakari. "Time-of-Arrival Estimation for Blind Beamforming". 2013.
conference
A. Diment, R. Padmanabhan, T. Heittola and T. Virtanen. "Modified Group Delay Feature for Musical Instrument Recognition". 10th International Symposium on Computer Music Multidisciplinary Research (CMMR). 2013.
conference
J. Gemmeke, T. Virtanen and K. Demuynck. "Exemplar-based joint channel and noise compensation". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based Voice Conversion using Non-negative Spectrogram Deconvolution". In Proc. 8th ISCA Speech Synthesis Workshop. 2013.
conference
Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng and H. Li. "Exemplar-based unit selection for voice conversion utilizing temporal information". In proc. Interspeech. 2013.
conference
J. Kauppinen, A. Klapuri and T. Virtanen. "Music Self-Similarity Modeling Using Augmented Nonnegative Matrix Factorization of Block and Stripe Patterns". In proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2013.
article
T. Virtanen, J. Gemmeke and B. Raj. "Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21. 2013.
article
A. Diment, R. Padmanabhan, T. Heittola and T. Virtanen. "Modified Group Delay Feature for Musical Instrument Recognition", 10th International Symposium on Computer Music Multidisciplinary Research (CMMR). 2013.

2012

article
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization", Journal of the Audio Engineering Society, Vol. 60, October, 2012, pp. 794-806.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Detection, Separation and Recognition of Speech From Continuous Signals Using Spectral Factorisation". 20th European Signal Processing Conference (EUSIPCO). 2012. pp. 2649-2653.
article
E. Helander, H. Silén, T. Virtanen and M. Gabbouj. "Voice Conversion Using Dynamic Kernel Partial Least Squares Regression", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 3, March, 2012, pp. 806 - 817.
conference
V. Popa, H. Silén, J. Nurminen and M. Gabbouj. "Local Linear Transformation for Voice Conversion". ICASSP. 2012.
article
S. Kiranyaz, T. Mäkinen and M. Gabbouj. "Dynamic and scalable audio classification by collective network of binary classifiers framework: An evolutionary approach", Neural Networks, Vol. 34. 2012, pp. 80-95.
conference
H. Silen, E. Helander, J. Nurminen and M. Gabbouj. "Ways to Implement Global Variance in Statistical Speech Synthesis". Proceedings of 13th Annual Conference of the International Speech Communication Association, Interspeech 2012, September 9 - 13, Portland, Oregon, USA. 2012. pp. 1-4.
conference
J. Nurminen, H. Silen, V. Popa, E. Helander and M. Gabbouj. "Voice Conversion". Speech Enhancement, Modeling and Recognition: Algorithms and Applications. S. Ramakrishnan ed. 2012. pp. 1-27.
article
T. Mäkinen, S. Kiranyaz, J. Raitoharju and M. Gabbouj. "An evolutionary feature synthesis approach for content-based audio retrieval", EURASIP Journal on Audio, Speech, and Music Processing. 2012.
book
T. Virtanen, R. Singh and B. Raj. Techniques for Noise Robustness in Automatic Speech Recognition, John Wiley & Sons, 2012.
conference
J. Nurminen, H. Silén, V. Popa, E. Helander and M. Gabbouj. "Ch. Voice Conversion in Speech Enhancement, Modeling and Recognition - Algorithms and Applications". S. Ramakrishnan ed. 2012.
conference
F. Weninger et al. "Non-Negative Matrix Factorization for Highly Noise-Robust ASR: to Enhance or to Recognize?". In proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
A. Hurmalainen and T. Virtanen. "Modelling spectro-temporal dynamics in factorisation-based noise-robust automatic speech recognition". Proc. 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2012.
conference
F. Mazhar, T. Heittola, T. Virtanen and J. Holm. "Automatic Scoring of Guitar Chords". Proc. AES 45th International Conference. 2012.
conference
T. Mäkinen, S. Kiranyaz, J. Pulkkinen and M. Gabbouj. "Evolutionary Feature Generation for Content-based Audio Classification and Retrieval". 20th European Signal Processing Conference (EUSIPCO). 2012.
article
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization", Journal of the Audio Engineering Society, Vol. 60. 2012, pp. 794-806.
article
D. Korpi, T. Heittola, T. Partala, A. Eronen, A. Mesaros and T. Virtanen. "On the human ability to discriminate audio ambiances from similar locations of an urban environment", Personal and Ubiquitous Computing, Vol. November 2012. 2012.
conference
P. Pertilä, M. Mieskolainen and M. Hämäläinen. "Passive Self-Localization of Microphones Using Ambient Sounds". Proc. 20th European Signal Processing Conference (EUSIPCO-2012). 2012.
conference
R. Saeidi, A. Hurmalainen, T. Virtanen and D. van Leeuwen. "Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification". Proc. Odyssey 2012: The Speaker and Language Recognition Workshop. 2012.
conference
A. B. Rad and T. Virtanen. "Phase spectrum prediction of audio signals". 5th International Symposium on Communications, Control and Signal Processing. 2012.
conference
J. Nikunen, T. Virtanen, P. Pertilä and M. Vilermo. "Permutation Alignment Of Frequency-Domain Ica By The Maximization Of Intra-Source Envelope Correlations". European Signal Processing Conference (EUSIPCO). 2012.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition". 13th Interspeech. 2012.
conference
F. Rodriguez-Serrano, J. J. Orti, P. Vera-Candeas, T. Virtanen and N. Ruiz-Reyes. "Multiple Instrument Mixtures Source Separation Evaluation Using Instrument-Dependent NMF Models". The 10th International Conference on Latent Variable Analysis and Source Separation. 2012.

2011

article
J. Gemmeke, T. Virtanen and A. Hurmalainen. "Exemplar-based Sparse Representations for Noise Robust Automatic Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, September, 2011, pp. 2067-2080.
conference
A. Hurmalainen, K. Mahkonen, J. Gemmeke and T. Virtanen. "Exemplar-based Recognition of Speech in Highly Variable Noise". Proc. International Workshop on Machine Listening in Multisource Environments (CHiME). 2011. pp. 1-5.
conference
J. Gemmeke, T. Virtanen and A. Hurmalainen. "Exemplar-Based Speech Enhancement and its Application to Noise-Robust Automatic Speech Recognition". Proc. International Workshop on Machine Listening in Multisource Environments (CHiME). 2011. pp. 53-57.
conference
H. Silén, E. Helander and M. Gabbouj. "Prediction of voice aperiodicity based on spectral representations in HMM speech synthesis". Interspeech. 2011. pp. 105 - 108.
conference
J. Gemmeke, A. Hurmalainen, T. Virtanen and S. Yang. "Toward a Practical Implementation of Exemplar-Based Noise Robust ASR". European Signal Processing Conference (EUSIPCO). 2011. pp. 1490-1494.
conference
A. Hurmalainen, J. Gemmeke and T. Virtanen. "Non-negative matrix deconvolution in noise robust speech recognition". Proceedings of the International Conference on Acoustics, Speech and Signal Processing. 2011.
conference
J. Gemmeke, A. Hurmalainen, T. Virtanen and Y. Sun. "Toward A Practical Implementation Of Exemplar-Based Noise Robust ASR". EUSIPCO 2011: 19th European Signal Processing Conference, August 29 - September 2, 2011, Barcelona, Spain. 2011. pp. 1490-1494.
conference
K. Mahkonen, A. Hurmalainen, T. Virtanen and J. Gemmeke. "Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition". Speech Science and Technology for Real Life, Conference Proceedings of Interspeech 2011, 27 - 31 August, 2011, Florence, Italy. 2011. pp. 465-468.
conference
A. Hurmalainen, T. Virtanen, J. Gemmeke and K. Mahkonen. "Esimerkkipohjainen meluisan puheen automaattinen tunnistus" [Exemplar-based automatic recognition of noisy speech]. Akustiikkapäivät 2011, Tampere, 11-12 May 2011, Akustinen Seura ry. 2011. pp. 1-5.
conference
T. Heittola, A. Mesaros, T. Virtanen and A. Eronen. "Sound event detection and context recognition". Akustiikkapäivät 2011. 2011. pp. 51-56.
article
V. Popa, J. Nurminen and M. Gabbouj. "A Study of Bilinear Models in Voice Conversion", Journal of Signal and Information Processing, Vol. 2. 2011, pp. 125-139.
conference
T. Mäkinen, S. Kiranyaz and M. Gabbouj. "Content-based Audio Classification using Collective Network of Binary Classifiers". IEEE Workshop on Evolving and Adaptive Intelligent Systems. 2011. pp. 116 - 123.
conference
T. Heittola, A. Mesaros, T. Virtanen and A. Eronen. "Sound Event Detection in Multisource Environments Using Source Separation". CHiME 2011 - Workshop on Machine Listening in Multisource Environments. 2011. pp. 36-40.
conference
A. Mesaros, T. Heittola and A. Klapuri. "Latent Semantic Analysis in Sound Event Detection". European Signal Processing Conference (EUSIPCO-2011). 2011. pp. 1307-1311.
article
J. J. Orti, T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes and F. J. Canadas-Quesada. "Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization", IEEE Journal of Selected Topics in Signal Processing, Vol. 5. 2011.
conference
P. Pertilä, M. Mieskolainen and M. S. Hämäläinen. "Closed-Form Self-Localization of Asynchronous Microphone Arrays". In Proc. The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA'11). 2011.
conference
H. Kallasjoki, U. Remes, J. Gemmeke, T. Virtanen and K. Palomäki. "Uncertainty measures for improving exemplar-based source separation". 12th Annual Conference of the International Speech Communication Association. 2011.
conference
B. Raj, R. Singh and T. Virtanen. "Phoneme-dependent NMF for speech enhancement in monaural mixtures". In proc. 12th Annual Conference of the International Speech Communication Association. 2011.
conference
J. Nikunen, T. Virtanen and M. Vilermo. "Multichannel audio upmixing based on non-negative tensor factorization representation". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2011.

2010

article
T. Mäkinen and P. Pertilä. "Shooter localization and bullet trajectory, caliber, and speed estimation based on detected firing sounds", Applied Acoustics, Vol. 10, October, 2010, pp. 902–913.
conference
E. Helander, H. Silén, J. Miguez and M. Gabbouj. "Maximum a posteriori voice conversion using sequential Monte Carlo methods". Interspeech. 2010.
conference
H. Silén, E. Helander, J. Nurminen and M. Gabbouj. "Analysis of Duration Prediction Accuracy in HMM-Based Speech Synthesis". The Fifth International Conference on Speech Prosody. 2010.
article
A. Eronen and A. Klapuri. "Music Tempo Estimation with k-NN regression", IEEE Trans. Audio, Speech and Language Processing, Vol. 18, January, 2010, pp. 50-57.
conference
J. Gemmeke and T. Virtanen. "Artificial and online acquired noise dictionaries for noise robust ASR". Proceedings of the 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010. 2010. pp. 2082-2085.
conference
S. Tervo and T. Korhonen. "Estimation of reflective surfaces from continuous signals". Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP, Dallas, Texas, USA, March 14-19, 2010. 2010. pp. 153-156.
conference
T. Heittola, A. Mesaros, A. Eronen and T. Virtanen. "Audio context recognition using audio event histograms". In Proc. European Signal Processing Conference. 2010.
conference
A. Mesaros, T. Heittola, A. Eronen and T. Virtanen. "Acoustic event detection in real life recordings". In Proc. European Signal Processing Conference. 2010. pp. 1267-1271.
conference
H. Silén, E. Helander, J. Nurminen, K. Koppinen and M. Gabbouj. "Using Robust Viterbi Algorithm and HMM-Modeling in Unit Selection TTS to Replace Units of Poor Quality". Interspeech 2010. 2010.