Publications

2018

conference
S. I. Mimilakis, K. Drossos, J. F. Santos, G. Schuller, T. Virtanen and Y. Bengio. "Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask". 2018.
article
G. Naithani, J. Kivinummi, T. Virtanen, O. Tammela, M. J. Peltola and J. M. Leppänen. "Automatic segmentation of infant cry signals using hidden Markov models", Eurasip Journal on Audio, Speech, and Music Processing, Vol. 2018. 2018.
conference
K. Drossos, S. I. Mimilakis, D. Serdyuk, G. Schuller, T. Virtanen and Y. Bengio. "MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation". Proceedings of the IEEE World Congress on Computational Intelligence (WCCI)/International Joint Conference on Neural Networks (IJCNN). 2018.
article
A. Mesaros et al. "Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge", IEEE-ACM Transactions on Audio Speech and Language Processing, 11, 2018.
inbook
A. Mesaros, T. Heittola and D. Ellis. "Datasets and Evaluation". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2018. pp. 147-179.
inbook
T. Heittola, E. Cakir and T. Virtanen. "The machine learning approach for analysis of sound scenes and events". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2018. pp. 13-40.

2017

article
M. Parviainen and P. Pertilä. "Self-localization of dynamic user-worn microphones from observed speech", Applied Acoustics, Vol. 117, Part A, February, 2017, pp. 76-85.
conference
K. Drossos, S. Adavanne and T. Virtanen. "Automated Audio Captioning with Recurrent Neural Networks". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017.
conference
D. Caballero et al. "ASR in classroom today: Automatic visualization of conceptual network in science classrooms". Data Driven Approaches in Digital Education - 12th European Conference on Technology Enhanced Learning, EC-TEL 2017, Proceedings. 2017. pp. 541-544.
conference
K. Drossos, S. I. Mimilakis, A. Floros, T. Virtanen and G. Schuller. "Close Miking Empirical Practice Verification: A Source Separation Approach". Audio Engineering Society Convention 142. 2017.
conference
P. Magron, J. L. Roux and T. Virtanen. "Consistent Anisotropic Wiener Filtering for Audio Source Separation". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017. pp. 269-273.
conference
E. Cakir and T. Virtanen. "Convolutional Recurrent Neural Networks for Rare Sound Event Detection". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 27-31.
conference
J. Nikunen and T. Virtanen. "Time-difference of arrival model for spherical microphone arrays and application to direction of arrival estimation". Proceedings of 25th European Signal Processing Conference. 2017. pp. 1255-1259.
conference
A. Diment and T. Virtanen. "Transfer Learning of Weakly Labelled Audio". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017. pp. 6-10.
conference
S. I. Mimilakis, K. Drossos, T. Virtanen and G. Schuller. "A Recurrent Encoder-Decoder Approach With Skip-Filtering Connections for Monaural Singing Voice Separation". 27th IEEE International Workshop on Machine Learning for Signal Processing (MLSP). 2017.
conference
E. Cakir, S. Adavanne, G. Parascandolo, K. Drossos and T. Virtanen. "Convolutional recurrent neural networks for bird audio detection". European Signal Processing Conference. 2017. pp. 1744-1748.
conference
S. Adavanne and T. Virtanen. "Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 12-16.
conference
S. Adavanne, K. Drossos, E. Cakir and T. Virtanen. "Stacked convolutional and recurrent neural networks for bird audio detection". 2017 25th European Signal Processing Conference (EUSIPCO). 2017. pp. 1729-1733.
conference
Z. Shuyang, T. Heittola and T. Virtanen. "Learning vocal mode classifiers from heterogeneous data sources". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 16–20.
conference
A. Mesaros et al. "DCASE 2017 challenge setup: tasks, datasets and baseline system". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017). 2017. pp. 85-92.
conference
A. Mesaros, T. Heittola and T. Virtanen. "Assessment of human and machine performance in acoustic scene classification: DCASE 2016 case study". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 319–323.
conference
S. Adavanne, P. Pertilä and T. Virtanen. "Sound event detection using spatial features and convolutional recurrent neural network". IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017). 2017.
conference
M. Malik, S. Adavanne, K. Drossos, T. Virtanen, D. Ticha and R. Jarina. "Stacked convolutional and recurrent neural networks for music emotion recognition". Sound and Music Computing Conference. 2017.
conference
Z. Shuyang, T. Heittola and T. Virtanen. "Active Learning for Sound Event Classification by Clustering Unlabeled Data". 2017.
conference
M. Parviainen and P. Pertilä. "Obtaining an optimal set of head-related transfer functions with a small amount of measurements". 2017 IEEE International Workshop on Signal Processing Systems (SiPS). 2017.
conference
P. Magron, R. Badeau and A. Liutkus. "Lévy NMF : un modèle robuste de séparation de sources non-négatives". Actes du XXVIème Colloque GRETSI. 2017.
conference
P. Magron, R. Badeau and A. Liutkus. "Lévy NMF for robust nonnegative source separation". 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2017. pp. 259-263.
conference
P. Magron, R. Badeau and B. David. "Phase-dependent anisotropic Gaussian model for audio source separation". 42nd International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2017. pp. 531-535.
inbook
D. Ellis, T. Virtanen, M. D. Plumbley and B. Raj. "Future Perspective". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2017. pp. 401-415.
book
T. Virtanen, M. D. Plumbley and D. Ellis. Computational analysis of sound scenes and events, Springer, 2017.
inbook
T. Virtanen, M. D. Plumbley and D. Ellis. "Introduction to sound scene and event analysis". T. Virtanen, M. D. Plumbley and D. Ellis eds. Springer. 2017. pp. 3-12.
article
G. Richard, T. Virtanen, J. P. Bello, N. Ono and H. Glotin. "Introduction to the Special Section on Sound Scene and Event Analysis", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 6, 2017, pp. 1169-1171.
article
J. Nikunen, A. Diment and T. Virtanen. "Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking", IEEE-ACM Transactions on Audio Speech and Language Processing, 11, 2017.
inbook
J. Nikunen and T. Virtanen. "Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization". V. Pulkki, S. Delikaris-Manias and A. Politis eds. John Wiley & Sons. 2017.
conference
J. M. Perez-Macias, S. Adavanne, J. Viik, A. Värri, S.-L. Himanen and M. Tenhunen. "Assessment of support vector machines and convolutional neural networks to detect snoring using Emfit mattress". 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2017. pp. 2883-2886.
article
S. Drgas, T. Virtanen, J. Lücke and A. Hurmalainen. "Binary Non-Negative Matrix Deconvolution for Audio Dictionary Learning", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 8, 2017, pp. 1644-1656.
book
T. Virtanen et al. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), Tampere University of Technology. Laboratory of Signal Processing, 2017.
conference
M. Valenti, S. Squartini, A. Diment, G. Parascandolo and T. Virtanen. "A convolutional neural network approach for acoustic scene classification". 2017 International Joint Conference on Neural Networks, IJCNN 2017. 2017. pp. 1547-1554.
article
E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen and T. Virtanen. "Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 25, 6, 2017, pp. 1291-1303.
article
P. Maijala, Z. Shuyang, T. Heittola and T. Virtanen. "Environmental noise monitoring using source classification in sensors", Applied Acoustics, Vol. 129, 8, 2017, pp. 258-267.
conference
G. Naithani, T. Barker, G. Parascandolo, L. Bramsløw, N. H. Pontoppidan and T. Virtanen. "Low Latency Sound Source Separation using Convolutional Recurrent Neural Networks". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2017.
conference
P. Pertilä and E. Cakir. "Robust Direction Estimation with Convolutional Neural Networks-based Steered Response Power". ICASSP. 2017.
techreport
E. Cakir, K. Drossos and T. Virtanen. "QMUL bird audio detection challenge 2016". 2017.

2016

conference
K. Mahkonen, A. Hurmalainen, T. Virtanen and J.-K. Kämäräinen. "Cascade processing for speeding up sliding window sparse classification". European Signal Processing Conference (EUSIPCO), 2016. 2016.
article
A. Mesaros, T. Heittola and T. Virtanen. "Metrics for polyphonic sound event detection", Applied Sciences, Vol. 6. 2016, pp. 162.
conference
S. Adavanne, G. Parascandolo, P. Pertilä, T. Heittola and T. Virtanen. "Sound event detection in multichannel audio using spatial and harmonic features". Detection and Classification of Acoustic Scenes and Events. 2016.
conference
A. Mesaros, T. Heittola and T. Virtanen. "TUT Database for Acoustic Scene Classification and Sound Event Detection". 2016.
techreport
G. Parascandolo, P. Pertilä, T. Heittola and T. Virtanen. "Sound event detection in real life audio". 2016.
article
K. Drossos, M. Kaliakatsos-Papakostas, A. Floros and T. Virtanen. "On the Impact of The Semantic Content of Sound Events in Emotion Elicitation", Journal of the Audio Engineering Society, Vol. 64, 8, 2016, pp. 525-532.
conference
E. Cakir, E. C. Ozan and T. Virtanen. "Filterbank Learning for Deep Neural Network Based Polyphonic Sound Event Detection". 2016 International Joint Conference on Neural Networks (IJCNN). 2016.
book
T. Virtanen et al. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016), Tampere University of Technology. Department of Signal Processing, 2016.
inbook
A. Diment, T. Virtanen, M. Parviainen, R. Zelov and A. Glasman. "Noise-Robust Detection of Whispering in Telephone Calls Using Deep Neural Networks". IEEE. 2016.
inbook
S. I. Mimilakis, K. Drossos, T. Virtanen and G. Schuller. "Deep Neural Networks for Dynamic Range Compression in Mastering Applications". AES Audio Engineering Society. 2016.
conference
G. Parascandolo, H. Huttunen and T. Virtanen. "Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings". 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2016. pp. 6440-6444.
conference
M. Valenti, A. Diment, G. Parascandolo, S. Squartini and T. Virtanen. "DCASE 2016 Acoustic Scene Classification Using Convolutional Neural Networks". Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016). 2016.
article
T. Barker and T. Virtanen. "Blind Separation of Audio Mixtures Through Nonnegative Tensor Factorization of Modulation Spectrograms", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 24, 12, 2016, pp. 2377-2389.
article
J. Nikunen, A. Diment, T. Virtanen and M. Vilermo. "Binaural rendering of microphone array captures based on source separation", Speech Communication, Vol. 76. 2016, pp. 157-169.
conference
P. Pertilä and A. Brutti. "Increasing the environment-awareness of rake beamforming for directive acoustic sources". 15th International Workshop on Acoustic Signal Enhancement (IWAENC). 2016.
conference
G. Naithani, G. Parascandolo, T. Barker, N. H. Pontoppidan and T. Virtanen. "Low-Latency Sound Source Separation Using Deep Neural Networks". IEEE Global Conference on Signal and Information Processing. 2016.

2015

conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Noise Robust Speaker Recognition with Convolutive Sparse Coding". Proceedings of 16th Interspeech. 2015.
article
T. Virtanen, J. Gemmeke, B. Raj and P. Smaragdis. "Compositional Models for Audio Processing", IEEE Signal Processing Magazine, March, 2015.
article
K. Drossos, A. Floros and K. L. Kermanidis. "Evaluating the Impact of Sound Events’ Rhythm Characteristics to Listener’s Valence", Journal of the Audio Engineering Society, Vol. 63. 2015, pp. 139-153.
article
T. Virtanen, J. Gemmeke, B. Raj and P. Smaragdis. "Compositional Models for Audio Processing: Uncovering the structure of sound mixtures", IEEE Signal Processing Magazine, Vol. 32. 2015, pp. 125 - 144.
conference
D. Battaglino, A. Mesaros, L. Lepauloux, L. Pilati and N. Evans. "Acoustic context recognition for mobile devices using a reduced complexity SVM". European Signal Processing Conference (EUSIPCO-2015). 2015. pp. 534-538.
article
P. Pertilä and J. Nikunen. "Distant speech separation using predicted time–frequency masks from spatial features", Speech Communication, Vol. 68. 2015, pp. 97-106.
article
U. Simsekli, T. Virtanen and A. T. Cemgil. "Non-negative Tensor Factorization Models for Bayesian Audio Processing", Digital Signal Processing. 2015.
conference
E. Cakir, T. Heittola, H. Huttunen and T. Virtanen. "Multi-label vs. combined single-label sound event detection with deep neural networks". 23rd European Signal Processing Conference 2015 (EUSIPCO 2015). 2015.
conference
E. Cakir, T. Heittola, H. Huttunen and T. Virtanen. "Polyphonic sound event detection using multi label deep neural networks". International Joint Conference on Neural Networks 2015 (IJCNN 2015). 2015.
conference
A. Diment, E. Cakir, T. Heittola and T. Virtanen. "Automatic recognition of environmental sound events using all-pole group delay features". European Signal Processing Conference (EUSIPCO 2015). 2015.
conference
A. Mesaros, T. Heittola, O. Dikmen and T. Virtanen. "Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations". Proceedings of 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. pp. 151-155.
article
P. Pertilä and J. Nikunen. "Distant speech separation using predicted time-frequency masks from spatial features", Speech Communication, Vol. 68. 2015, pp. 97-106.
article
K. Drossos, A. Floros, A. Giannakoulopoulos and N. Kanellopoulos. "Investigating the Impact of Sound Angular Position on the Listener Affective State", IEEE Transactions on Affective Computing, Vol. 6, 1, 2015, pp. 27-42.
conference
D. Baby, J. Gemmeke, T. Virtanen and H. V. Hamme. "Exemplar-based speech enhancement for deep neural network based automatic speech recognition". ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2015. pp. 4485-4489.
article
D. Baby, T. Virtanen, J. Gemmeke and H. V. Hamme. "Coupled dictionaries for exemplar-based speech enhancement and automatic speech recognition", IEEE-ACM Transactions on Audio Speech and Language Processing, Vol. 23, 11, 2015, pp. 1788-1799.
article
E. Räsänen, O. Pulkkinen, T. Virtanen, M. Zollner and H. Hennig. "Fluctuations of Hi-Hat Timing and Dynamics in a Virtuoso Drum Track of a Popular Music Recording", PLoS ONE, Vol. 10. 2015.
conference
D. Baby, J. Gemmeke, T. Virtanen and H. V. Hamme. "Exemplar-based speech enhancement for deep neural network based automatic speech recognition". IEEE International Conference on Acoustics, Speech and Signal Processing. 2015.
conference
S. Drgas and T. Virtanen. "Speaker verification using adaptive dictionaries in non-negative spectrogram deconvolution". 12th International Conference on Latent Variable Analysis and Signal Separation. 2015.
conference
T. Barker, T. Virtanen and N. H. Pontoppidan. "Low-Latency Sound-Source-Separation using Non-Negative Matrix Factorisation with Coupled Analysis and Synthesis Dictionaries". ICASSP 2015. 2015.
conference
A. Diment and T. Virtanen. "Archetypal analysis for audio dictionary learning". IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2015.
conference
A. Hurmalainen, R. Saeidi and T. Virtanen. "Similarity Induced Group Sparsity for Non-negative Matrix Factorisation". Proceedings of 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. pp. 4425-4429.

2014

article
J. Nikunen and T. Virtanen. "Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation", IEEE/ACM Transactions on Audio, Speech & Language Processing, Vol. 22, March, 2014, pp. 727-739.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH2014, 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 1371-1375.
conference
T. Barker, T. Virtanen and O. Delhomme. "Ultrasound-Coupled Semi-Supervised Nonnegative Matrix Factorisation for Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014. 2014. pp. 2148-2152.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-Based Speech Enhancement". 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4-9 May 2014, Florence. 2014. pp. 2883 - 2887.
conference
G. Sanchez, H. Silén, J. Nurminen and M. Gabbouj. "Hierarchical modeling of F0 contours for voice conversion". INTERSPEECH 2014, Proceedings of the 15th Annual Conference of the International Speech Communication Association, 14-18 September 2014, Singapore. 2014. pp. 2318-2321.
conference
O. Gencoglu, T. Virtanen and H. Huttunen. "Recognition of Acoustic Events Using Deep Neural Networks". 2014.
conference
T. Barker and T. Virtanen. "Semi-supervised non-negative tensor factorisation of modulation spectrograms for monaural speech separation". Neural Networks (IJCNN), 2014 International Joint Conference on. 2014. pp. 3556-3561.
article
T. Heittola, A. Mesaros, D. Korpi, A. Eronen and T. Virtanen. "Method for creating location-specific audio textures", EURASIP Journal on Audio, Speech and Music Processing, Vol. 2014. 2014.
conference
T. Virtanen, B. Raj, J. Gemmeke and H. V. Hamme. "Active-set Newton algorithm for non-negative sparse coding of audio". In Proc. International Conference on Acoustics, Speech, and Signal Processing. 2014.
article
Z. Wu, T. Virtanen, E. S. Chng and H. Li. "Exemplar-based sparse representation with residual compensation for voice conversion", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22. 2014.
conference
D. Baby, T. Virtanen, T. Barker and H. V. Hamme. "Coupled Dictionary Training for Exemplar-based Speech Enhancement". International Conference on Acoustics, Speech, and Signal Processing. 2014.
conference
D. Baby, T. Virtanen, J. Gemmeke, T. Barker and H. V. Hamme. "Exemplar-based noise robust automatic speech recognition using modulation spectrogram features". IEEE Spoken Language Technology Workshop. 2014.
conference
T. Barker, H. V. Hamme and T. Virtanen. "Modelling Primitive Streaming of Simple Tone Sequences Through Factorisation of Modulation Pattern Tensors". INTERSPEECH 2014. 2014.
conference
J. Nikunen and T. Virtanen. "Multichannel audio separation by Direction of Arrival Based Spatial Covariance Model and Non-negative Matrix Factorization". Proceedings of 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2014. pp. 6727-6731.
conference
P. Pertilä and J. Nikunen. "Microphone Array Post-Filtering Using Supervised Machine Learning for Speech Enhancement". INTERSPEECH 2014 - 15th Annual Conference of the International Speech Communication Association. 2014.
conference
M. Parviainen, P. Pertilä and M. S. Hämäläinen. "Self-localization of Wireless Acoustic Sensors in Meeting Rooms". 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA). 2014.
incollection
A. Diment, P. Rajan, T. Heittola and T. Virtanen. "Group Delay Function from All-Pole Models for Musical Instrument Recognition". Aramaki et al eds. Springer International Publishing. 2014. pp. 606-618.

2013

conference
A. Hurmalainen and T. Virtanen. "Learning State Labels for Sparse Classification of Speech with Matrix Deconvolution". Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU). 2013.
article
P. Pertilä, M. S. Hämäläinen and M. Mieskolainen. "Passive temporal offset estimation of multichannel recordings of an ad-hoc microphone array", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, Nov., 2013, pp. 2393-2402.
conference
A. Diment, T. Heittola and T. Virtanen. "Semi-supervised Learning for Musical Instrument Recognition". 21st European Signal Processing Conference 2013 (EUSIPCO 2013). 2013.