Archives: 31st May 2024

AIM at AES Europe 2024

On 15-17 June, AIM PhD students will participate in the Audio Engineering Society's 2024 European Conference (AES Europe 2024) in Madrid. AES is a leading professional body in the field of audio engineering. The theme of this year's European conference is "Echoes of the Past Inspire the Sound of the Future". All papers will also be published in the Journal of the Audio Engineering Society (JAES).

AIM students will present the following papers at AES 2024:

You can find the full schedule at: https://aeseurope2024.sched.com

See you all there!


AIM PhD student participates in organising a DCASE challenge task on computational bioacoustics

AIM PhD student Shubhr Singh, together with other C4DM members, is helping organise the task on Few-shot Bioacoustic Event Detection as part of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2024).

This task addresses a real need of animal researchers by providing a well-defined, constrained, yet highly variable domain for evaluating machine learning methodology. It aims to advance audio signal processing and deep learning in low-resource scenarios, particularly domain adaptation and few-shot learning. Datasets will be released on 1 June 2024, and the challenge deadline is 15 June 2024.

Can you build a system that detects an animal sound from only five examples? Join us in pushing the boundaries of computational bioacoustics and machine listening!
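To make the five-example setting concrete, here is a minimal sketch of one common baseline for few-shot sound event detection: a prototypical-network-style detector that averages the embeddings of the five support clips into a prototype and flags query frames close to it. The encoder, feature shapes, and distance threshold below are illustrative placeholders, not the official challenge baseline.

```python
# Illustrative 5-shot event detection via a prototype of support embeddings.
# The encoder and threshold are hypothetical stand-ins, not the DCASE baseline.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """Toy frame-level encoder mapping mel-spectrogram frames to embeddings."""
    def __init__(self, n_mels=64, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_mels, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x):        # x: (frames, n_mels)
        return self.net(x)       # (frames, emb_dim)

def detect_events(encoder, support, query, threshold=1.0):
    """Score query frames by distance to the prototype of the support set.

    support: (5, frames, n_mels) -- the five annotated examples
    query:   (frames, n_mels)    -- the recording to search
    Returns a boolean mask over query frames predicted to contain the event.
    """
    with torch.no_grad():
        # Prototype = mean embedding over all support frames.
        proto = encoder(support.reshape(-1, support.shape[-1])).mean(dim=0)
        q = encoder(query)                        # (frames, emb_dim)
        dist = torch.cdist(q, proto[None])[:, 0]  # Euclidean distance per frame
    return dist < threshold                       # nearer = event present

# Toy usage with random features standing in for real mel spectrograms.
enc = FrameEncoder()
support = torch.randn(5, 100, 64)   # five labelled example clips
query = torch.randn(500, 64)        # unlabelled recording to search
mask = detect_events(enc, support, query)
print(f"{int(mask.sum())} of {len(mask)} frames flagged as events")
```

In practice, challenge entries refine this idea with stronger encoders, negative prototypes, and adaptive thresholds, but the prototype-and-distance structure above is the core of the few-shot formulation.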


AIM PhD candidate presents “MusiLingo” at workshop

AIM PhD candidate Yinghao Ma presented his work entitled "MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response" at a recent workshop hosted by the British Machine Vision Association and Society for Pattern Recognition in London on April 24, 2024. MusiLingo is a system merging pre-trained music encoders with language models to enhance music-text interaction. It aims to make music more interpretable for everyone, from composers to those who are hard of hearing, using a projection layer that integrates music embeddings into language models for effective text generation. More information about the workshop can be found here, and more information on MusiLingo is available in the paper and on Hugging Face.
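The projection-layer idea is the architectural core here: a learned map takes frozen music-encoder embeddings into the language model's token-embedding space, so the music acts as a prefix the LM can condition on when generating captions or answers. Below is a minimal sketch of that general pattern; the dimensions, module names, and random tensors are illustrative assumptions, not taken from the MusiLingo codebase.

```python
# Sketch of the general encoder-to-LM bridging pattern MusiLingo describes.
# All dimensions here are hypothetical examples.
import torch
import torch.nn as nn

class MusicToTextProjector(nn.Module):
    """Learned projection from music-embedding space to LM-embedding space."""
    def __init__(self, music_dim=768, lm_dim=4096):
        super().__init__()
        self.proj = nn.Linear(music_dim, lm_dim)

    def forward(self, music_emb):    # (batch, frames, music_dim)
        return self.proj(music_emb)  # (batch, frames, lm_dim)

# Toy usage: project frozen music features, then prepend them to the
# embedded text prompt before feeding the combined sequence to the LM.
projector = MusicToTextProjector()
music_emb = torch.randn(1, 250, 768)   # output of a frozen music encoder
prompt_emb = torch.randn(1, 12, 4096)  # embedded text-prompt tokens
lm_inputs = torch.cat([projector(music_emb), prompt_emb], dim=1)
print(lm_inputs.shape)                 # torch.Size([1, 262, 4096])
```

Only the projection needs to be trained in this kind of setup, which is what makes the approach lightweight relative to fine-tuning the encoder or the language model end to end.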