AIM PhD candidate Yinghao Ma presented his work, titled “MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response”, at a workshop hosted by the British Machine Vision Association and Society for Pattern Recognition (BMVA) in London on April 24, 2024. MusiLingo is a system that combines a pre-trained music encoder with a pre-trained language model to improve music-text interaction. A projection layer maps the music embeddings into the language model's input space, so the model can generate text, such as captions or answers to questions, conditioned on the audio. The aim is to make music more interpretable for everyone, from composers to people who are hard of hearing. More information is available on the workshop page, and further details on MusiLingo can be found in the paper and on Hugging Face.
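As a rough illustration of the projection-layer idea described above, the sketch below maps a few music-embedding frames into a language model's embedding size and places them ahead of the text token embeddings. It is a toy under stated assumptions, not the MusiLingo implementation: the dimensions, the randomly initialised weights, the choice to prepend the music frames, and all function names are hypothetical, and a real system would learn the projection and use actual encoder and language-model embeddings.

```python
import random


def make_projection(in_dim, out_dim, seed=0):
    """Create a toy linear layer (weights and bias).

    Assumption: weights are random here; in a real system they are learned.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def project(frame, weights, bias):
    """Apply y = W x + b to one embedding frame."""
    return [sum(w * x for w, x in zip(row, frame)) + b
            for row, b in zip(weights, bias)]


def build_llm_input(music_frames, text_token_embeddings, weights, bias):
    """Project each music frame, then prepend the result to the text embeddings.

    Assumption: prepending is one plausible ordering, chosen for illustration.
    """
    projected = [project(frame, weights, bias) for frame in music_frames]
    return projected + text_token_embeddings


# Hypothetical sizes, far smaller than any real encoder or language model.
MUSIC_DIM = 4
LLM_DIM = 6

weights, bias = make_projection(MUSIC_DIM, LLM_DIM)

# Stand-ins for the outputs of a music encoder (3 frames) and a text embedder (2 tokens).
music_frames = [[0.1 * i + 0.01 * j for j in range(MUSIC_DIM)] for i in range(3)]
text_tokens = [[0.0] * LLM_DIM for _ in range(2)]

llm_input = build_llm_input(music_frames, text_tokens, weights, bias)
print(len(llm_input), len(llm_input[0]))  # prints "5 6"
```

After projection, the music frames and the text tokens share one embedding size, so the combined sequence can be passed to the language model like any other input sequence.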