AIM at ISMIR 2024
On 10-14 November 2024, several AIM researchers will participate in the 25th International Society for Music Information Retrieval Conference (ISMIR 2024). ISMIR is the leading conference in the field of music informatics, and is currently the top-cited publication for Music & Musicology (source: Google Scholar). This year ISMIR will take place onsite in San Francisco (CA, USA) and online.
As in previous years, the Centre for Digital Music will have a strong presence at ISMIR 2024.
In the Scientific Programme, the following papers are authored/co-authored by AIM members:
- Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning (Ilaria Manco, Justin Salamon, Oriol Nieto)
- Between the AI and Me: Analysing Listeners’ Perspectives on AI- and Human-Composed Progressive Metal Music (Pedro Sarmento, Jackson Loth, Mathieu Barthet)
- Can LLMs “Reason” in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation (Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo)
- ComposerX: Multi-Agent Music Generation with LLMs (Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo)
- Content-based Controls for Music Large-scale Language Modeling (Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang)
- Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models (Javier Nistal, Marco Pasini, Cyran Aouameur, Maarten Grachten, Stefan Lattner)
- Diff-MST: Differentiable Mixing Style Transfer (Soumya Sai Vanka, Christian J. Steinmetz, Jean-Baptiste Rolland, Joshua D. Reiss, George Fazekas)
- From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano (Huan Zhang, Jinhua Liang, Simon Dixon)
- GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model (Xavier Riley, Zixun Guo, Drew Edwards, Simon Dixon)
- I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition (Yannis Vasilakis, Rachel Bittner, Johan Pauwels)
- MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling (Drew Edwards, Xavier Riley, Pedro Sarmento, Simon Dixon)
- MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models (Benno Weck, Ilaria Manco, Emmanouil Benetos, Elio Quinton, George Fazekas, Dmitry Bogdanov) – Best Paper Nomination
- Music2Latent: Consistency Autoencoders for Latent Audio Compression (Marco Pasini, Stefan Lattner, George Fazekas)
- Semi-Supervised Contrastive Learning of Musical Representations (Julien Guinot, Elio Quinton, George Fazekas)
- SpecMaskGIT: Masked Generative Modelling of Audio Spectrogram for Efficient Audio Synthesis and Beyond (Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji)
- ST-ITO: Controlling audio effects for style transfer with inference-time optimization (Christian J. Steinmetz, Shubhr Singh, Marco Comunità, Ilias Ibnyahya, Shanxin Yuan, Emmanouil Benetos, Joshua D. Reiss) – Best Paper Nomination
The following Tutorial will be co-presented by AIM PhD student Ilaria Manco:
- Connecting Music Audio and Natural Language (Seung Heon Doh, Ilaria Manco, Zachary Novack, Jong Wook Kim and Ke Chen)
The following journal paper published in TISMIR will be presented at the conference:
- PiJAMA: Piano Jazz with Automatic MIDI Annotations (Drew Edwards, Simon Dixon, Emmanouil Benetos)
As part of the MIREX public evaluations, AIM PhD student Yixiao Zhang is task captain for the Music Description & Captioning task.
Finally, the following AIM members are organising Satellite Events:
- Elona Shatri as General Chair for WoRMS 2024
- Ilaria Manco as Organising Committee member for NLP4MUSA 2024
See you at ISMIR!