On 17 July 2024, Nuno Correia (coordinator of MODINA) took part in the panel “Challenges and Opportunities for Music Creation” at the Ethical and Responsible AI Music Making Workshop, organised and hosted by University of the Arts London (UAL). The workshop is part of the Music Responsible AI project, led by UAL, in which Tallinn University (coordinator of MODINA) is also a partner.
Among other topics, Nuno presented work developed at Tallinn University (namely by William Primett) on sound and AI, in the university's role as mentor of two of the 2024 MODINA residencies: Temporal Spaces and SFDCANBAC++.
Temporal Spaces
Our approach: To re-synthesize digitally synthesized interactive audio with another instrument, or a combination of other instruments. Participants would first control the original audio through their movements, and the system would then play back the re-synthesized audio.
Web demo (online model explorer): https://rave-model-explorer-ux.glitch.me/
Training data and models:
- Source audio was generated with a virtual modular synthesizer (VCV Rack) patch, controlled by the participants' movements in real time
- Various pre-trained models were used and combined. The guitar timbre model was trained on approximately 6 hours of examples; other datasets include organ, magnetic resonator piano, VocalSet (singing voice), and bird sounds
- https://huggingface.co/Intelligent-Instruments-Lab/rave-models
- Neural audio synthesis was performed with RAVE (https://github.com/acids-ircam/RAVE), using pre-trained models from Otologic.jp
Tools/frameworks used: the nn~ object inside Max/MSP to load RAVE models
Installation video: https://www.youtube.com/watch?v=kY4Pjeno46w
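The encode/decode pattern behind this re-synthesis step can be sketched in Python. The `FakeRave` class below is a hypothetical stand-in for a real RAVE TorchScript export (a real model, e.g. from the Intelligent-Instruments-Lab collection linked above, would be loaded with `torch.jit.load` and exposes `encode()`/`decode()` in the same way); the shapes are illustrative, not those of any actual model.

```python
import torch

# Hypothetical stand-in for a RAVE TorchScript export. A real model
# would be loaded with: model = torch.jit.load("guitar.ts")
class FakeRave(torch.nn.Module):
    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # audio (batch, 1, samples) -> compact latent trajectory
        return x.reshape(1, 8, -1).mean(dim=-1, keepdim=True)

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        # latent trajectory -> audio in the model's timbre
        return z.repeat(1, 1, 256).reshape(1, 1, -1)

model = FakeRave()
x = torch.zeros(1, 1, 2048)   # one mono audio buffer
z = model.encode(x)           # movement-controlled audio -> latents
y = model.decode(z)           # re-synthesis with the target timbre
print(tuple(z.shape), tuple(y.shape))
```

In Max/MSP the same two steps are what the nn~ object performs in real time when given an exported RAVE model.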
SFDCANBAC++
Our approach: We used dimensionality reduction to map speech recordings from the two artists onto an interactive space that could be navigated to “remix” conversations and create musical textures.
Web Demo: https://wprimett.github.io/files/chat_visualizer/demo.html
Training data:
- Speech recordings taken from “Best Practices Chat”, a series of conversational videos in which the artists talk about their creative practice
- 860+ audio samples extracted from 72 minutes of audio.
- Best Practices Chat Playlist: https://www.youtube.com/watch?v=485p6Z1-hHo&list=PLVIdoREykT8Jk_vGehGzNbhzex9HshB5x
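How the 860+ samples were extracted from the 72 minutes of audio is not documented here; one plausible approach is a simple energy-based splitter that cuts the recording at silences, sketched below with NumPy on a toy signal (the function name, frame size, and threshold are all illustrative assumptions).

```python
import numpy as np

def split_on_silence(signal, frame=1024, thresh=0.01):
    """Crude energy-based splitter: returns (start, end) sample
    indices of non-silent segments. A sketch only; the residency's
    actual segmentation method is not documented."""
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    energy = np.sqrt((frames ** 2).mean(axis=1))
    voiced = energy > thresh
    segments, start = [], None
    for i, v in enumerate(voiced):
        if v and start is None:
            start = i                       # segment begins
        elif not v and start is not None:
            segments.append((start * frame, i * frame))
            start = None                    # segment ends
    if start is not None:
        segments.append((start * frame, n * frame))
    return segments

# Toy signal: silence, a short sine burst, silence
sr = 16000
sig = np.zeros(sr)
sig[4000:8000] = np.sin(np.linspace(0, 400 * np.pi, 4000))
segments = split_on_silence(sig)
print(segments)  # → [(3072, 8192)]
```

The segment boundaries snap to frame edges, so each detected span slightly over-covers the actual burst.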
Tools/frameworks used:
- Python scripts using the following algorithms:
  - UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
  - t-SNE: t-distributed Stochastic Neighbor Embedding
- Libraries: scikit-learn, umap-learn
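The dimensionality-reduction step can be sketched with scikit-learn's t-SNE. The random feature matrix below is a stand-in for per-clip audio features of the extracted speech samples (the actual features and parameters used in the residency are not documented here).

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for per-clip audio features (e.g. spectral descriptors
# of the extracted speech samples): 60 clips x 20 dimensions
features = rng.normal(size=(60, 20))

# Project to 2D: each point becomes one clip's position in the
# navigable "conversation remix" space
tsne = TSNE(n_components=2, perplexity=15, random_state=0)
embedding = tsne.fit_transform(features)
print(embedding.shape)  # (60, 2)
```

Swapping `TSNE` for `umap.UMAP` from the umap-learn library yields the same 2D layout workflow with the other algorithm listed above.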