
Google’s Music Transformer can generate piano melodies that don’t sound half bad


Google’s song-composing artificial intelligence (AI) might not measure up to Mozart or Liszt anytime soon, but it has made impressive progress recently. In a blog post and accompanying paper (“Music Transformer”) this week, contributors to Project Magenta, a Google Brain project “exploring the role of machine learning as a tool in the creative process,” presented their work on Music Transformer, a machine learning model capable of generating relatively coherent tunes with recognizable repetition.

“The Transformer, a sequence model based on self-attention, has achieved compelling results in many generation tasks that require maintaining long-range coherence,” the paper’s authors write. “This suggests that self-attention might also be well-suited to modeling music.”

As the team explains, producing long pieces of music remains a challenge for AI because of music’s structural complexity; most songs contain multiple motifs, phrases, and repetitions that neural networks have a tough time picking up on. And while previous work has managed to capture some of the self-reference observable in works composed by humans, it has relied on absolute timing signals, making it poorly suited to keeping track of themes that are based on relative distances and recurring intervals.
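To make the absolute-versus-relative distinction concrete, here is a small illustrative sketch in Python. It is not taken from the paper: the motif values and the helper names `restate` and `relative_view` are made up for this example. It shows that a motif restated later and in a different key looks entirely different when described by absolute positions, yet identical when described by the gaps and intervals between its notes.

```python
# A motif as (onset_step, MIDI_pitch) pairs. The values are invented for illustration.
motif = [(0, 60), (2, 64), (4, 67), (6, 72)]


def restate(notes, delay, transpose):
    """Return the same notes starting `delay` steps later and `transpose` semitones higher."""
    return [(t + delay, p + transpose) for t, p in notes]


def relative_view(notes):
    """Describe each note by its distance from the previous one: (time gap, pitch interval)."""
    return [(t2 - t1, p2 - p1) for (t1, p1), (t2, p2) in zip(notes, notes[1:])]


# The same theme, 32 steps later and a fourth (5 semitones) higher.
later = restate(motif, delay=32, transpose=5)

print(motif == later)                                # False: absolute values all differ
print(relative_view(motif) == relative_view(later))  # True: the relative shape is unchanged
```

A model that only sees absolute positions has to learn the two occurrences separately, while one that reasons about relative distances can treat them as the same pattern.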

The team’s solution is Music Transformer, an “attention-based” neural network that creates “expressive” performances directly without first generating a score. By using an event-based representation and a technique known as relative attention, Music Transformer is able not only to focus more on relational features but also to generalize beyond the length of the training samples it’s supplied with. And because its implementation of relative attention is less memory-intensive than earlier ones, it’s also able to generate longer musical sequences.
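The idea behind relative attention can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the team’s implementation: the function name `relative_attention`, the `rel_emb` table (random numbers standing in for learned embeddings), and the choice to clip long distances to the last table entry are all assumptions made for this example, and the sketch computes the position term directly rather than reproducing the memory-saving formulation described in the paper.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relative_attention(q, k, v, rel_emb):
    """Causal self-attention with an added relative-position term.

    q, k, v: lists of n vectors, each of dimension d.
    rel_emb: rel_emb[r] is a d-dimensional vector for "r steps back"
             (hypothetical stand-in for learned embeddings). Distances
             beyond the table are clipped to its last entry (an assumption).
    Returns (outputs, attention_weights).
    """
    n, d = len(q), len(q[0])
    outputs, weights = [], []
    for i in range(n):
        logits = []
        for j in range(i + 1):  # causal: position i only looks at j <= i
            distance = min(i - j, len(rel_emb) - 1)
            content = dot(q[i], k[j])               # what the two events contain
            position = dot(q[i], rel_emb[distance])  # how far apart they are
            logits.append((content + position) / math.sqrt(d))
        w = softmax(logits)
        weights.append(w)
        outputs.append([sum(w[j] * v[j][c] for j in range(i + 1)) for c in range(d)])
    return outputs, weights


def rand_vec(d):
    return [random.gauss(0.0, 1.0) for _ in range(d)]


random.seed(0)
d, n = 4, 6
q = [rand_vec(d) for _ in range(n)]
k = [rand_vec(d) for _ in range(n)]
v = [rand_vec(d) for _ in range(n)]
rel_emb = [rand_vec(d) for _ in range(4)]  # distances 0..3, longer ones clipped

outputs, weights = relative_attention(q, k, v, rel_emb)
print(len(outputs), len(outputs[0]))
```

Because the position term depends only on how far apart two events are, not on where they sit in the sequence, the same embedding table can be applied to sequences longer than the ones used to build it, which is the intuition behind generalizing past the training length.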




In tests, when primed with Chopin’s Black Key Etude, Music Transformer produced a song that was consistent in style throughout and contained multiple phrases sourced from the motif. By contrast, when given the same primer, two previous algorithms (Performance RNN and Transformer) produced output that either lacked discernible structure entirely or failed to maintain it.

Here’s Music Transformer riffing on the above-mentioned Black Key Etude:

And here it is generating songs without a primer:

The team concedes that Music Transformer is far from perfect (it sometimes produces songs with too much repetition, sparse sections, and odd jumps), but they’re hopeful it can serve as a muse for musicians in need of inspiration.

“This opens up the potential for users to specify their own primer and use the model as a creative tool to explore a range of possible continuations,” the team wrote.

Code for training Music Transformer and generating music with it is forthcoming, they say, along with pre-trained checkpoints.