Google DeepMind dropped Lyria 3 two days ago.

It's their most advanced music generation model, and instead of keeping it locked behind an API, they plugged it straight into the Gemini app.

750 million people now have access to a music generator that creates 30-second tracks with vocals, auto-generated lyrics, and AI cover art from a text prompt, a photo, or a video clip.

I spent an embarrassing amount of time testing it. I was late to a meeting, but no regrets.

Before we jump in, let's catch up on AI this week.

TOOLS

1. Dokie AI: An AI presentation maker that generates complete slide decks from text, documents, or topic inputs with automatic layout and structure. More focused on traditional "ready-to-present" PowerPoint output than web-style tools like Gamma.

2. Lindy AI: An AI work assistant that replaces your executive assistant. It triages email, drafts replies in your voice, preps you for meetings, and sends follow-ups automatically. Integrates with 3,000+ apps and works through iMessage, Slack, or the web with no-code agent setup.

3. Scira AI: An open-source, minimalist AI search engine (formerly MiniPerplx) that breaks your query into sub-tasks, searches live sources, and returns cited answers. Supports 10+ LLMs including Grok, Claude, and Gemini, plus deep research, code execution, and Reddit/X search in one clean interface.

What's the Deal?

You open Gemini, type a prompt, select the “Create Music” feature and get a 30-second song back.

The track comes with vocals, lyrics (auto-generated or your own), and AI-generated cover art.

You can download it, share it, or remix it right inside the app.

It also accepts images. Lyria reads the vibe of the uploaded photo (the people, the setting, the mood) and turns it into music.

So far, no competitor has this feature.

30 seconds feels short, but Google knows what it's doing.

Short clips = lower copyright risk, less compute, and enough to hook you into generating five more. Which is exactly what happened to me.

The Tests

I ran 6 prompts through Lyria 3 to see what it can actually do.

Test 1: can it do Bollywood?

Prompt: "A filmy 90s Bollywood item number. Dholak, tabla, brass section, and a punchy bass synth. Female vocalist with a powerful, celebratory voice singing in Hindi. Fast tempo, high energy. Lyrics about owning the dance floor at a wedding."

Result: The output was genuinely fun. It was catchy and danceable, and it had all the right instruments.

Only nitpick: the beat leaned more "upbeat pop" than authentic 90s Bollywood. But that's on me: the prompt emphasised high energy, and it delivered high energy.

Fair enough.

Test 2: genre mashup that shouldn't work

Prompt: "K-pop meets Carnatic classical. Start with a veena intro over lo-fi beats, then transition into a high-energy K-pop drop with synth pads and 808s. Male vocalist switching between Korean-style rap verses and melodic Indian classical-inspired chorus. Tempo: 128 BPM."

Result: The K-pop elements were spot on, with crisp production and a tight beat.

The Indian classical bit was either missing entirely or so subtle it got buried under the synths.

This is where the 30-second limit probably hurts: there isn't enough time for a veena intro AND a K-pop drop AND Carnatic chorus transitions.

Worth noting that the K-pop production quality alone was impressive. Lyria seems to have strong training data there.

Test 3: image-to-music (the wild card)

This one was a two-parter.

Part A: I uploaded a team photo with no context or prompt.

Result: Lyria went literal.

It picked up on people standing side by side, cold-weather clothing, and an outdoor setting, and turned those visual elements into rhyming lyrics.

Part B: Same image but with a prompt attached: "Make it an intense startup grind anthem, lo-fi hip hop with a motivational spoken word overlay."

Result: Banger.

The contrast between the two outputs is honestly hilarious. Same photo but completely different tracks.

Takeaway: image-to-music without a text prompt is a party trick. Image + prompt is where it gets good.

Test 4: custom lyrics

Prompt: "Upbeat indie pop, acoustic guitar and hand claps, female vocalist with a warm tone. 110 BPM. Lyrics: Woke up to a hundred tabs open (open), coffee's cold but the deadlines are golden. Ship it now, fix it later (later), we're all just prompt engineers and creators."

Result: Nailed it.

The lyrics came through exactly as written, and the backing vocal echoes on "(open)" and "(later)" actually worked. DeepMind's prompt guide says to use parentheses for backing vocals, and the model follows through.

One thing: my lyrics weren't enough to fill 30 seconds, so Lyria sang them twice. That's not ideal, but it suggests the model sticks to custom lyrics rather than generating its own.

Test 5: the meme song

Prompt: "A dramatic orchestral trailer score that builds tension for 20 seconds, then drops into a goofy kazoo and ukulele melody. Lyrics: Monday morning, Slack is pinging, boss is typing, my soul is singing... into the void."

Result: Mildly underwhelming.

The instrumentals were solid and the orchestral build was legit dramatic, but the goofy drop I was hoping for didn't land. It stayed serious when it should have gone full absurd.

By this point I'd already generated 10+ tracks and was riding a high from Tests 3 and 4, so my expectations were elevated.

The kazoo betrayed me.

Test 6: the cat ballad (Google's own suggestion)

Google's blog literally suggested making "a ballad from your pet's POV", so I called their bluff.

Prompt: [uploaded a photo of a friend's cat] "A soulful R&B ballad from this cat's perspective. Deep male baritone vocals, slow tempo, piano and strings. The cat is singing about being ignored when the human works from home. Melancholic but with a hint of passive-aggressive sass."

Result: Cute.

The R&B elements carried hard, the piano and strings arrangement was beautiful, and the baritone vocals were rich.

It missed the passive-aggressive angle (turns out AI doesn't do subtle emotional nuance well in 30 seconds), but the overall track had a warmth to it that made it genuinely listenable.

My Take

I was late to a meeting because of this.

That doesn't happen with most AI launches anymore.

I try things, note them, and move on. This one had me generating clip after clip, remixing outputs, trying increasingly absurd prompts. I felt giggly and nervous at the same time.

Giggly because the outputs are fun and surprisingly competent. Nervous because music just got commoditised.

Suno still makes longer, arguably more creative tracks, while Lyria 3 caps out at 30 seconds and plays it safe on complex genre fusions.

So on pure output quality, this isn't a Suno killer.

But Suno has millions of users and Lyria 3 just landed inside an app with 750 million.

That distribution gap is the story.

The image-to-music feature is the sleeper hit. Nobody else does this.

Upload a holiday photo, a product shot, or your dog, and get a soundtrack back. It's a gimmick until you actually try it, and then you're uploading your entire camera roll.

At some point, I think AI music will hit the same fatigue wall as AI video.

People will learn to tell the difference and the novelty will fade, but right now it's a blast.

Hit reply with the weirdest track you generate. I want to hear what you come up with.

Until next time,
Vaibhav 🤝

PS: All cover art attached was generated with the music, adding a nice touch to the experience :)
