
When a new model drops with big claims, what do you do?
I've been a Claude maxi since the launch of Sonnet 3.5 last year.
Anthropic has shared a few updates since then, and so have Google and OpenAI.
Gemini 3.1 Pro launched last week, claiming to do things Claude’s Sonnet 4.6 still can't do.
So I wanted to see for myself what I've been missing. Turns out, not a lot.
Stick around till the end for the full prompt book. But first:
Spotlight: A 2-Day AI Engineering Mastermind
Before we get into the tests, something worth flagging if you're a professional working in tech or want to get into it:
The schedule is as follows:
Day 1: 27th Feb, 7:00–11:00 PM
Developer Productivity, AI Agents & AI Assisted Coding w/Cursor & Claude.
Day 2: 28th Feb, 10:00 AM–2:15 PM
Agentic AI, CrewAI & LangGraph Frameworks & Live Demo Of Real Agentic Workflows That Companies Use.
Day 2: 28th Feb, 3:00–6:00 PM
Orchestration, MCPs, Building Multi-Agent Systems & Case Studies.
Tools of the Week
1. Wideframe
AI agent that handles video work outside your NLE, organizing footage, searching clips, writing briefs, and building rough cuts. Works natively with Adobe Premiere Pro and DaVinci Resolve files.
2. Cozmo AI
Deploys multimodal AI employees for banks, insurers, and lenders that can handle calls, read documents, interpret videos, apply company policies, and update core systems like Salesforce and Guidewire.
3. Fern
A real-time AI meeting assistant that runs in two modes, Shadow Mode (invisible copilot giving you live prompts, context, and questions to ask) or Full Presence (visible participant that speaks, researches, and completes tasks on the spot).
Test 1: A standup set
Prompt:
Write a 1-minute stand-up…
Result: Low expectations going into it.

Gemini built a whole narrative around a dying ficus named Steve, escalated it to an HR tribunal, and landed the darker joke of the two.
Getting jealous of a true crime victim because "at least she got some fresh air" genuinely caught me off guard. Funnier on first read.
Sonnet skipped the storyline and went pure observational. Less structured, but it sounded more like a real person on stage.
Gemini's was more of a sketch, and Sonnet's was more stand-up.
Winner: Tie.
Test 2: A luxury landing page
Prompt:
Build a single-page landing page…
Result:
Gemini gave me a clean, professional and fully usable landing page.
Sonnet gave me a brand.
It got the colours right and understood the whole luxury experience, down to scroll animations and a footer tagline that read "Patience is the rarest ingredient."
One more quality-of-life observation: Gemini gave me raw code, which I had to save and open in a browser to see the result.
Sonnet rendered the finished page live in the chat.
Winner: Sonnet.
Test 3: Animate an SVG from scratch
Prompt A:
Create an SVG…
Result:
Gemini added a dark circular background and went more illustrative, while Sonnet kept it minimal with just the rocket.
Prompt B:
Now animate this…
Result:
Gemini added a hover bob, flickering flames, and twinkling stars, and everything is functional.
Hard to tell from a screenshot since the rocket is gently floating in place, but trust me, it's moving.
Sonnet went way more ambitious in the code. It had three-layer flames, glow filters, and exhaust particles, but the rocket flew off its own screen.
The whole thing was gone.
Prompt C:
Make the flame…
Result:
Gemini kept improving. It added spline-based flame morphing and a blue body glow, rendering it perfectly.
Sonnet's code got more sophisticated but the rocket never came back on screen.
Winner: Gemini.
Test 4: Roast IRCTC
I screenshotted the IRCTC homepage and fed it to both.
Prompt:
Be brutally honest. What's working, what's not, what would you change first? Don't be nice. Be useful.
Gemini zoomed out, thought about the page as a whole and gave structural fixes.
It called the Vande Bharat banner "a billboard, not a functional tool." Its feedback reads like a design critique you'd hand to someone doing a full redesign.
Sonnet zoomed in and went field by field, catching input-level issues that cause real user drop-offs. Its feedback reads more like a QA report you'd file as bug tickets.
Gemini cared about what the page looked like, and Sonnet cared about how it works.
Winner: Tie.
Test 5: Design a UPI Wrapped
Prompt:
Design a "Wrapped" year-in-review…
Result:
Gemini gave me a stats screen with a punchline, while Sonnet gave me a narrative.
Sonnet used "YOU SPENT." as a hero moment, with roast woven into every label, like "Zomato. Always Zomato." or "'Just browsing' you said". It was a running commentary of jokes.
Gemini’s neon green palette is more immediately striking, but Sonnet understood that Spotify Wrapped is about making you feel something before you see the numbers.
One more observation: Gemini lists raw amounts and Sonnet uses percentages with ranked context, which is exactly how Wrapped makes data feel like a story instead of an audit.
Winner: Sonnet.
My take
Going in, I was ready to be humbled.
Coming out, the score is Sonnet 2, Gemini 1, Ties 2.
What I noticed is that these two models have completely different orientations.
Gemini thinks like an engineer while Sonnet thinks like a creative.
That gap showed up in every single test.
In the IRCTC roast, the two models answered different questions. Neither is wrong; they just don't have the same instincts.
The only test Gemini won cleanly was the SVG animation, and that too was less about Gemini being better and more about Sonnet being ambitious to the point of breaking itself.
So am I switching? No.
But I'm adding Gemini to the rotation for anything that needs to stay on screen and keep working, like animations and complex components.
For everything else, Sonnet's still the one.
Hit reply: Run any of these prompts yourself and tell me which model won for you.
Until next time,
Vaibhav
If you read till here, you might find this interesting
#Partner 1
Meet America’s Newest $1B Unicorn
A US startup just hit a $1 billion private valuation, joining billion-dollar private companies like SpaceX, OpenAI, and ByteDance. Unlike those other unicorns, you can invest.
Over 40,000 people already have. So have industry giants like General Motors and POSCO.
Why all the interest? EnergyX’s patented tech can recover up to 3X more lithium than traditional methods. That's a big deal, as demand for lithium is expected to 5X current production levels by 2040. Today, they’re moving toward commercial production, tapping into 100,000+ acres of lithium deposits in Chile, a potential $1.1B annual revenue opportunity at projected market prices.
Right now, you can invest at this pivotal growth stage for $11/share. But only through February 26. Become an early-stage EnergyX shareholder before the deadline.
This is a paid advertisement for EnergyX Regulation A offering. Please read the offering circular at invest.energyx.com. Under Regulation A, a company may change its share price by up to 20% without requalifying the offering with the Securities and Exchange Commission.
#Partner 2
Smarter news. Fewer yawns
Business news takes itself way too seriously.
Morning Brew doesn’t.
Morning Brew delivers a smart, skimmable email newsletter on the day’s must-know business news — plus games that make sticking around a little more fun. Think crosswords, quizzes, and quick breaks that turn staying informed into something you actually look forward to.
Join over 4 million professionals reading Morning Brew for free. And walk away knowing more than you did five minutes ago.












