I make playlists for everything. Portraits, golden hour walks, moody rainy-day street sessions. Music has always been part of how I think about visual work. So when I stumbled across Serge Ramelli doing something I’d never seen before, combining AI lyrics, AI music composition, and AI video generation into a single finished music video, I stopped whatever I was editing and watched the whole thing twice.
This isn’t a photography tutorial in the traditional sense. No histogram talk, no lens recommendations. But if you’re a visual creator trying to understand where AI tools are actually useful right now (not in theory, but in practice), this is one of the most honest demonstrations I’ve seen.
Why This Video Hit Differently Than Most AI Content
Most AI content I see falls into one of two categories. Either it’s someone breathlessly hyping tools without showing real output, or it’s someone dismissing AI entirely because the results look “off.” Serge Ramelli’s video on his song “You Are Not Your Body” is neither of those things.
He made a complete music video. Lyrics, melody, visual clips, edit. The whole pipeline. And he walked through every decision he made along the way, including the parts that took hours and the parts that surprised him. That honesty is what made me pay attention.
The Workflow, Step by Step
Here’s how Ramelli built the project from scratch.
- Write the core idea yourself. He started with his own basic lyrics about reincarnation, a theme he finds personally fascinating. The human concept came first.
- Let ChatGPT refine the rhythm. He didn’t hand the whole thing to AI and walk away. He wrote the seed, then used ChatGPT to improve the lyrical flow and meter. The collaboration kept the meaning intact while tightening the craft.
- Generate the music with Suno. Suno is an AI music tool that takes lyrics and a style prompt and produces a full song. Ramelli asked for something in the style of Robbie Williams, and the result is genuinely listenable pop with real emotional range. If you’ve never used Suno, the output can be startling in the best way.
- Generate video clips with Veo 3. This is where most of the time went. Ramelli used Google’s Veo 3 (specifically Model 2, which costs 10 credits per clip without audio) to generate hundreds of short video clips. He wrote visual prompts for each one, reviewed the outputs, kept what worked, and discarded the rest. He was clear that this part took many hours. There’s no shortcut here. You generate a lot to find the few clips that actually serve the vision.
- Edit everything in DaVinci Resolve. Once he had his selected clips, he brought them into DaVinci Resolve and cut a full music video. The editing is where the human creative judgment takes over completely. Pacing, transitions, which image lands on which beat. That part is still entirely yours.
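If you want a feel for what "generate a lot, keep a few" costs before you start, a back-of-the-envelope calculation helps. Here is a minimal Python sketch. The only number taken from the video is the 10 credits per clip; the clip counts (300 generated, 40 kept) are placeholders I made up for illustration, not figures Ramelli gave.

```python
# Rough credit budget for a generate-many, keep-few video workflow.
# CREDITS_PER_CLIP is the figure quoted in the video; everything else
# here (function name, clip counts) is a hypothetical illustration.
CREDITS_PER_CLIP = 10


def credit_budget(clips_generated: int, clips_kept: int) -> dict:
    """Return total credits spent and the effective cost of each kept clip."""
    if clips_generated <= 0:
        raise ValueError("clips_generated must be positive")
    if not 0 < clips_kept <= clips_generated:
        raise ValueError("clips_kept must be between 1 and clips_generated")

    total = clips_generated * CREDITS_PER_CLIP
    return {
        "total_credits": total,
        "credits_per_kept_clip": total / clips_kept,
        "keep_rate": clips_kept / clips_generated,
    }


# Placeholder numbers: 300 clips generated, 40 used in the final edit.
budget = credit_budget(clips_generated=300, clips_kept=40)
print(budget)
```

The number worth watching is the cost per kept clip, not the cost per generated clip. The lower your keep rate, the more each usable clip really costs you.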
What Photographers Specifically Can Take From This
Here’s the thing I kept thinking about while watching. Ramelli is a photographer first. His entire career is built on visual storytelling and teaching others to see. And he applied that same eye to a medium he’d never worked in before.
His experience reading light, composing frames, and understanding what makes an image feel right is exactly what guided his Veo 3 prompts. He wasn’t just typing random descriptions. He was writing prompts the way a photographer thinks about a shot. That specificity is what separated his selected clips from the hundreds he discarded.
If you’ve ever written a detailed location scout note or briefed a client on a visual direction, you already know how to write a good AI video prompt. The skill transfers.
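One way to make that transfer concrete is to treat each prompt like a shot brief with the same fields you would note on a scout. The sketch below is my own illustration in Python, not Ramelli's prompts and not any tool's API. The class name, field names, and example values are all hypothetical, and the output is plain text you would paste into whatever generator you use.

```python
from dataclasses import dataclass


@dataclass
class ShotBrief:
    """A hypothetical shot brief; the fields mirror a photographer's scout notes."""
    subject: str
    light: str
    composition: str
    mood: str
    camera_move: str = "static"

    def to_prompt(self) -> str:
        # Join the fields into a single comma-separated text prompt.
        parts = [
            self.subject,
            f"{self.light} light",
            f"{self.composition} composition",
            f"{self.mood} mood",
            f"{self.camera_move} camera",
        ]
        return ", ".join(parts)


brief = ShotBrief(
    subject="a woman walking along a foggy pier",
    light="soft backlit dawn",
    composition="wide rule-of-thirds",
    mood="quiet reflective",
)
print(brief.to_prompt())
```

Filling in every field forces the same decisions you would make on location, and it makes it easy to change one variable at a time (only the light, say) when you compare generations.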
Where I’d Push Back Slightly
I’ve been experimenting with AI video tools for a few months now, mostly for client pitch decks and personal projects. And I’ll say this honestly: the gap between “impressive output” and “output that serves a specific story” is still significant. Ramelli generated hundreds of clips to find the ones he loved. That’s not a flaw in his process; it’s just the reality of the tool right now.
For photographers used to getting a decisive moment in a single frame, the volume-based nature of AI video generation can feel counterintuitive. You’re not hunting for one perfect shot. You’re sifting through a lot of good-enough until something genuinely resonates. That’s a different creative muscle, and it’s worth knowing before you dive in.
Where I’d extend his approach is in using reference images alongside text prompts in tools that support it. If you have a strong portfolio of your own work, feeding visual references can dramatically narrow the gap between what you imagine and what the AI produces.
The Takeaway Worth Keeping
The most useful thing Ramelli demonstrates isn’t any single tool. It’s that the creative director role belongs to you, and AI handles the execution of specific tasks within a larger vision you’ve already formed. The song exists because he cared about reincarnation. The video exists because he knew what feeling he wanted viewers to have. Every tool was in service of that.
Watch the full video to see the actual clips and hear the finished song. Seeing the output alongside the process description is what makes the whole thing click.
Watch Serge Ramelli’s full tutorial and music video on YouTube