I thought I’d be building AI systems
When I joined a startup building AI generated images and videos, I expected to spend my time solving difficult generation problems.
Instead, I found myself building folders full of prompt templates.
experts/
ecommerce/
prompts.txt
jewellery/
prompts.txt
watches/
prompts.txt
fashion/
prompts.txt
To give credit where due, the prompts.txt is not a single file, but a structured set of files, which allows sharing of prompts, templating, and even versioning.
It worked well enough to ship products, but something felt fundamentally wrong.
Whenever I suggested evolving the architecture, the response was usually the same.
“We’re in a red ocean. We have to keep sailing.”
I understood the business pressure. Speed matters. But every week I became more convinced we were optimizing the wrong layer.
Different products fail for different reasons.
The first thing that bothered me was how differently models behaved across categories.
To humans, these prompts are obviously different.
To a diffusion model, they are simply distributions of pixels and language.
Sometimes that difference matters a lot.
I remember asking a model to generate a watch dangling from a pocket.
Instead of a pocket watch hanging from a chain, it generated something closer to a wall clock attached to clothing.
Interestingly, another model interpreted the original prompt correctly.
That was the moment something clicked.
The problem wasn’t just prompt engineering.
Different models understood different concepts with different levels of accuracy.
A prompt isn’t a strategy, if one is serious about making a business. Given the advent of agentic cli tools and cursor like editors, there is nothing preventing another person to replicate and refine prompts.
The more categories I looked at, the more obvious it became.
Even within watches, a luxury wrist watch and an antique pocket watch have different visual expectations.
Yet our pipeline treated every request as the same problem.
That isn’t really a generation strategy. It’s a wrapper.
The interesting work happens before generation. At some point I stopped asking,
“How do we write a better prompt?”
Instead I started asking,
“How do we decide what should happen before generation even begins?”
That completely changed how I thought about the system.
A user asking for an advertisement isn’t really asking for pixels. They’re describing intent.
Those decisions shouldn’t be hidden inside prompt text. They should exist as first class components of the system.
I started imagining a team of specialists. Instead of one giant generation pipeline, I started thinking about specialists.
Imagine a creative director instead of a prompt template.
The director receives a request. The first decision isn’t which prompt to use.
The first decision is which expert should handle the request.
Each expert:
Some categories might use a LoRA. Others might use a custom workflow. Others might rely on a different base model entirely.
The important part is that specialization becomes intentional instead of accidental.
The router becomes the product
People often say AI startups are just wrappers around foundation models.
They’re usually right.
But I don’t think the wrapper is the interesting part.
The interesting part is everything between user intent and the model call.
A good system might
Over time, that decision layer becomes smarter. Not because the frontier models changed, because your system learned which decisions produce better outcomes.
That is much harder to copy than a collection of prompts.
This is where LoRA adapters become useful. LoRA is one piece, not the destination
But because some domains genuinely benefit from specialization.
Imagine training adapters only on high quality examples for a specific category.
Now the router isn’t just choosing prompts. It’s choosing expertise.
That expertise can evolve independently as new data arrives.
Data becomes your advantage
Every generation tells you something.
Which adapter? and parameters?
Over time you stop collecting prompts. You start collecting decisions and outcomes.
That feedback loop becomes increasingly difficult for competitors to replicate because it reflects how your users actually create content.
When I first joined, I thought prompt engineering would be the interesting part of the job.
It wasn’t.
The interesting part was realizing that prompts were only the visible surface of a much larger system.
The companies that survive won’t win because they discovered the perfect prompt.
They’ll win because they build better decision systems.
Foundation models will continue improving.
The real question is no longer which model you call.
It’s how intelligently you decide to call it.