We Accidentally Built a Virtual Staging Company.
Here’s How That Happened.
It started, as most bad ideas do, in the middle of the night.
A photographer friend texted me a photo: an empty living room, no furniture, bare walls, a single overhead light casting the kind of shadow that makes rooms look like crime scenes in MLS listings. “My client won’t pay for staging. Can AI furnish this?”
The question seemed simple. It wasn’t.
Virtual staging — the practice of digitally furnishing an empty room before putting it on the market — has been around for about fifteen years. Photographers usually do it with Photoshop, and it costs anywhere from $50 to $200 per photo. A 3-bedroom apartment shoot comes back at $500 to $2,000 extra on top of the photography fee.
A lot of agents see the price and just say no, so the listing goes up with empty rooms. Buyers try to mentally furnish the space, fail, and scroll past. The property sits vacant and everyone blames the market. Somewhere in there, a photographer loses a repeat client.
My friend was looking for a quick hack, but there wasn’t one that worked. So I built one. A very ugly one.
I created Altitude with my co-founder for exactly this kind of thing: solving problems for real people. My first attempt was a Python script that called the Gemini API and told it to furnish a room. Two dozen lines. I ran it on one of his photos.
The result was… not bad? A couch appeared, maybe a coffee table materialized. The lighting felt a little AI-dream-sequence, but it was furniture, and it was in the room, and it had taken eleven seconds.
I texted it back. He responded with a screenshot of his client’s reply: “Can we do all 11 photos?”
And that’s when the trouble started.
Here’s something that sounds obvious in retrospect but took us embarrassingly long to understand: when you stage one room photo in isolation, you get furniture. When you stage eleven photos of the same apartment in isolation, you get eleven different apartments.
The living room has a blue sectional. The dining area, photographed from the kitchen angle, now has a cream loveseat facing the wrong direction. The hero shot has a floor lamp. The wide shot from the doorway has two floor lamps and a different plant entirely.
It looks like a furniture warehouse had a disagreement with itself.
We called this the “hallucination consistency problem,” and it’s why most AI staging tools that exist today produce images that feel just slightly… off. Every photo is technically staged. None of them feel like the same home.
Our fix was inelegant but it worked: generate a manifest first.
Before touching a single image, we make one call to Gemini — feeding it every photo in the shoot plus the client’s style brief — and ask it to produce a JSON document describing the furniture for the whole apartment. Palette, materials, the specific pieces, the zones. We store this as “the ground truth.”
Then, when we stage each photo, we show the model that manifest and say: these are the pieces that belong in this home. And crucially, after the first photo in any zone is staged, we feed that result as a visual reference to every subsequent angle. The model has literally seen what the sofa looks like. It doesn’t hallucinate a different one.
Boom, solved. Same sofa, same lamp, same rug — across every angle, every room.
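The manifest-then-reference flow can be sketched roughly like this. It is a minimal illustration, not the production pipeline: the manifest keys (`palette`, `zones`), the `Photo` class, and the `stage_fn` callable are all hypothetical names, and `stage_fn` stands in for whatever image-model call does the actual staging.

```python
import json
from dataclasses import dataclass


@dataclass
class Photo:
    filename: str
    zone: str  # e.g. "living_room"; assumed to be assigned before staging


def build_stage_prompt(manifest: dict, photo: Photo) -> str:
    """Compose the per-photo instruction from the shared manifest."""
    pieces = manifest["zones"].get(photo.zone, [])
    return (
        f"Stage this photo of the {photo.zone}. "
        f"Palette: {', '.join(manifest['palette'])}. "
        f"Use only these pieces: {json.dumps(pieces)}"
    )


def stage_shoot(manifest: dict, photos: list, stage_fn) -> dict:
    """Stage every photo, reusing the first result in each zone as a reference.

    stage_fn(prompt, photo, reference) is a placeholder for the model call.
    It receives None as the reference for the first angle of a zone.
    """
    first_result_by_zone = {}
    results = {}
    for photo in photos:
        reference = first_result_by_zone.get(photo.zone)
        staged = stage_fn(build_stage_prompt(manifest, photo), photo, reference)
        # Only the first staged angle of a zone becomes the visual reference.
        first_result_by_zone.setdefault(photo.zone, staged)
        results[photo.filename] = staged
    return results
```

Passing the model call in as a function keeps the consistency logic separate from any particular API, which also makes it easy to exercise with a fake `stage_fn` that just records what it was given.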
By this point I had a fully working command-line tool. You’d run `python3 main.py` in a folder full of JPEGs, wait a few minutes, and get a folder full of staged JPEGs. It had caching: if a step had already run, it would skip it. Running it twice cost nothing.
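One common way to get that skip-if-done behavior is to key each step's output on a hash of its inputs. The sketch below assumes that approach; the function names and the on-disk layout are illustrative, and the original script may have cached differently.

```python
import hashlib
import json
from pathlib import Path


def cache_key(step: str, input_bytes: bytes, params: dict) -> str:
    """Hash the step name, input bytes, and parameters into a stable key."""
    h = hashlib.sha256()
    h.update(step.encode("utf-8"))
    h.update(b"\0")  # separator so the step name cannot blur into the input
    h.update(input_bytes)
    h.update(b"\0")
    h.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


def run_cached(cache_dir: Path, step: str, input_bytes: bytes, params: dict, compute) -> bytes:
    """Return the cached output if present; otherwise call compute() and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{step}-{cache_key(step, input_bytes, params)}.bin"
    if path.exists():
        return path.read_bytes()  # step already ran with these exact inputs
    output = compute()
    path.write_bytes(output)
    return output
```

Because the key covers the parameters as well as the photo, changing the style brief triggers a fresh run while an unchanged rerun reads everything from disk.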
My friend was paying, maybe, $0.35 per full apartment run in AI API costs. His existing vendor charged $800. This math was great, but the product sucked.
Nobody wants to SSH into a machine and run a Python script to stage a listing. Photographers use Lightroom, Capture One, maybe a web app if you’re lucky. The CLI was a proof of concept for a workflow we hadn’t designed yet, so we started designing the workflow.
We talked to photographers doing volume real estate work. The people who shoot two, three, four listings a week, who have agent clients who expect fast turnaround, who are constantly juggling shoots and edits and delivery deadlines.
They all said they needed a power tool where you upload photos, assign them to rooms, describe what you want in each room, click a button, and come back to a ready-to-deliver set of staged images. And they wanted compliance.
Turns out, California passed AB 723 in 2025. It requires that any AI-generated or digitally altered real estate image carry a visible disclosure watermark and be accompanied by a permanent, publicly accessible URL to the original unedited photo.
So then we built a compliance pipeline that runs after every staged image is generated. It reads the MLS rules for your board — watermark text, position, size, opacity — applies them using Sharp (a Node.js image processing library), and then creates a public disclosure record. Every staged image gets a short URL that shows a side-by-side comparison of the original and staged version, timestamped and immutable.
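The data side of that pipeline can be sketched as follows. The real watermarking is done with Sharp in Node.js, so this Python version only illustrates the rule lookup and the disclosure record, not the pixel drawing. Every name here (`WatermarkRules`, `DisclosureRecord`, `watermark_anchor`, `make_record`) and the choice to derive the short ID from content hashes are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WatermarkRules:
    text: str
    position: str      # "bottom-left" or "bottom-right" in this sketch
    height_pct: float  # text height as a fraction of image height
    opacity: float     # 0.0 to 1.0


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class DisclosureRecord:
    short_id: str
    original_sha256: str
    staged_sha256: str
    watermark_text: str
    created_at: str


def watermark_anchor(rules: WatermarkRules, width: int, height: int, margin: int = 16):
    """Return (x, y, text_height) in pixels for the watermark's anchor corner."""
    text_height = max(1, round(height * rules.height_pct))
    y = height - margin  # one margin above the bottom edge
    x = margin if rules.position == "bottom-left" else width - margin
    return x, y, text_height


def make_record(original: bytes, staged: bytes, rules: WatermarkRules, now=None) -> DisclosureRecord:
    """Build the disclosure record for one original/staged pair."""
    original_hash = hashlib.sha256(original).hexdigest()
    staged_hash = hashlib.sha256(staged).hexdigest()
    # Same pair of images always yields the same short ID.
    short_id = hashlib.sha256((original_hash + staged_hash).encode("ascii")).hexdigest()[:10]
    created = (now or datetime.now(timezone.utc)).isoformat()
    return DisclosureRecord(short_id, original_hash, staged_hash, rules.text, created)
```

Storing hashes of both images alongside the timestamp means anyone holding the files can check that the public record still matches what was delivered.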
The batch workflow handles 90% of cases. You upload, you wait, you download. However, advanced users may want a little more control over the output. For that, we built a canvas — a node-based editor where each photo is a node and each generation is another node connected to it. You can chain them: stage a room, then feed the staged result back as a base for another generation with different instructions. You can pass a staged image from one node as a “reference” to another node in a completely different part of the canvas.
It’s the kind of tool that looks confusing in a screenshot and feels obvious after thirty seconds of use. The AI isn’t a magic button; it’s a collaborator you’re directing. The canvas makes that collaboration visible and reversible.
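Under the hood, a canvas like this is a dependency graph: a generation node can only run once its base image and any reference images exist. A rough sketch of that structure, with a hypothetical `Node` class and `run_order` helper that are not taken from the actual editor:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str
    kind: str                          # "photo" or "generation"
    base: Optional[str] = None         # node whose image is the starting point
    references: List[str] = field(default_factory=list)  # visual references
    instructions: str = ""


def run_order(nodes: Dict[str, Node]) -> List[str]:
    """Return node IDs so each node comes after its base and its references."""
    order: List[str] = []
    done = set()
    in_progress = set()

    def visit(node_id: str) -> None:
        if node_id in done:
            return
        if node_id in in_progress:
            raise ValueError(f"cycle detected at {node_id}")
        in_progress.add(node_id)
        node = nodes[node_id]
        deps = ([node.base] if node.base else []) + node.references
        for dep in deps:
            visit(dep)
        in_progress.discard(node_id)
        done.add(node_id)
        order.append(node_id)

    for node_id in nodes:
        visit(node_id)
    return order
```

Chaining is just a generation node whose `base` is another generation node, and a cross-canvas reference is an entry in `references`. Rejecting cycles up front avoids a node waiting on its own output.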
If you want to learn more about what we have built, visit the project website:
https://interior-staging-image-generation.vercel.app/
And if you want to learn more about our company, visit the Altitude website:
https://www.altitudedp.com/









