"AI estimating" is a phrase that has, fairly recently, become marketing slop. Half the products that claim it are wrapping ChatGPT around a system prompt; the other half are doing regression on historical job data. We do neither. Here's exactly what happens when a tech taps the camera icon on the TradeOS mobile app.
The input
The tech can give us up to three signals:
- Photos — anywhere from 1 to 8. We strongly recommend at least one wide shot and one close-up of the model number / nameplate.
- Voice note — hold to talk, release when done. Whisper transcribes shorter clips locally on the device and longer ones server-side.
- Text scope — optional, but useful for things the camera can't see (e.g. "second floor, no attic access").
The pipeline
The whole thing runs in five stages. Total wall time is usually 6–10 seconds; we stream partial results back to the truck so the tech sees the line items appearing live, not all at once at the end.
- 1. Vision pass. A vision-capable model looks at the photos and emits structured observations: equipment type, model identifiers, visible damage, accessibility notes. This is JSON, not prose — it's the building block, not the answer.
- 2. Speech transcription. Voice notes are transcribed in parallel with the vision pass. We then reconcile transcript and observations into a single normalized scope.
- 3. Materials lookup. The structured observations are matched against the org's materials library (parts, prices, vendor SKUs). Anything we can't confidently match falls back to a generic line item flagged for human review.
- 4. Labor estimation. Labor lines are estimated using the org's labor rates and a rules table maintained per trade (HVAC, plumbing, electrical). Not ML; the rules are inspectable in Settings → Estimating.
- 5. Draft assembly. A short scope-of-work paragraph is generated, line items are assembled into a draft, and the whole thing is handed back to the tech.
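The shape of the five stages can be sketched with an async generator: stages 1 and 2 run concurrently, and line items stream out as they are produced rather than in one batch. The stage functions here are stubs with invented return values; the real implementations call the vision model, Whisper, the materials library, and the labor rules table:

```python
import asyncio
from typing import AsyncIterator

# Hypothetical stage stubs with hard-coded outputs, for shape only.
async def vision_pass(photos) -> list[dict]:
    # Stage 1: structured observations (JSON, not prose).
    return [{"type": "water_heater", "model": "PROG40-38N"}]

async def transcribe(voice_note) -> str:
    # Stage 2: runs in parallel with the vision pass.
    return "replace 40-gallon gas unit" if voice_note else ""

def match_materials(observations) -> list[dict]:
    # Stage 3: lookup against the org's materials library.
    return [{"sku": "RHEEM-PROG40-38N", "desc": "40 gal gas water heater"}]

def estimate_labor(observations) -> list[dict]:
    # Stage 4: per-trade rules table, not ML.
    return [{"desc": "Water heater swap", "hours": 3.0}]

async def draft_estimate(photos, voice_note) -> AsyncIterator[dict]:
    # Stages 1 and 2 in parallel, then reconciled into one scope.
    observations, transcript = await asyncio.gather(
        vision_pass(photos), transcribe(voice_note)
    )
    # Stage 5: yield line items as they are assembled, so the
    # truck sees them appear live instead of all at once.
    for line in match_materials(observations) + estimate_labor(observations):
        yield line

async def main() -> list[dict]:
    return [line async for line in draft_estimate([b"photo"], b"audio")]
```

The async-generator boundary is what makes the "line items appearing live" behavior cheap: the client just consumes the stream.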
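Stage 3's "confidently match or fall back to a generic flagged line" behavior can be illustrated with simple string similarity. The library contents, threshold value, and matching method here are all assumptions for the sketch; the production matcher is more involved:

```python
from difflib import SequenceMatcher

# Toy materials library; real entries carry vendor SKUs and org pricing.
LIBRARY = {
    "PROG40-38N": {"sku": "RHEEM-PROG40-38N", "price": 689.00},
    "XR14": {"sku": "TRANE-XR14", "price": 3450.00},
}

MATCH_THRESHOLD = 0.8  # illustrative cutoff, not the production value

def lookup(model_id: str) -> dict:
    """Return the best library match, or a generic line flagged for review."""
    best_key, best_score = None, 0.0
    for key in LIBRARY:
        score = SequenceMatcher(None, model_id.upper(), key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None and best_score >= MATCH_THRESHOLD:
        return {**LIBRARY[best_key], "needs_review": False}
    # No confident match: generic line item, flagged for human review.
    return {"sku": None, "price": None, "needs_review": True, "raw": model_id}
```

The important property is the fallback path: an unrecognized part never silently gets a made-up price; it arrives in the draft flagged.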
Guardrails
- Drafts are never sent. Every AI draft enters as a Draft quote requiring an explicit Send tap. Always a human in the loop.
- Source-of-truth pricing. Materials prices come from the materials library, not from the model. The model doesn't get to invent a price for a Rheem PROG40-38N; we look it up.
- Confidence scoring. Each line item carries a confidence score. Anything below 0.7 is highlighted in the draft so the tech eyeballs it before sending.
- Cost guard. The whole pipeline is capped at $0.08/draft. We log per-org AI cost per month and surface it on the dashboard so it's never a surprise on your bill.
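The two numeric guardrails (the 0.7 confidence floor and the $0.08 cost cap) compose into one small check. This is a minimal sketch; the function name and line-item shape are invented for the example:

```python
CONFIDENCE_FLOOR = 0.7   # below this, the line is highlighted in the draft
COST_CAP_USD = 0.08      # hard per-draft ceiling on AI spend

def apply_guardrails(lines: list[dict], ai_cost_usd: float) -> list[dict]:
    """Flag low-confidence lines for review and enforce the per-draft cost cap."""
    if ai_cost_usd > COST_CAP_USD:
        # Fail the draft rather than silently overspend; the per-org
        # monthly total is also logged and surfaced on the dashboard.
        raise RuntimeError(f"draft exceeded cost cap: ${ai_cost_usd:.3f}")
    return [
        {**line, "highlight": line["confidence"] < CONFIDENCE_FLOOR}
        for line in lines
    ]
```

Highlighted lines still ship in the draft; the flag just forces the tech's eyeballs onto them before the Send tap.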
What it doesn't do
On purpose:
- We don't auto-send. Even if the draft looks perfect, it waits for the tech to tap send.
- We don't use prior customer data to "personalize" pricing. The same job prices the same materials and labor lines for any two customers.
- We don't pretend to diagnose. The vision pass identifies what's there, but the diagnosis ("you need a new compressor") still belongs to the tech.
That last point matters. The thing that makes a great quote isn't a fancy model. It's a tech who knows the job, can explain the work, and doesn't make the customer feel rushed. Our job is to remove the typing.