A computer-vision platform that turns a single store photo into an objective, scored verdict on retail execution — at the scale of thousands of outlets.
A global fashion brand relied on field and merchandising teams to check that in-store displays matched brand guidelines across thousands of outlets. Verification was done by eye, store by store, and the results were slow to gather, subjective, and impossible to compare across regions.
The brand needed an objective, repeatable way to measure how closely each store's execution matched the intended look — fast enough to act on, and consistent enough to trust across markets, languages, and lighting conditions. Manual audits couldn't scale, and self-reported checklists were unreliable.
We built a lightweight mobile app so field reps could photograph a fixture in seconds, with on-device guidance to keep framing and lighting consistent — the quality of the input determines the quality of the score.
Each photo is compared against a reference planogram image through a computer-vision pipeline that returns a similarity score and a clear pass/fail with a confidence level, rather than a vague yes/no.
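As a rough illustration of how a similarity score could become a pass/fail with a confidence level, here is a minimal sketch. The 0.8 threshold, the `Verdict` type, and the way confidence grows with distance from the threshold are all assumptions made for the example, not details of the production pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    passed: bool
    similarity: float
    confidence: float  # 0.5 = right on the threshold, 1.0 = as far from it as possible


def verdict_from_similarity(similarity: float, threshold: float = 0.8) -> Verdict:
    """Turn a 0..1 similarity score into pass/fail plus a confidence level.

    Hypothetical rule: confidence depends on how far the score sits from
    the threshold, scaled by the room available on that side of it.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")

    passed = similarity >= threshold
    if passed:
        margin = (similarity - threshold) / (1.0 - threshold)
    else:
        margin = (threshold - similarity) / threshold
    return Verdict(passed=passed, similarity=similarity, confidence=0.5 + 0.5 * margin)
```

A score that lands exactly on the threshold passes with a confidence of 0.5, which is the kind of borderline result a reviewer might want to look at by hand.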
We built the model-training and image-processing pipeline so that edge cases captured in the field feed back into retraining, steadily improving accuracy across the long tail of store layouts.
The data-collection side of this problem matters as much as the modelling side: if field reps can't reliably capture usable photos across thousands of outlets with varying lighting, layouts, and connectivity, the model never gets good input. The mobile capture app guides reps through a structured capture flow (which fixture, which angle, reference framing), works offline and queues uploads for low-connectivity locations, and runs basic on-device quality checks (blur, framing, lighting) before a photo is accepted. Rejecting a bad capture on the spot, while the rep can still retake it, is far cheaper than discovering it after upload.
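The blur and lighting checks described above could look something like the sketch below. It uses the variance of a Laplacian filter as a sharpness measure, which is a common heuristic and an assumption here rather than the app's confirmed method. The thresholds and the function names are hypothetical, and a real app would run the equivalent logic in native mobile code rather than Python.

```python
from statistics import mean, pvariance

Gray = list[list[int]]  # rows of 0..255 grayscale pixel values


def laplacian_variance(img: Gray) -> float:
    """Variance of a 4-neighbour Laplacian over interior pixels.

    Sharp images have strong edges, so the response varies a lot;
    blurry or featureless images give a variance near zero.
    """
    height, width = len(img), len(img[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def check_capture(
    img: Gray,
    min_sharpness: float = 100.0,  # assumed threshold, would need tuning
    brightness_range: tuple[float, float] = (40.0, 220.0),  # assumed range
) -> list[str]:
    """Return a list of problems; an empty list means the photo is accepted."""
    problems = []
    brightness = mean(pixel for row in img for pixel in row)
    if brightness < brightness_range[0]:
        problems.append("too dark")
    elif brightness > brightness_range[1]:
        problems.append("too bright")
    if laplacian_variance(img) < min_sharpness:
        problems.append("too blurry")
    return problems
```

Returning the list of problems rather than a bare boolean lets the capture screen tell the rep what to fix before retaking the photo.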
The core CV pipeline compares an uploaded photo against a reference image or planogram for that fixture type. It detects product presence, placement, facing direction, and shelf organisation, and identifies deviations from the reference. This isn't simple image-diffing: lighting, camera angle, and minor repositioning all change between photos, so the model has to ignore variation that doesn't reflect a real execution issue while still catching variation that does (missing products, wrong placement, planogram non-compliance). To get there, the model-training pipeline uses labelled real store photos drawn from the full variety of the brand's outlets.
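Once the model has mapped detected products to shelf positions, flagging deviations is a comparison against the planogram. The sketch below covers only that last step. The slot-to-SKU dictionaries, the `find_deviations` name, and the assumption that each SKU occupies exactly one slot are simplifications for illustration; the detection model that produces the input is out of scope here.

```python
def find_deviations(
    planogram: dict[str, str], detected: dict[str, str]
) -> dict[str, list[str]]:
    """Compare detected products against a planogram, slot by slot.

    Both arguments map a slot ID (e.g. "A1") to a SKU. Assumes each SKU
    appears in at most one planogram slot (hypothetical simplification).
    """
    detected_skus = set(detected.values())
    missing: list[str] = []
    misplaced: list[str] = []

    for slot, sku in planogram.items():
        if detected.get(slot) == sku:
            continue  # right product in the right place
        if sku in detected_skus:
            misplaced.append(sku)  # seen on the fixture, but in another slot
        else:
            missing.append(sku)  # not seen anywhere on the fixture

    # Products on the fixture that the planogram doesn't call for at all.
    unexpected = sorted(detected_skus - set(planogram.values()))
    return {"missing": missing, "misplaced": misplaced, "unexpected": unexpected}


planogram = {"A1": "tee-white", "A2": "tee-black", "B1": "jeans-slim"}
detected = {"A1": "tee-white", "B1": "tee-black", "B2": "scarf-red"}
deviations = find_deviations(planogram, detected)
```

In this example the black tee is on the fixture but in the wrong slot, the slim jeans are absent, and the red scarf is not part of the planogram.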
The output for a brand team isn't a raw list of CV detections. It's a compliance score per fixture, rolled up into per-outlet and per-region scores, with the specific deviations flagged (which products are missing or misplaced, which sections don't match the planogram). Dashboards let regional managers track execution scores over time and drill into the specific outlets or fixture types that are underperforming, turning what used to be sporadic manual store audits into a continuous, comparable, and scalable scoring system.
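The roll-up from fixture to outlet to region can be sketched as two rounds of averaging. The unweighted mean, the tuple layout, and the `roll_up` name are assumptions for the example; a real system might weight fixtures or outlets differently.

```python
from collections import defaultdict
from statistics import mean

# (region, outlet, fixture, score) with score in 0..1; hypothetical layout
FixtureScore = tuple[str, str, str, float]


def roll_up(
    fixture_scores: list[FixtureScore],
) -> tuple[dict[tuple[str, str], float], dict[str, float]]:
    """Average fixture scores per outlet, then outlet scores per region."""
    by_outlet: dict[tuple[str, str], list[float]] = defaultdict(list)
    for region, outlet, _fixture, score in fixture_scores:
        by_outlet[(region, outlet)].append(score)
    outlet_scores = {key: mean(scores) for key, scores in by_outlet.items()}

    # Averaging outlet scores (not raw fixture scores) means an outlet with
    # many fixtures does not dominate its region's number.
    by_region: dict[str, list[float]] = defaultdict(list)
    for (region, _outlet), score in outlet_scores.items():
        by_region[region].append(score)
    region_scores = {region: mean(scores) for region, scores in by_region.items()}

    return outlet_scores, region_scores


rows = [
    ("EMEA", "store-1", "window", 0.9),
    ("EMEA", "store-1", "wall", 0.7),
    ("EMEA", "store-2", "window", 0.5),
    ("APAC", "store-9", "window", 1.0),
]
outlet_scores, region_scores = roll_up(rows)
```

Keeping the outlet-level scores alongside the regional ones is what makes the drill-down possible: a low regional number can be traced back to the outlets and fixtures that caused it.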
Retail execution went from a slow, subjective audit to an objective measurement teams could act on. Field and marketing staff were freed from manual checks, regional performance became directly comparable, and the brand gained a consistent, data-backed view of how its stores actually looked.