AI Integration

From Claude Vision to coffee leaf rust: AI in the field at 26,000 plants

A coffee estate with 26,000 plants cannot afford to miss leaf rust early. Here is how Claude Vision became the first-pass agronomist on our estate — and the production architecture behind it.

July 2026 · 8 min read

Coffee leaf rust does not wait for a scheduled agronomist visit. It moves block to block in days, carried by wind and rain. By the time a human walks the affected rows, the spray window is often closed and the damage is already economic. On an estate of 26,000 plants, the detection problem is not a question of skill — it is a question of coverage and speed. We built a Claude Vision pipeline that changes both.

The problem: an agronomist visit costs $200

A qualified agronomist in East Africa charges roughly $200 per site visit, covering a 2-hour walk and a written report. On a 50-acre estate with 9 distinct blocks, full expert coverage during peak rust season would require multiple visits per week. The economics do not work. Most smallholder and mid-scale coffee operations end up with one monthly agronomist visit and a lot of hope.

World Coffee Research estimates that coffee leaf rust (Hemileia vastatrix) causes yield losses of 30 to 70 percent in untreated crops. The fungus spreads fastest under the warm, wet conditions common to highland Kenya during October and April. Early detection, within the first 72 hours of visible symptoms, dramatically narrows the treatment area and cuts chemical costs.

The field worker already walks every block on a routine schedule. The problem is that their eye is not trained to catch early-stage rust, and even if it were, they cannot document and escalate fast enough. They need a tool that gives them a diagnosis in the field, before they move to the next row.

Why Claude Vision specifically

We tested GPT-4V and Gemini Vision alongside Claude Sonnet before committing to an architecture. The evaluation ran 200 labeled leaf photos across four disease categories. All three models had comparable raw accuracy on clear images. Claude Sonnet separated itself on two things: structured-output reliability and latency consistency.

Anthropic's tool_use API means the diagnosis comes back as a typed object, not a paragraph we have to parse. With GPT-4V, even with detailed system prompts, we saw occasional free-text responses that fell outside our expected format under edge conditions. With Claude, forcing a tool call puts the schema into the request contract: the response arrives as a tool_use block whose input follows the declared JSON schema. We still validate every response on the server, but a malformed one is a model error to log and route to review, not a parsing problem for our code to absorb.
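A minimal sketch of what such a tool definition could look like, assuming the Anthropic Messages API tool shape (`name`, `description`, `input_schema`) and a forced `tool_choice`. The tool name `report_diagnosis` and every enum value other than `coffee_leaf_rust` and `apply_treatment` are illustrative assumptions, not the production schema.

```typescript
// Hypothetical tool definition. Field names mirror the diagnosis object shown
// in this article; the tool name and most enum values are assumptions.
const reportDiagnosisTool = {
  name: "report_diagnosis",
  description:
    "Report a structured diagnosis for one photographed coffee leaf. " +
    "confidence is a number between 0 and 1.",
  input_schema: {
    type: "object",
    properties: {
      threat: {
        type: "string",
        enum: [
          "coffee_leaf_rust", // from the article
          "coffee_berry_borer", // assumed identifier
          "antestia_bug", // assumed identifier
          "general_stress", // assumed identifier
          "healthy", // assumed identifier
        ],
      },
      confidence: { type: "number" },
      recommendedAction: {
        type: "string",
        enum: ["apply_treatment", "monitor", "no_action"], // last two assumed
      },
      affectedArea: { type: "string" },
      severity: { type: "string" },
      reasoning: { type: "string" },
      flagForHumanReview: { type: "boolean" },
    },
    required: [
      "threat",
      "confidence",
      "recommendedAction",
      "reasoning",
      "flagForHumanReview",
    ],
  },
} as const;

// Forcing this tool asks the model to answer with a tool_use block for it
// instead of free text.
const toolChoice = { type: "tool", name: reportDiagnosisTool.name } as const;
```

In a request these would be passed as `tools: [reportDiagnosisTool]` and `tool_choice: toolChoice`. Fields such as `capturedAt`, `blockId`, and `gpsCoordinates` are left out of the sketch on the assumption that the server attaches them from the upload request and does not ask the model for them.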

End-to-end latency from photo capture to diagnosis display in the React Native app runs under 4 seconds on a standard Kenyan mobile connection. That is within the acceptable range for a field worker who is pausing between rows. It is not real-time, but it is fast enough.

The full Claude Vision capability set is documented in the Anthropic Vision API reference. For narrow visual classification tasks with a defined output schema, it is production-ready today.

The architecture: photo to diagnosis to action

The pipeline is deliberately shallow. We were not building a research system — we were building something a field worker with a mid-range Android phone could use without training.

1. A field worker photographs a leaf during their routine block walk using the MkulimaOS React Native app.
2. The image uploads directly to Vercel Blob. A signed URL is returned to the mobile client in under 1 second.
3. The mobile client calls /api/scout/diagnose with the signed image URL, GPS coordinates, block ID, and worker ID.
4. The server-side handler calls Claude Sonnet with the image URL, a system prompt containing the disease knowledge base, and a tool_use definition that forces the structured output format.
5. The response is validated against our Zod schema. If validation passes and confidence meets the threshold, the result is written to PostgreSQL and surfaced on the estate manager's dashboard. If not, it goes to the human agronomist review queue.
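The server-side part of these steps can be sketched as a single handler. This is an illustration under stated assumptions, not the production code: every name here (`handleDiagnose`, `Deps`, `ScoutRequest`, and so on) is hypothetical, and the model call, validation, and storage are injected as functions so the control flow can be read without network access.

```typescript
// All names are hypothetical. The Claude request, the Zod parse, and the
// PostgreSQL write are represented by injected functions.
interface ScoutRequest {
  imageUrl: string;
  gps: { lat: number; lng: number };
  blockId: string;
  workerId: string;
}

interface Diagnosis {
  threat: string;
  confidence: number;
  recommendedAction: string;
}

interface Deps {
  callModel: (imageUrl: string) => Promise<unknown>; // wraps the Claude call
  validate: (raw: unknown) => Diagnosis | null; // stands in for the Zod parse
  saveDiagnosis: (req: ScoutRequest, d: Diagnosis) => Promise<void>;
  enqueueReview: (req: ScoutRequest, raw: unknown) => Promise<void>;
}

const CONFIDENCE_THRESHOLD = 0.85; // threshold stated in the article

type Outcome = "dashboard" | "review_queue";

async function handleDiagnose(req: ScoutRequest, deps: Deps): Promise<Outcome> {
  const raw = await deps.callModel(req.imageUrl);
  const diagnosis = deps.validate(raw);

  // Valid and confident: persist and surface on the dashboard.
  if (diagnosis !== null && diagnosis.confidence >= CONFIDENCE_THRESHOLD) {
    await deps.saveDiagnosis(req, diagnosis);
    return "dashboard";
  }

  // Invalid or low-confidence: hand the raw response to the agronomist queue.
  await deps.enqueueReview(req, raw);
  return "review_queue";
}
```

Passing the raw response to the review queue, instead of discarding it, is a design choice of this sketch: it lets the agronomist see what the model actually returned when validation fails.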

The diagnosis response object looks like this:

{
  "threat": "coffee_leaf_rust",
  "confidence": 0.91,
  "recommendedAction": "apply_treatment",
  "affectedArea": "lower_canopy",
  "severity": "early_stage",
  "reasoning": "Orange-yellow pustules visible on abaxial surface of 3 leaves. Pattern consistent with CLR stage 2. No secondary infection markers present.",
  "flagForHumanReview": false,
  "capturedAt": "2026-04-14T09:22:11Z",
  "blockId": "block-04",
  "gpsCoordinates": { "lat": 0.3742, "lng": 35.1201 }
}
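The production system validates this object with Zod. To keep the example dependency-free, the sketch below uses a hand-written type guard instead, which is a swapped-in technique and not the actual schema. The interface is inferred from the example response above. The article does not list the allowed values for fields such as `severity` or `affectedArea`, so the guard checks only types, the 0 to 1 confidence range, and that the timestamp parses.

```typescript
// Shape inferred from the example response; not the production Zod schema.
interface DiagnosisResponse {
  threat: string;
  confidence: number;
  recommendedAction: string;
  affectedArea: string;
  severity: string;
  reasoning: string;
  flagForHumanReview: boolean;
  capturedAt: string; // ISO 8601 timestamp
  blockId: string;
  gpsCoordinates: { lat: number; lng: number };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isDiagnosisResponse(v: unknown): v is DiagnosisResponse {
  if (!isRecord(v)) return false;

  const stringFields = [
    "threat", "recommendedAction", "affectedArea",
    "severity", "reasoning", "capturedAt", "blockId",
  ];
  if (!stringFields.every((k) => typeof v[k] === "string")) return false;

  // Confidence must be a number in [0, 1]; NaN fails both comparisons.
  const c = v.confidence;
  if (typeof c !== "number" || !(c >= 0 && c <= 1)) return false;

  if (typeof v.flagForHumanReview !== "boolean") return false;

  // The timestamp must be parseable as a date.
  const ts = v.capturedAt;
  if (typeof ts !== "string" || isNaN(Date.parse(ts))) return false;

  const gps = v.gpsCoordinates;
  return isRecord(gps) && typeof gps.lat === "number" && typeof gps.lng === "number";
}
```

A response that fails this check would take the same path as a low-confidence one and go to the human review queue.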

The estate manager sees the diagnosis on their dashboard within seconds. If recommendedAction is apply_treatment, the farm operations system opens a treatment task automatically. No manual data entry, no phone calls, no waiting for the weekly report.

What Claude catches (and what it does not)

We are specific about the model's capabilities because overstating them would be more dangerous than the disease.

Catches reliably: coffee leaf rust at stages 1 through 3, coffee berry borer damage on ripe and semi-ripe cherries, antestia bug scarring, and general stress indicators including chlorosis and water-deficit symptoms. These are the four highest-frequency field problems on the estates we work with.

Does not catch reliably: novel or rare disease patterns not well-represented in the training distribution, overlapping multi-disease presentations where two conditions share visual space, and any photo where lighting, angle, or focus is poor. The model also does not assess root health, soil conditions, or systemic issues invisible from leaf surface alone.

Our confidence threshold is 0.85. Any diagnosis below that score sets flagForHumanReview: true and routes to the agronomist queue. The human is the authority. Claude is the fast first pass that handles the 70% of cases that are clear-cut, freeing the agronomist to focus on the 30% that genuinely need expert judgment.
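The threshold rule can be sketched in a few lines. The function names are hypothetical. The sketch also makes one assumption the article does not state: a review flag that is already set is never cleared, even when confidence is high.

```typescript
const CONFIDENCE_THRESHOLD = 0.85; // value stated in the article

interface ScoredDiagnosis {
  confidence: number;
  flagForHumanReview: boolean;
}

// Returns a copy with the flag forced on when confidence is below the
// threshold. Assumption: an existing flag is kept as-is.
function applyReviewPolicy<T extends ScoredDiagnosis>(d: T): T {
  const flag = d.flagForHumanReview || d.confidence < CONFIDENCE_THRESHOLD;
  return { ...d, flagForHumanReview: flag };
}

// Fraction of a batch that ends up in the agronomist queue.
function reviewShare(batch: ScoredDiagnosis[]): number {
  if (batch.length === 0) return 0;
  const flagged = batch.filter((d) => applyReviewPolicy(d).flagForHumanReview);
  return flagged.length / batch.length;
}
```

Tracking `reviewShare` over time is one way to check whether the split between clear-cut and escalated cases stays close to the 70/30 split described above.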

We did not build a system that replaces the agronomist. We built one that tells them exactly where to walk.

Cost economics

A Claude Sonnet call with a high-resolution leaf image costs approximately $0.012, accounting for image input tokens and the structured output. On a 50-acre estate running daily block walks across 9 blocks, that is roughly $3.24 per day for continuous scouting coverage, a figure that corresponds to about 270 photos a day, or 30 per block. Over a month it comes to less than $100.

A single agronomist visit costs $200. Peak rust season runs 8 to 10 weeks. Full-coverage weekly visits would cost $1,600 to $2,000 per season. With the Claude Vision system, the agronomist visits only when flagged by the model — typically 2 to 3 times per season on estates where the system is detecting early. That is $400 to $600 in agronomist fees plus under $300 in API costs, against a potential yield loss of tens of thousands of dollars if rust goes undetected for two weeks.
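The arithmetic behind these figures can be reproduced with a short script. It uses only the numbers quoted in this section, and nothing in it is measured. The helper names are hypothetical.

```typescript
// Figures quoted in this article.
const COST_PER_CALL_USD = 0.012;
const DAILY_API_COST_USD = 3.24;
const BLOCKS = 9;
const VISIT_COST_USD = 200;

// Round to cents to avoid floating-point noise in the totals.
const round2 = (x: number): number => Math.round(x * 100) / 100;

// Scouting volume implied by the daily figure.
const callsPerDay = Math.round(DAILY_API_COST_USD / COST_PER_CALL_USD); // 270
const callsPerBlock = callsPerDay / BLOCKS; // 30

// Compares one agronomist visit per week against model-flagged visits
// plus API spend, for a season of the given length.
function seasonCosts(weeks: number, flaggedVisits: number) {
  const api = round2(DAILY_API_COST_USD * 7 * weeks);
  const weeklyVisits = VISIT_COST_USD * weeks;
  const withScouting = round2(VISIT_COST_USD * flaggedVisits + api);
  return { api, weeklyVisits, withScouting };
}

const shortSeason = seasonCosts(8, 2); // api 181.44, weekly 1600, scouting 581.44
const longSeason = seasonCosts(10, 3); // api 226.8, weekly 2000, scouting 826.8
```

On these numbers the API spend for a full peak season lands between about $181 and $227, which is where the "under $300" bound comes from.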

Early detection also means smaller treatment areas. Spraying a single block costs a fraction of spraying the whole estate. On estates that ran the system through the April 2026 season, treatment costs dropped by roughly 40% compared to the prior year's reactive spray schedules.

What we are building next

The current system works from visual data alone. The next version combines leaf photos with soil-sensor readings (pH, moisture, nitrogen) and localized weather data to improve confidence on borderline cases. When a disease pattern looks consistent with rust but confidence sits at 0.78, knowing that the block experienced 3 days of above-average humidity pushes that to a more actionable number.

We are rolling out to two additional East African estates in Q3 2026, which will give us a broader disease distribution to evaluate the model against. Both estates grow different coffee varieties and sit at different altitudes, so we expect some recalibration of the confidence thresholds.

We are also planning to open-source the disease-classification schema and the MkulimaOS scouting API interface, so other agritech teams building on Claude Vision can skip the schema design phase and focus on their specific crop domain.

You can see the production system at MkulimaOS. If you are building in agritech or evaluating Claude Vision for a narrow classification problem in your own domain, we would like to talk. Book a discovery call at our contact page — we are direct about what the model can and cannot do, and we will tell you whether your use case fits the same pattern before you commit to building.
