Want a new Green Revolution? Start by capturing plant-level data.

For over a decade, global agriculture has been waiting for an AI revolution.

First, Big Data was going to change everything; in 2013, Monsanto paid $1.1B for a machine learning startup, minting the first AgTech unicorn. By 2019, researchers predicted AI-powered greenhouses would spark a second Green Revolution. Next, companies tried teaching robots to pick fruit. Then ChatGPT came along, and farmers were promised GenAI agronomists. Now, boosters claim agentic AI, or even AGI, will finally deliver sweeping benefits.

Spoiler alert: none of these promises panned out. AgTech fundraising is flatlining as investors balk at plowing money into fallow ground. The few bright spots are in narrow areas like biotech and precision agriculture, not transformative AI solutions.

The problem is twofold: first, farms are incredibly challenging environments in which to develop AI; and second, current agricultural data isn’t good enough to overcome those challenges. Until we fix that fundamental problem and start feeding AI models not just more data, but radically better data, most agricultural AI will continue to die on the vine.

3 reasons agricultural AI fails

What makes agricultural AI so tough? Consider these challenges—any one of which would be enough to sink most projects:

Slow feedback. AI depends on rapid iteration, but breeding and testing new seeds is a slow process. In 1970, Norman Borlaug won the Nobel Peace Prize in part for innovations that came from doubling the number of crop breeding cycles per year, from 1 to 2. In recent decades, the major seed companies have pushed this to 3 feedback cycles per year. Still, this is nowhere near enough for the fast feedback loops on which AI depends.
High dimensionality. AI models suffer “accuracy collapse” in environments where a high number of documents are needed to obtain a precise answer, and farming is one of those environments. Simple questions (how much nitrogen to apply, say) involve countless variables, from soil type to previous crops and yields, to pathogen and tilling history, to the presence of livestock on the property decades ago, and more. Simplifying all of these variables into something AI can process can be virtually impossible.
Edge cases abound. AI models are great if you’re dealing with spherical cows, or trying to predict the most probable next token, but real farms have countless operational idiosyncrasies. Broad capture of edge cases requires either adding dimensions (provoking the curse of dimensionality mentioned above) or having an AI develop something more like a “world model,” which is well beyond current techniques. Nothing generalizes. Even if you have a model that solves for edge cases, its usefulness will likely be far narrower than you would have hoped, because no two farmers’ needs are alike: even if they’re growing the same crops, they’ll have different technological aptitudes, labor practices, access to capital, or philosophies of farming. No AI model can serve every farmer, because there’s no universal “right answer” to target.
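
To make the dimensionality point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every variable is bucketed into five levels and that a program can observe 10,000 plots per breeding cycle; the function names and those two numbers are hypothetical, and only the figure of 3 cycles per year comes from the text above.

```python
def grid_cells(num_variables: int, levels_per_variable: int) -> int:
    """Count the distinct combinations of conditions on a full grid."""
    return levels_per_variable ** num_variables


def years_to_cover(num_variables: int, levels_per_variable: int,
                   plots_per_cycle: int, cycles_per_year: int) -> float:
    """Years needed to observe every combination exactly once."""
    cells = grid_cells(num_variables, levels_per_variable)
    return cells / (plots_per_cycle * cycles_per_year)


if __name__ == "__main__":
    LEVELS = 5               # hypothetical: buckets per variable
    PLOTS_PER_CYCLE = 10_000  # hypothetical: plots observed per cycle
    CYCLES_PER_YEAR = 3       # from the text: roughly 3 breeding cycles a year

    for n in (3, 6, 12):
        cells = grid_cells(n, LEVELS)
        years = years_to_cover(n, LEVELS, PLOTS_PER_CYCLE, CYCLES_PER_YEAR)
        print(f"{n:>2} variables: {cells:,} combinations, "
              f"{years:,.2f} years to see each once")
```

The number of combinations grows exponentially with the number of variables, while the number of observations grows only linearly with time. Under these assumed numbers, a dozen variables is already enough to put full coverage out of reach at three cycles per year.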

Much of Silicon Valley views these challenges as hurdles, not roadblocks: throw in more data, and eventually AI will deliver results. Certainly, there’s no shortage of ag data: the average farm produces an estimated 500K data points per day. But despite that, there’s remarkably little high-quality data, and the garbage-in, garbage-out rule applies to agriculture, too. Billions have been invested to harvest and marshal the 200TB (give or take) of high-quality data that powers the big LLMs. Those data consist of hundreds of trillions of tokens. The corresponding ag dataset simply does not exist.

Understanding ag’s data problem

Part of the issue is that agricultural data is fragmented. Every farmer’s data is different, and standardizing data without flattening out edge cases is incredibly challenging.

But there’s another, deeper problem: current datasets don’t capture the insights we need. AI models are trained on external factors (e.g., weather, soil acidity, nitrogen levels, and so forth) but know virtually nothing about how plants respond to those factors.

Yes, temperature and rainfall influence how crops grow. Yes, pathogens or levels of certain molecules in the soil affect some plants, in some ways, under some conditions. However, those datasets will never tell a farmer conclusively what their crops need, because they only capture what’s happening outside the plant. It’s like tuning a racecar based solely on its speed; without access to engine telemetry, your algorithms won’t achieve much.

This doesn’t mean AI has no place in agriculture. Computer vision use cases like distinguishing plants from weeds, or tossing spoiled fruit in processing, have made huge inroads in the industry. But without plant-level insights, agricultural datasets are all noise and no signal; no matter how big they grow, they’ll never be able to overcome the sector’s unique challenges.

Consider Gro Intelligence: it built the world’s largest trove of ag-focused climate data, raised over $120M, and recently shut its doors. Or consider AcreValue, a startup one of us cofounded to turn farmland data into actionable insights, which later had to concede that its estimates are only a “starting place,” given the near-impossibility of capturing everything about an acre of soil in a fixed list of variables.

The places where agricultural AI is succeeding (sorting fruit, detecting unripe tomatoes, or applying herbicides, say) are narrower in scope. Their power comes not from vast datasets, but from carefully chosen use cases. That’s effective, but it doesn’t scale: a tomato-sorting algorithm is useful, but won’t usher in the next Green Revolution.

Looking inside plants

To gain leverage against the broadest, most valuable use cases in agricultural AI, we need data from “inside the engine,” which is to say, inside the plants we’re growing. New technologies are now making that possible for the first time.

Our company, for instance, creates crops that communicate about internal processes by fluorescing. This summer, when one of our plants suffered a fungal infection, its immune response triggered a fluorescent signal, and for the first time in the 10,000-year history of agriculture, a farmer learned about a pathogen before symptoms became visible.

That’s good for farmers: earlier warnings mean better outcomes. But it also opens the door to a new class of agricultural data. For the first time, AI innovators can leverage data that reveals not just what’s going on around crop plants in the field, but what’s happening inside them.

This new data generation mechanism makes it possible to sidestep the messy process of inferring plant biology from external factors. Instead of building sprawling, overcomplicated models, practitioners will increasingly build lean algorithms powered by data about a plant’s inner workings. No need for endless datasets: just get tailored insights directly from the plant itself.

The path forward

To feed our species without poisoning ourselves and cooking the planet (all 8+ billion of us here now, with another couple of billion on the way by 2050), we’ll need agricultural AI to unlock new efficiencies, higher yields, and greater resilience. But we won’t get there using current datasets. For more than a decade, researchers have been trying to brute-force solutions to the ag sector’s challenges by piling on ever more data and compute power, but they’ve come up short.

Now, we have a chance to cut through that Gordian knot, armed with radically better data that reflects actual biological processes and gives farmers meaningful insights about how to care for their crops.

Companies like Waymo and 1X are creating their own datasets to support incredibly powerful AI models that interpret and interact with live environments. InnerPlant can chart a similar course, building datasets that document plant metabolism throughout the season across our agricultural heartland. Our sensors give us a head start. We’re the only company that can see disease as soon as the plant’s immune system reacts.

There’s still work to be done: getting plant-level data is only the first step. But it’s the vital precursor for the kinds of transformative AI that farmers have been promised for years. The agricultural AI revolution is finally coming, and it starts with using data drawn directly from plants themselves.

Agricultural AI Sucks. Here’s How to Fix It

By Christopher Seifert, Head of Software Product, & Chip Franzén, Software Engineer at InnerPlant
