AI-based advisory services for agronomy and crop management are emerging worldwide, reaching hundreds of thousands of farmers in the United States, India, Africa, and beyond. Agricultural professionals are using public chatbots to get answers, and some organizations have developed their own. Until now, users have not had a systematic way to assess the accuracy, relevance, or general usefulness of the answers obtained from these generative AI ag information services.
Led by the Center for Digital Agriculture (CDA) at the University of Illinois Urbana-Champaign (UIUC), the AI AgriBench consortium provides users of public chatbots and crop management advisory services with a trusted framework to evaluate and build confidence in the new generation of digital tools that support farmers and the broader agricultural community.
The AI AgriBench benchmarking service is designed to answer a simple but critical question: “Is agronomic advice from large language models (LLMs) and AI-based agricultural advisory tools accurate?” The first version of the AI AgriBench benchmark, being released widely today, addresses core agronomic understanding, a key technical foundation for on-farm advice.
“It is exciting to be able to release our first benchmark evaluating agronomic knowledge in public chatbots and ag advisory services. It is encouraging that state-of-the-art LLMs have achieved a deep understanding of this technical area, but of course, real-world farmer questions require additional challenging capabilities which will be addressed in future versions of the benchmark,” says Vikram Adve, Donald B. Gillies Professor of Computer Science, AIFARMS PI, and director of the AI AgriBench Consortium.
Building an Agronomy and Crop Management Benchmark Data Set.
The AI AgriBench benchmark data set draws on thousands of land-grant university extension PDFs from the CropWizard project corpus, which keeps it grounded in the same technical resources used by farmers and advisors. Questions focus on real farm decisions in categories including insect pests, diseases, weeds, nutrition, soils, seed, horticulture, water, and weather across 31 crops. Answers are multi-paragraph, concise, practical, and written in plain language.
To ensure that the benchmark questions reflect real user inquiries and that the answers are technically sound, more than 20 experienced agronomy professionals reviewed nearly a thousand Q&A pairs, confirming that the responses met their standards for accuracy, relevance, completeness, and conciseness. The volunteer experts selected high-quality Q&A pairs and manually edited them as needed to develop the final “ground-truth” data set of 416 pairs used in the benchmark.
A Transparent Leaderboard for Evaluating AI Agronomy Systems.
The AI AgriBench public leaderboard presents evaluation results for a wide range of public LLMs and ag-specific advisory services, enabling easy comparison across systems. Responses generated by these models and services are compared against the ground-truth answers and scored by a panel of three “judge LLMs” using carefully designed judging prompts. Organizations seeking inclusion request the official test-set questions, run their models, and submit formatted responses for standardized evaluation.
Complete technical details on the data set curation process, definitions of the four metrics, the LLM-based evaluation methodology and judging prompts, and a brief discussion of the leaderboard results are provided in an accompanying technical blog post. The methodology is based on research conducted within the CropWizard research project in the AIFARMS National AI Institute. The actual benchmark Q&A pairs are kept confidential to reduce the likelihood of “tuning to the benchmark.” In all other aspects, the consortium aims for full transparency and welcomes questions.
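For readers who want a concrete picture of the scoring flow described in this section, the following is a minimal Python sketch, not the consortium's actual implementation. It rests on several assumptions that this release does not confirm: that the four metrics match the expert review criteria (accuracy, relevance, completeness, conciseness), that each judge LLM returns one numeric score per metric, and that scores are combined by simple averaging. The function names and the sample numbers are hypothetical and for illustration only; the accompanying technical blog post is the authoritative source.

```python
from statistics import mean

# Assumption: the four metrics mirror the expert review criteria.
METRICS = ("accuracy", "relevance", "completeness", "conciseness")


def aggregate_judges(judge_scores):
    """Average each metric across the judge panel for one response.

    judge_scores: list of dicts, one per judge, mapping metric -> score.
    Averaging is an assumed aggregation rule, not a documented one.
    """
    return {m: mean(scores[m] for scores in judge_scores) for m in METRICS}


def summarize_system(per_question_scores):
    """Average the per-question aggregates over all test-set questions."""
    return {m: mean(q[m] for q in per_question_scores) for m in METRICS}


if __name__ == "__main__":
    # Made-up scores from three hypothetical judges for two questions.
    question_1 = [
        {"accuracy": 4, "relevance": 5, "completeness": 3, "conciseness": 4},
        {"accuracy": 5, "relevance": 5, "completeness": 4, "conciseness": 4},
        {"accuracy": 3, "relevance": 5, "completeness": 5, "conciseness": 4},
    ]
    question_2 = [
        {"accuracy": 2, "relevance": 3, "completeness": 2, "conciseness": 5},
        {"accuracy": 2, "relevance": 3, "completeness": 3, "conciseness": 5},
        {"accuracy": 2, "relevance": 3, "completeness": 4, "conciseness": 5},
    ]
    per_question = [aggregate_judges(question_1), aggregate_judges(question_2)]
    print(summarize_system(per_question))
```

Using a panel of judges rather than a single one reduces the influence of any one judge's quirks on a system's result; a real pipeline could just as well use a median or a majority vote instead of the mean assumed here.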
Momentum Builds Around a Shared AI Benchmark for Agriculture.
The founding members of the AI AgriBench Consortium, including CDA, AIFARMS, Bayer Crop Science, Extension Foundation, Kissan AI, John Deere, Microsoft, and DeepRoot Strategies, have steered the vision, strategy, and technical goals since its inaugural announcement at the World Agri-Tech Innovation Summit in San Francisco, CA, in March 2025.
We are excited to announce that four new members have joined the consortium: Farmers Business Network (FBN), Taranis, Precision Development, and Digital Green. These are some of the pioneers in AI-driven agricultural advisory, reaching hundreds of thousands of farmers.
Consortium membership is open to commercial, nonprofit, academic, and other organizations worldwide that have expertise in AI-driven digital services for agriculture and share the consortium’s goals. Organizations that can contribute relevant data sets, benchmarking, AI evaluation experience, or traditional advisory expertise are encouraged to join. Visit our website for membership information.
Individuals or organizations who would like to stay informed about the consortium’s plans can sign up for our mailing list. Members of the media and anyone with questions are welcome to contact us at aiagribench@lists.illinois.edu or Melanie Rodriguez at 217-300-5884.
ABOUT:
AI AgriBench is a public benchmarking consortium led by the Center for Digital Agriculture at the University of Illinois Urbana-Champaign, created to evaluate the accuracy, relevance, and practical usefulness of AI-powered advisory services for agriculture. The consortium provides a transparent, expert-reviewed benchmark grounded in real-world agronomy and farm management questions drawn from land-grant university extension publications. AI AgriBench assesses public chatbots and AI-based advisory systems using curated, field-level scenarios that reflect real decisions faced by farmers and agronomy professionals, and reports results through a public leaderboard. The benchmarking methodology is based on research conducted within the CropWizard project of the AIFARMS National AI Institute. AI AgriBench brings together academic, industry, nonprofit, and extension partners worldwide to support trustworthy, responsible adoption of AI in agriculture. Founding and member organizations include AIFARMS, Bayer Crop Science, Extension Foundation, Kissan AI, John Deere, Microsoft, DeepRoot Strategies, Farmers Business Network, Taranis, Precision Development, and Digital Green.