Unlocking reliable responses with Gemini Enterprise Agent Platform’s Agentic RAG

June 5, 2026


Experiments and outcomes

We evaluated agentic RAG on FramesQA, which is based on the FRAMES paper. An example multi-hop query is:

“Of the top two most watched TV season finales (as of June 2024), which finale ran the longest in length and by how much?”

The RAG system must perform several steps to arrive at the right answer. First, it has to determine that the two most watched finales are from the shows M*A*S*H and Cheers. Then, it has to find their running times and calculate the difference in length. In many RAG settings (Vanilla RAG or agentic RAG without sufficient context), we might end up in a situation where the model says something like:

“Despite several scans, I found no explicit runtimes for M*A*S*H or Cheers. The documents provide viewership data, but not the duration in minutes or hours.”

This doesn’t answer the question.

Fortunately, our agentic RAG can resolve this by first searching for the TV shows, then using the Query Rewriter and Sufficient Context Agent to run a targeted search for the run times of M*A*S*H and Cheers. Then, Gemini can easily determine which finale ran the longest and by how much:

“The M*A*S*H finale ran for 150 minutes, making it the longest of the top two. It was 52 minutes longer than the Cheers finale, which ran for about 98 minutes.”
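The retrieve, check, rewrite, and retrieve-again pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's implementation: the in-memory `CORPUS`, the exact-match `search`, the `missing_facts` sufficiency check, and `rewrite_query` are all hypothetical stand-ins for components that would be backed by a retriever and an LLM in practice. The runtimes are the ones quoted in the example answer.

```python
import re

# Toy stand-in for a document corpus (hypothetical); values follow the example answer above.
CORPUS = {
    "most watched tv season finales": "The two most watched finales are M*A*S*H and Cheers.",
    "M*A*S*H finale runtime": "The M*A*S*H finale ran for 150 minutes.",
    "Cheers finale runtime": "The Cheers finale ran for about 98 minutes.",
}


def search(query: str) -> list[str]:
    """Hypothetical retriever: exact-key lookup in the toy corpus."""
    return [CORPUS[query]] if query in CORPUS else []


def missing_facts(context: list[str], shows: list[str]) -> list[str]:
    """Hypothetical sufficiency check: shows with no runtime in the context yet."""
    return [
        show for show in shows
        if not any(show in doc and "minutes" in doc for doc in context)
    ]


def rewrite_query(show: str) -> str:
    """Hypothetical query rewriter: turn a gap into a targeted query."""
    return f"{show} finale runtime"


def agentic_retrieve(first_query: str, shows: list[str], max_rounds: int = 3) -> list[str]:
    """Retrieve, check for gaps, and issue targeted follow-up searches."""
    context = search(first_query)
    for _ in range(max_rounds):
        gaps = missing_facts(context, shows)
        if not gaps:
            break  # context is sufficient, stop searching
        for show in gaps:
            context.extend(search(rewrite_query(show)))
    return context


def extract_minutes(context: list[str], show: str) -> int:
    """Pull the first 'N minutes' figure from a document that mentions the show."""
    for doc in context:
        match = re.search(r"(\d+) minutes", doc)
        if show in doc and match:
            return int(match.group(1))
    raise LookupError(f"no runtime found for {show}")


shows = ["M*A*S*H", "Cheers"]
context = agentic_retrieve("most watched tv season finales", shows)
runtimes = {show: extract_minutes(context, show) for show in shows}
longest = max(runtimes, key=runtimes.get)
gap = runtimes[longest] - min(runtimes.values())
print(longest, gap)
```

The first search only surfaces the show names, so the sufficiency check reports both runtimes as missing and triggers one targeted query per show; a single-pass retriever would have stopped after the first search.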

We ran an experiment to test this capability at scale (FramesQA has 824 queries together with a corpus containing 2,676 PDF documents). In the “Vanilla” RAG setting, we use Google’s RAG Engine (which has an advanced retrieval engine, LLM parser, and re-ranker). We compared this with our agentic RAG in two settings. In the single-corpus setting, we retrieve from the FramesQA documents. In the cross-corpus setting, we also include three other distracting datasets, where the Planner Agent must determine where to retrieve from. This cross-corpus setting mimics use cases where companies have databases managed by separate teams. We compute accuracy by using an LLM-as-a-judge to compare the system responses to the ground truth answers in the dataset.
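As a rough sketch of the scoring step, accuracy is the fraction of responses that the judge marks as matching the ground truth. The code below assumes the judge is any callable that returns `True` or `False`; `toy_judge` is a hypothetical placeholder (a case-insensitive containment check) standing in for the LLM call, and the sample pairs are illustrative, not taken from the dataset.

```python
from typing import Callable


def toy_judge(response: str, ground_truth: str) -> bool:
    """Placeholder for an LLM judge (assumption): case-insensitive containment."""
    return ground_truth.lower() in response.lower()


def accuracy(pairs: list[tuple[str, str]], judge: Callable[[str, str], bool]) -> float:
    """Fraction of (response, ground_truth) pairs the judge accepts."""
    if not pairs:
        raise ValueError("no examples to score")
    correct = sum(judge(response, truth) for response, truth in pairs)
    return correct / len(pairs)


# Illustrative pairs only: one answered response and one non-answer.
pairs = [
    ("The M*A*S*H finale was 52 minutes longer.", "52 minutes"),
    ("I found no explicit runtimes.", "52 minutes"),
]
print(accuracy(pairs, toy_judge))
```

Passing the judge in as a parameter keeps the metric independent of the model used for judging, so the same function can score both the Vanilla and agentic runs.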

In the cross-corpus setting, our system nearly matches its single-corpus accuracy. Even when the Planner Agent must select the correct corpus out of four possibilities, we successfully route the search queries and answer 90.1% of questions correctly. Also, the latency of both single- and cross-corpus versions is about the same (within 3% on average). This demonstrates that our Agentic RAG system can reason over multiple, unrelated data sources, which opens up possibilities for more flexible retrieval scenarios.
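To make the routing idea concrete, here is a deliberately simple sketch of choosing one corpus out of four. It assumes each corpus has a short text description and picks the one sharing the most words with the query. The corpus names and descriptions are hypothetical, and word overlap is a stand-in technique chosen for illustration; the article does not describe how the Planner Agent makes this decision.

```python
# Hypothetical corpus names and descriptions, for illustration only.
CORPUS_DESCRIPTIONS = {
    "frames_docs": "encyclopedia articles about television, history, sports, and people",
    "hr_policies": "employee handbook, vacation policy, benefits",
    "product_manuals": "device setup, troubleshooting, firmware",
    "finance_reports": "quarterly revenue, earnings, forecasts",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split on whitespace, ignoring commas."""
    return set(text.lower().replace(",", " ").split())


def route(query: str, descriptions: dict[str, str]) -> str:
    """Return the corpus whose description shares the most words with the query."""
    query_tokens = tokenize(query)
    return max(descriptions, key=lambda name: len(query_tokens & tokenize(descriptions[name])))


print(route("most watched television season finales", CORPUS_DESCRIPTIONS))
print(route("vacation policy for new employees", CORPUS_DESCRIPTIONS))
```

Ties fall back to the first corpus in dictionary order, which is one reason a real planner would need a stronger signal than word overlap.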
