Saturday, June 6, 2026

Unlocking reliable responses with Gemini Enterprise Agent Platform’s Agentic RAG


Experiments and results

We evaluated agentic RAG on FramesQA, which is based on the FRAMES paper. An example multi-hop query is:

“Of the top two most watched TV season finales (as of June 2024), which finale ran the longest in length and by how much?”

The RAG system must perform several steps to arrive at the right answer. First, it has to determine that the two most watched finales are from the shows M*A*S*H and Cheers. Then, it has to find their running times and calculate the difference in length. In many RAG settings (vanilla RAG, or agentic RAG without sufficient context), we may end up in a situation where the model says something like:

“Despite multiple scans, I found no explicit runtimes for M*A*S*H or Cheers. The documents provide viewership data, but not the duration in minutes or hours.”

This doesn’t answer the question.

Fortunately, our agentic RAG can resolve this by first searching for the TV shows, then using the Query Rewriter and Sufficient Context Agent to run a targeted search for the run times of M*A*S*H and Cheers. Then, Gemini can easily determine which finale ran the longest and by how much:

“The M*A*S*H finale ran for 150 minutes, making it the longest of the top two. It was 52 minutes longer than the Cheers finale, which ran for about 98 minutes.”
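To make the multi-hop flow concrete, here is a minimal Python sketch of a search, sufficiency-check, rewrite loop. It is not the platform's implementation: `search`, `missing_runtimes`, `rewrite_query`, and `agentic_retrieve` are hypothetical names, the three-sentence corpus is toy data built from the example above, and the keyword matching stands in for a real retriever and for the model-driven Query Rewriter and Sufficient Context Agent.

```python
import re

# Toy corpus standing in for retrieved documents (illustrative text only).
DOCS = [
    "The M*A*S*H finale and the Cheers finale are the two most watched TV season finales.",
    "The M*A*S*H finale ran for 150 minutes.",
    "The Cheers finale ran for about 98 minutes.",
]

# Stand-in for the entities a model would extract from the first hop.
SHOWS = ["M*A*S*H", "Cheers"]


def search(query: str) -> list[str]:
    """Hypothetical retriever: return documents containing every query term."""
    terms = query.lower().split()
    return [d for d in DOCS if all(t in d.lower() for t in terms)]


def missing_runtimes(context: list[str]) -> list[str]:
    """Hypothetical sufficiency check: shows with no runtime in the context."""
    return [s for s in SHOWS
            if not any(s in d and "minutes" in d for d in context)]


def rewrite_query(show: str) -> str:
    """Hypothetical query rewrite: a targeted search for one show's runtime."""
    return f"{show} finale ran minutes"


def agentic_retrieve(query: str, max_rounds: int = 3) -> list[str]:
    """Search, then keep issuing targeted searches until context is sufficient."""
    context = search(query)
    for _ in range(max_rounds):
        missing = missing_runtimes(context)
        if not missing:
            break
        for show in missing:
            for doc in search(rewrite_query(show)):
                if doc not in context:
                    context.append(doc)
    return context


def runtime_minutes(context: list[str], show: str) -> int:
    """Pull the first 'N minutes' figure from a document mentioning the show."""
    for doc in context:
        match = re.search(r"(\d+) minutes", doc)
        if show in doc and match:
            return int(match.group(1))
    raise ValueError(f"no runtime found for {show}")


context = agentic_retrieve("most watched finales")
difference = runtime_minutes(context, "M*A*S*H") - runtime_minutes(context, "Cheers")
print(difference)  # 150 - 98 = 52
```

In this toy run the first search returns only the viewership sentence, which is the insufficient-context situation described above. The targeted follow-up queries then bring in the two runtime sentences, and the difference of 52 minutes follows by subtraction.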

We ran an experiment to test this ability at scale (FramesQA has 824 queries along with a corpus containing 2,676 PDF documents). In the “Vanilla” RAG setting, we use Google’s RAG Engine (which has an advanced retrieval engine, LLM parser, and re-ranker). We compared this with our agentic RAG in two settings. In the single-corpus setting, we retrieve from the FramesQA documents. In the cross-corpus setting, we also include three other distracting datasets, and the Planner Agent must determine where to retrieve from. This cross-corpus setting mimics use cases where companies have databases managed by separate teams. We compute accuracy by using an LLM-as-a-judge to compare the system responses to the ground truth answers in the dataset.
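The accuracy computation can be sketched as follows. This is an illustration under stated assumptions, not the evaluation harness used in the experiment: `Example`, `stub_judge`, and `accuracy` are hypothetical names, and `stub_judge` is a simple substring check standing in for the call to a judging LLM.

```python
from typing import Callable, NamedTuple


class Example(NamedTuple):
    question: str
    ground_truth: str


def stub_judge(question: str, response: str, ground_truth: str) -> bool:
    """Placeholder for an LLM judge (assumption): pass if the ground truth
    string appears in the response. A real judge would prompt a model."""
    return ground_truth.lower() in response.lower()


def accuracy(
    examples: list[Example],
    answer_fn: Callable[[str], str],
    judge: Callable[[str, str, str], bool] = stub_judge,
) -> float:
    """Fraction of examples whose system response the judge marks correct."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        judge(ex.question, answer_fn(ex.question), ex.ground_truth)
        for ex in examples
    )
    return correct / len(examples)


# Toy usage: a fake system that answers one of two questions correctly.
examples = [
    Example("How long was the M*A*S*H finale?", "150 minutes"),
    Example("How long was the Cheers finale?", "98 minutes"),
]
canned = {
    examples[0].question: "The M*A*S*H finale ran for 150 minutes.",
    examples[1].question: "I found no explicit runtime.",
}
score = accuracy(examples, lambda q: canned[q])
print(score)  # 0.5
```

Passing the judge in as a callable keeps the scoring loop unchanged when the substring check is replaced by a model-backed judge.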

In the cross-corpus setting, our system nearly matches its single-corpus accuracy. Even when the Planner Agent must select the correct corpus out of four possibilities, we successfully route the search queries and answer 90.1% of questions correctly. Also, the latency of the single- and cross-corpus versions is about the same (within 3% on average). This demonstrates that our Agentic RAG system can reason over multiple, unrelated data sources, which opens up possibilities for more flexible retrieval scenarios.
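As a rough illustration of routing a query to one of four corpora, the sketch below scores each corpus by word overlap between the query and a short corpus description. Everything here is an assumption made for illustration: the corpus names and descriptions are invented, and the overlap score is a stand-in for the Planner Agent, which would use a model to make this decision.

```python
import re

# Hypothetical corpora: one target and three distractors (names are invented).
CORPORA = {
    "frames_qa": "encyclopedia articles about television shows films history and geography",
    "hr_policies": "employee handbook vacation benefits and payroll policies",
    "product_docs": "software product manuals installation and configuration guides",
    "sales_reports": "quarterly revenue figures and regional sales forecasts",
}


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(query: str, corpora: dict[str, str] = CORPORA) -> str:
    """Pick the corpus whose description shares the most words with the query.
    On a tie, the corpus listed first wins."""
    query_tokens = tokens(query)
    return max(corpora, key=lambda name: len(query_tokens & tokens(corpora[name])))


print(route("Which television shows had the most watched finales?"))
# frames_qa
print(route("How many vacation days do employees get under the payroll policies?"))
# hr_policies
```

A lexical score like this is brittle (for example, "employees" does not match "employee"), which is one reason a model-driven planner is the more plausible choice for queries phrased differently from the corpus descriptions.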
