AGIBOT holds World Challenge 2026 to see how AI models perform on real tasks

June 7, 2026

Participants in the challenge tested and debugged robots working on different tasks. | Source: AGIBOT

AGIBOT Innovation Technology Co. last week hosted the AGIBOT World Challenge 2026 alongside ICRA 2026 in Vienna. The company brought together 526 research and enterprise teams from 27 countries to compete across two embodied AI tracks: “Reasoning to Action” and “World Model.”

Shanghai-based AGIBOT said the competition highlighted a key shift in how embodied AI is evaluated. The company said it showed that the industry is moving beyond simulation scores toward closed-loop testing on real robots, real tasks, and standardized benchmarks.

The competition adopted a benchmark-driven format that combined online automated evaluation with an offline real-robot final in Vienna. Built on AGIBOT’s EWMBench and Genie Sim Benchmark, the consistent framework enabled automated testing, standardized metrics, and reproducible results.

During the offline final, finalist teams completed tasks using the AGIBOT G2 humanoid robot. By incorporating real-robot validation into the evaluation process, the competition placed robot stability, real-world adaptability, and long-horizon task reliability at the center of the scoring system. The company, also known as Zhiyuan Robotics Co., said this more closely aligns technical evaluation with practical deployment needs.

The challenge drew research and industry teams from leading institutions and companies, including the Chinese Academy of Sciences, Tsinghua University, the University of Science and Technology of China, the University of California San Diego, Russia’s Sber Robotics Center, Alibaba, Amap, and vivo. More than 100 teams surpassed the official baseline.

What’s the difference between the R2A and WM tracks?

The two tracks at the AGIBOT World Challenge 2026 reflected the broader evolution of embodied AI from task execution toward understanding, prediction, and decision-making, according to AGIBOT.

The Reasoning to Action (R2A) track evaluated how robots understand tasks, plan actions, and execute them in physical environments. The R2A track, upgraded from the 2025 Manipulation track, expanded the evaluation from action execution to the full process of environment understanding, task planning, and physical execution.

The World Model (WM) track focused on how AI systems predict physical-world changes and model interactions based on robot actions and sensor inputs.

Teams trained reasoning-and-manipulation models using the AGIBOT WORLD open-source dataset and evaluated them through Genie Sim 3.0, with the benchmark covering language understanding, spatial reasoning, atomic skills, disturbance adaptation, and zero-shot transfer.

In the final ranking, PrismBot from vivo won the championship with 43.47 points, followed by Shanghai RoboParty’s RP-VLA with 35.66 points and Russia’s GreenVLA with 33.19 points.

AGIBOT targets supermarket tasks with the challenge

Alongside the competition, AGIBOT and Dexmal launched a supermarket benchmark track focused on end-to-end decision-making and whole-body control. This track incorporated non-ideal physical interactions, including object drops and grasping failures, to better reflect the complexity of real-world interaction and provide a more practical evaluation framework for world model research.

Set in a realistic retail environment, the track required models to complete the full mobile manipulation process, from autonomous navigation and item picking to item transport and placement, under physical constraints such as shelf height limits and randomized item placement. Through API-based remote control, participants’ algorithms directly controlled real robots, creating a practical benchmark for evaluating embodied intelligence in deployment-oriented scenarios.

In the World Model (WM) track, NeoVerse-ABot, a joint team from the Institute of Automation of the Chinese Academy of Sciences and Amap CV Lab, won first place. The PAI@IAII team from the Institute of Industrial Artificial Intelligence at the Chinese Academy of Sciences ranked second. The Loop team from the University of Science and Technology of China placed third.

With the World Challenge, AGIBOT hoped to contribute to a more practical and reproducible evaluation framework for embodied AI. | Source: AGIBOT

AGIBOT releases full-stack toolchain for robot validation

Beyond the competition itself, AGIBOT opened a full-stack toolchain covering real-world data, simulation evaluation, and real-robot testing. The toolchain included the AGIBOT WORLD open-source dataset, Genie Sim 3.0, and the AGIBOT G2 robot platform, helping developers validate models across the path from training to simulation and physical deployment.

EWMBench and Genie Sim Benchmark supported standardized metrics, automated evaluation, and comparable results across simulation and physical testing. They addressed common challenges such as inconsistent evaluation criteria and the gap between simulated performance and real-world deployment.

AGIBOT said it will integrate the technical and ecosystem resources developed through the competition with its ongoing benchmark development and open-source efforts. The company also plans to launch an online simulation leaderboard, introduce more test tasks and diversified benchmarks, and support more comprehensive quantitative evaluation of model capabilities.

In addition, AGIBOT said it will continue to refine its benchmarks and full-stack toolchain, working with global research institutions, developers, and industry partners. Its stated goal is to help embodied AI move from individual algorithmic advances toward systems that can be deployed and scaled in real-world settings.

In other benchmark news, Fraunhofer IPA last month offered a new test benchmark for humanoid robots, and NIST proposed its own baseline performance benchmark for humanoids.
