In 2016, I said something that went against where robotics was heading at the time: vision alone doesn’t work for grasping.
Not “it needs improvement.” Not “the tech isn’t there yet.” It doesn’t fit the problem.
Grasping is physical. Contact, force, friction. Vision can guide the approach. It can’t feel what happens next.
Back then, we saw it in the lab. Tactile vibration data predicted grasp failure with 83% accuracy and detected slip with 92% accuracy. Early results, but clear enough. The signals that matter don’t show up in images.
Ten years later, the rest of the field is running into the same limit.
Vision gets you close
Vision still matters. It handles detection, positioning, and planning. It gets the robot to the right place, lined up the right way.
It does that well, but manipulation doesn’t stop when the gripper reaches the object.
That’s where things break.
What happens at contact isn’t visible
Before contact, the robot is working off images.
After contact, it’s dealing with forces.
A bad grasp doesn’t start as a visual change. It shows up as a shift in force. Slip starts in the fingertips before anything moves enough to see. Too much pressure shows up in the wrist before the object deforms.
By the time a camera picks up a problem, it’s already happening.
Vision sees outcomes. Touch sensing measures interaction as it happens.
And the useful data lives right there, at the moment of contact.
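To make the idea concrete, here is a minimal sketch of catching slip from a force signal before anything would be visible to a camera. It is not the method behind the lab results mentioned earlier. It assumes a stream of tangential force readings sampled at a fixed rate and uses a simple rolling standard deviation as a stand-in for vibration energy. The function name `detect_slip`, the window length, and the threshold are all illustrative assumptions.

```python
from collections import deque
from statistics import pstdev


def detect_slip(samples, window=8, threshold=0.05):
    """Return the index of the first sample where the short-window
    variation of the force signal exceeds `threshold`, else None.

    `samples` is a sequence of tangential force readings (newtons) taken
    at a fixed rate. `window` and `threshold` are illustrative values,
    not tuned for any real sensor.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(samples):
        recent.append(value)
        # Only decide once the window is full, to avoid start-up noise.
        if len(recent) == window and pstdev(recent) > threshold:
            return i
    return None


# Synthetic trace: a steady hold, then a small high-frequency vibration
# of the kind that can precede visible motion.
steady = [1.00] * 20
vibration = [1.00 + (0.2 if k % 2 == 0 else -0.2) for k in range(10)]
trace = steady + vibration

slip_index = detect_slip(trace)
```

On this synthetic trace the detector fires at index 20, the first sample of the vibration, while the average force has barely moved. A real system would likely work on band-passed tactile data and a learned classifier rather than a fixed threshold.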
The evidence is already there
This isn’t a theory anymore.
Tactile-driven policies beat vision-only ones on tasks that involve force. Benchmarks like ManiSkill-ViTac show better performance when you combine vision with tactile input, especially in insertion and assembly. Models like π0, OpenVLA, and Octo depend on synchronized inputs from multiple sensors. Remove force or tactile data, and performance drops.
No one is replacing vision. They’re adding what’s missing.
The strongest systems today combine vision, proprioception, force, and touch into a single model.
That’s what moves performance.
Vision has already given most of what it can
Vision still carries a lot of the system. But it doesn’t solve the hard part.
Physical AI improves with more data, but not all data matters equally. Force and tactile signals have an outsized impact on how well a system handles real contact.
Most datasets still lean heavily on vision and joint data.
So you see the same pattern over and over. Robots reach the right place. Then they struggle with insertion, assembly, and anything that depends on compliance.
The missing information is physical.
Tactile data hasn’t scaled yet
Collecting good contact data hasn’t been easy. You need instrumented end effectors, reliable force and tactile sensors, tight synchronization, and consistent formats.
That’s a hardware problem as much as a modelling one.
Until recently, the infrastructure wasn’t there.
Now it is.
The bottleneck is how fast teams can deploy it and start collecting data.
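As a rough illustration of the synchronization and format requirements above, the sketch below pairs each camera frame with the nearest force sample by timestamp and emits one consistent record per frame. Everything here is an assumption made for the example: the stream layout (sorted lists of `(timestamp, payload)` tuples), the helper names `nearest_index` and `align`, the 5 ms skew limit, and the sample rates. It is not a description of any particular product or dataset pipeline.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Index of the entry in sorted `timestamps` closest to `t`."""
    pos = bisect_left(timestamps, t)
    if pos == 0:
        return 0
    if pos == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[pos - 1], timestamps[pos]
    return pos if (after - t) < (t - before) else pos - 1


def align(frames, force, max_skew=0.005):
    """Pair each camera frame with the nearest force sample.

    `frames` and `force` are lists of (timestamp_seconds, payload) sorted
    by timestamp. Pairs further apart than `max_skew` seconds are dropped
    rather than silently mismatched.
    """
    force_times = [t for t, _ in force]
    records = []
    for t_frame, image in frames:
        i = nearest_index(force_times, t_frame)
        t_force, reading = force[i]
        if abs(t_force - t_frame) <= max_skew:
            records.append({"t": t_frame, "image": image, "force": reading})
    return records


# Hypothetical streams: camera at 10 Hz, force sensor at 100 Hz.
frames = [(k / 10, f"frame_{k}") for k in range(3)]
force = [(k / 100, round(0.1 * k, 1)) for k in range(25)]

records = align(frames, force)
```

Dropping frames that have no force sample within the skew limit is a deliberate choice in this sketch: a gap in the dataset is easier to detect downstream than a contact reading attached to the wrong image.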
Closing the loop
What started as a claim in 2016 is now showing up everywhere.
Robots that only see will keep hitting the same limits. Robots that can feel will start to close the gap.
Vision stays. It’s not going anywhere.
But it won’t carry manipulation on its own. The shift comes from adding the signals that matter at the point of contact.
At Robotiq, our tactile sensors are built to capture those signals directly at the gripper, so robots see and feel what they’re doing.

