Summary: A new breakthrough shows how robots can integrate both sight and touch to handle objects with greater accuracy, much as humans do. Researchers developed TactileAloha, a system that combines visual and tactile inputs, enabling robotic arms to adapt more flexibly to real-world tasks.
Unlike vision-only systems, this approach allowed robots to manage challenging objects such as Velcro and zip ties, demonstrating human-like sensory judgment. The findings mark a major step toward developing physical AI that could help robots assist with everyday tasks like cooking, cleaning, and caregiving.
Key Facts:
- Tactile Integration: Combines visual and tactile sensing for better object handling.
- Adaptive Performance: Outperforms vision-only systems in complex tasks like Velcro manipulation.
- Everyday Potential: Moves robotics closer to real-world applications in homes and workplaces.
Source: Tohoku University
In everyday life, it’s a no-brainer for us to grab a cup of coffee from the table. We seamlessly combine multiple sensory inputs such as sight (seeing how far away the cup is) and touch (feeling when our hand makes contact) in real-time without even thinking about it. However, recreating this in artificial intelligence (AI) is not quite as easy.
An international group of researchers created a new approach that integrates visual and tactile information to control robotic arms while adaptively responding to the environment. Compared with conventional vision-based methods, the approach achieved higher task success rates. These promising results represent a significant advancement in the field of multimodal physical AI.
Details of their breakthrough were published in the journal IEEE Robotics and Automation Letters on July 2, 2025.
Machine learning can be used to teach AI systems human movement patterns, enabling robots to autonomously perform daily tasks such as cooking and cleaning.
For example, ALOHA (A Low-cost Open-source Hardware System for Bimanual Teleoperation) is a system developed by Stanford University that enables the low-cost and versatile remote operation and learning of dual-arm robots. Both hardware and software are open source, so the research team was able to build upon this base.
However, these systems rely mainly on visual information, so they lack the tactile judgements a human can make, such as distinguishing the texture of a material or the front and back sides of an object. For example, it is often easier to tell the front of a piece of Velcro from the back by touching it than by looking at it. Relying on vision alone is therefore a significant weakness.
“To overcome these limitations, we developed a system that also enables operational decisions based on the texture of target objects – which are difficult to judge from visual information alone,” explains Mitsuhiro Hayashibe, a professor at Tohoku University’s Graduate School of Engineering.
“This achievement represents an important step toward realizing a multimodal physical AI that integrates and processes multiple senses such as vision, hearing, and touch – just like we do.”
The new system was dubbed “TactileAloha.” The researchers found that the robot could perform appropriate bimanual operations even in tasks where front-back differences and adhesiveness are crucial, such as handling Velcro and zip ties. By applying vision-tactile transformer technology, their physical AI robot exhibited more flexible and adaptive control.
The improved physical AI method was able to manipulate objects accurately by combining multiple sensory inputs to form adaptive, responsive movements. Robots of this kind could lend a helping hand in an almost endless range of practical applications.
Research contributions such as TactileAloha bring us one step closer to these robotic helpers becoming a seamless part of our everyday lives.
The research group comprised members from Tohoku University’s Graduate School of Engineering, the Centre for Transformative Garment Production at Hong Kong Science Park, and the University of Hong Kong.
About this AI and robotics research news
Author: Public Relations
Source: Tohoku University
Contact: Public Relations – Tohoku University
Image: The image is credited to Neuroscience News
Original Research: Open access.
“TactileAloha: Learning Bimanual Manipulation with Tactile Sensing” by Mitsuhiro Hayashibe et al. IEEE Robotics and Automation Letters
Abstract
TactileAloha: Learning Bimanual Manipulation with Tactile Sensing
Tactile texture is vital for robotic manipulation but challenging for camera vision-based observation.
To address this, we propose TactileAloha, an integrated tactile-vision robotic system built upon Aloha, with a tactile sensor mounted on the gripper to capture fine-grained texture information and support real-time visualization during teleoperation, facilitating efficient data collection and manipulation.
Using data collected from our integrated system, we encode tactile signals with a pre-trained ResNet and fuse them with visual and proprioceptive features.
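For readers who want a concrete picture, here is a minimal PyTorch sketch of this kind of encoding-and-fusion step, assuming image-like tactile readings, ResNet-18 backbones, 14-dimensional proprioceptive joint states, and fusion by simple concatenation; the class name, dimensions, and fusion choice are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class TactileVisionEncoder(nn.Module):
    """Illustrative fusion of tactile, visual, and proprioceptive features."""

    def __init__(self, proprio_dim=14, feat_dim=512):
        super().__init__()
        # Pre-trained ResNet-18 backbones with the classification head removed,
        # so each produces a 512-d feature vector per image.
        self.tactile_backbone = nn.Sequential(
            *list(models.resnet18(weights="IMAGENET1K_V1").children())[:-1])
        self.vision_backbone = nn.Sequential(
            *list(models.resnet18(weights="IMAGENET1K_V1").children())[:-1])
        # Project low-dimensional joint states to the same feature width.
        self.proprio_proj = nn.Linear(proprio_dim, feat_dim)

    def forward(self, tactile_img, camera_img, joint_state):
        tac = self.tactile_backbone(tactile_img).flatten(1)   # (B, 512)
        vis = self.vision_backbone(camera_img).flatten(1)     # (B, 512)
        prop = self.proprio_proj(joint_state)                 # (B, 512)
        # Fuse by concatenation into a single observation embedding.
        return torch.cat([tac, vis, prop], dim=-1)            # (B, 1536)
```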
The combined observations are processed by a transformer-based policy with action chunking to predict future actions.
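A transformer policy with action chunking can be sketched as follows: the fused observation serves as memory for a transformer decoder, and a set of learned queries decodes a chunk of the next k actions in one forward pass. The chunk size, decoder configuration, and action dimension below are assumptions for illustration rather than the published architecture.

```python
import torch
import torch.nn as nn

class ChunkedTransformerPolicy(nn.Module):
    """Illustrative transformer decoder that predicts a chunk of future actions."""

    def __init__(self, obs_dim=1536, action_dim=14, chunk_size=50, d_model=256):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)
        # One learned query per future time step in the action chunk.
        self.action_queries = nn.Parameter(torch.randn(chunk_size, d_model))
        decoder_layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=4)
        self.action_head = nn.Linear(d_model, action_dim)

    def forward(self, obs_embedding):                          # (B, obs_dim)
        memory = self.obs_proj(obs_embedding).unsqueeze(1)     # (B, 1, d_model)
        queries = self.action_queries.unsqueeze(0).expand(
            obs_embedding.size(0), -1, -1)                     # (B, k, d_model)
        decoded = self.decoder(queries, memory)                # (B, k, d_model)
        return self.action_head(decoded)                       # (B, k, action_dim)
```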
We use a weighted loss function during training to emphasize near-future actions, and employ an improved temporal aggregation scheme at deployment to enhance action precision.
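A rough sketch of both ideas under assumed functional forms: an exponentially decaying weight on each step of the predicted chunk so that near-future actions dominate the training loss, and an exponentially weighted average over the overlapping chunk predictions available for the current time step at deployment. The decay constants and the weighting direction are illustrative assumptions, not the paper's exact scheme.

```python
import torch

def weighted_chunk_loss(pred, target, decay=0.05):
    # pred, target: (B, k, action_dim); step 0 is the nearest future action.
    k = pred.size(1)
    w = torch.exp(-decay * torch.arange(k, device=pred.device))   # (k,)
    per_step = ((pred - target) ** 2).mean(dim=-1)                # (B, k)
    # Near-future steps receive larger weights; normalize by the weight sum.
    return (w * per_step).sum(dim=-1).mean() / w.sum()

def temporally_aggregate(predictions, decay=0.1):
    # predictions: list of tensors, each an earlier chunk's prediction for the
    # *current* time step, oldest first. Here newer predictions are weighted
    # more heavily; the direction and constant are illustrative choices.
    n = len(predictions)
    w = torch.exp(-decay * torch.arange(n - 1, -1, -1, dtype=torch.float32))
    stacked = torch.stack(predictions)                            # (n, action_dim)
    return (w.unsqueeze(-1) * stacked).sum(0) / w.sum()
```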
Experimentally, we introduce two bimanual tasks: zip tie insertion and Velcro fastening, both requiring tactile sensing to perceive object texture and to align the orientations of two objects held by the two hands.
Our proposed method adaptively changes the generated manipulation sequence itself based on tactile sensing in a systematic manner.
Results show that our system, leveraging tactile information, can handle texture-related tasks that camera vision-based methods fail to address.
Moreover, our method achieves an average relative improvement of approximately 11.0% over a state-of-the-art method that also uses tactile input, demonstrating its effectiveness.