Using vision and touch, humans are able to understand the world around them and how to interact with it. Simply by looking at an object, we can often accurately predict what it will feel like to touch, which in turn shapes our decisions about what to do with it. The reverse also works: feeling an object without seeing it tells us something about what it must look like. Historically, only the brain has been able to make these connections.
Now an MIT team – from the Computer Science and Artificial Intelligence Laboratory – is working on a way for an AI to ‘see by feeling’, and vice versa.
The team used a robotic arm fitted with a tactile sensor known as GelSight, developed by another group at MIT, which can pick up sensations of touch. Using a webcam, the team recorded a variety of objects and materials being touched: 200 objects were touched over 12,000 times, and the footage was then broken into still frames and fed to the system.
According to the team, the program was then able to correlate over 3 million sensations with their respective visuals.
For the system to effectively ‘see’ what it’s feeling, it must correlate the sensation with a material, and then work out roughly what shape it is. This gives the AI a basis from which to make predictions – matching tactile attributes against a reference image.
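At its simplest, this kind of matching can be pictured as a nearest-neighbour lookup: embed the touch signal as a feature vector and find the closest entry in a reference library of visual features. The sketch below is a deliberately toy illustration of that idea – the object names and feature vectors are made up, and the real system learns its representations rather than using hand-written ones:

```python
import numpy as np

# Hypothetical reference library: each known object is represented by a
# small visual feature vector (the values here are invented for illustration).
reference = {
    "mug":      np.array([0.9, 0.1, 0.2]),
    "sponge":   np.array([0.1, 0.8, 0.7]),
    "keyboard": np.array([0.4, 0.4, 0.1]),
}

def closest_material(touch_embedding):
    # Nearest neighbour by Euclidean distance over the reference library.
    return min(
        reference,
        key=lambda name: np.linalg.norm(reference[name] - touch_embedding),
    )

# A touch embedding that happens to lie nearest the "sponge" entry.
print(closest_material(np.array([0.2, 0.75, 0.65])))
```

In practice the embeddings come from learned networks rather than hand-picked vectors, but the prediction step – comparing a felt signal against stored visual references – follows the same shape.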
A similar technique works in the other direction, matching the visuals of an object with its texture. For example, drawing on its reference data, the AI can look at an object and locate the area that is best to grasp or touch, depending on the object’s intended use.
Unlike similar AI systems developed in the past – which relied purely on large labelled datasets – this latest development uses what are known as Generative Adversarial Networks (GANs). Two components are used in this method: a generator and a discriminator. The generator produces images intended to trip up the discriminator, while the discriminator tries to ‘catch’ them as fakes. Each time a fake is caught, that feedback flows back to the generator, helping it to improve its own accuracy.
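The adversarial loop can be sketched with a deliberately tiny one-dimensional example – a toy illustration of how GAN training works in general, not the paper’s actual image-to-touch model. Here the ‘generator’ is a single linear function trying to imitate samples from a target distribution, and the ‘discriminator’ is a logistic classifier trying to tell real samples from generated ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator: G(z) = a*z + b, starting far from the target data.
a, b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(w*x + c), scoring "real" vs "fake".
w, c = 0.1, 0.0

lr = 0.01
for step in range(2000):
    # "Real" data the generator tries to imitate: samples from N(4, 1).
    real = rng.normal(4.0, 1.0, size=32)
    z = rng.normal(0.0, 1.0, size=32)
    fake = a * z + b

    # --- Discriminator update: push D(real) -> 1 and D(fake) -> 0 ---
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    grad_w = np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # --- Generator update: push D(fake) -> 1, i.e. fool the discriminator ---
    d_fake = sigmoid(w * fake + c)
    # Gradient of the non-saturating generator loss -log D(fake) w.r.t. fake.
    g_signal = (d_fake - 1.0) * w
    a -= lr * np.mean(g_signal * z)
    b -= lr * np.mean(g_signal)

# After training, generated samples should have drifted toward the real data.
samples = a * rng.normal(0.0, 1.0, size=1000) + b
print(float(samples.mean()))
```

The key point the example shows is that the generator never sees the real data directly: it improves only through the discriminator’s feedback, which is exactly the mechanism the MIT team exploits to predict one sensory modality from the other.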
Yunzhu Li, the lead author of the paper, told MIT News that continuing to develop the ability to connect these two senses will be important, not only to help AIs understand what they are seeing or feeling, but also to work out how best to interact with objects based on that knowledge.