For robots to adapt to new situations and environments, or solve unique problems, they need to be able to learn. One solution is reinforcement learning – a broad term for methods in which an agent learns, through trial and error, to maximise a particular reward within a particular scenario.
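To make that definition concrete, here is a minimal sketch of reinforcement learning: a tabular Q-learning agent discovers, by trial and error, which actions maximise reward in a tiny five-state corridor. The environment, parameters, and reward here are purely illustrative – they are not from the paper.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 yields reward 1
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration

# Q-values: expected future reward for taking action a in state s
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Move within the corridor; reward 1 only at the rightmost state."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        nxt, r, done = step(s, a)
        best_next = max(Q[(nxt, x)] for x in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = nxt

# After training, the greedy policy heads right from every non-terminal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
print(policy)
```

The agent is never told how to reach the goal; it simply repeats actions that led to higher reward – the same underlying idea, at toy scale, as a robot rewarded for forward progress.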

While research, testing, and development of reinforcement learning algorithms are ongoing, their application in the physical world is still being explored.

Now, according to a recent paper, a team at Google has used existing learning algorithms to develop a robot capable of teaching itself to walk. The researchers behind the project believe that robots capable of walking will be more useful in the future – and the results of the test were a step in the right direction.

The team's approach is unusual. The standard method is to run a virtual mock-up of the robot through a simulated world, letting the algorithm fail at tasks until it eventually learns what needs to be learnt. This avoids risks such as a real robot in a real environment breaking when it makes errors. The team at Google, however, decided that similar training could be done in the real world with the actual robot, without putting it, or others, in too much danger.

To do this, the team set up a small designated area with specific terrains, limited the robot's movements, and made sure that the robot would either turn around or attempt to walk backwards once it reached the limits of the training area.
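The boundary-handling behaviour described above can be sketched as a simple skill selector: when the robot nears the edge of its training area while heading outwards, it switches from walking forwards to turning around or backing away. The names, thresholds, and arena size here are assumptions for illustration, not values from the paper.

```python
# Hypothetical workspace dimensions (not from the paper)
ARENA_HALF_WIDTH = 1.5   # metres from the centre of the training area
EDGE_MARGIN = 0.3        # start reacting this close to the edge

def choose_skill(x, heading_outward, can_turn):
    """Pick which locomotion skill to practise next, keeping the
    robot inside the designated training area."""
    near_edge = abs(x) > ARENA_HALF_WIDTH - EDGE_MARGIN
    if near_edge and heading_outward:
        # Re-centre: prefer turning around; otherwise walk backwards.
        return "turn_around" if can_turn else "walk_backward"
    return "walk_forward"

print(choose_skill(0.0, heading_outward=False, can_turn=True))   # walk_forward
print(choose_skill(1.4, heading_outward=True, can_turn=True))    # turn_around
print(choose_skill(1.4, heading_outward=True, can_turn=False))   # walk_backward
```

Logic like this lets training continue unattended: instead of a person carrying the robot back to the centre after every episode, re-centring is itself one of the behaviours being practised.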

The team also points out that there was considerable human intervention – particularly during earlier, less restricted testing. This included hard-coding solutions to work around particular problems, and using a motion-capture system to provide accurate data on the robot's location within the testing area.

Despite this level of intervention, the algorithm still required the robot to move by itself to get a sense of the environment – a truly embodied learning situation for the robot.

Within two hours, the robot was able to walk properly. As seen in the above video of the test, the algorithm accumulated enough data to successfully stand, walk, and turn both left and right. It could then use this knowledge to walk backwards if it couldn't turn around.

Following this, the team found that the algorithm's repertoire of knowledge could be applied to different scenarios. Since it already understood the basics of movement on the terrain it was trained on, the robot was able to quickly adapt to new types of terrain. This is clearly beneficial: the reinforcement learning algorithm cuts down the time needed to learn new tasks that resemble previously learnt ones.
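Why adapting is faster than starting over can be shown with a deliberately tiny toy model: fine-tuning a "policy" that already encodes basic locomotion needs far fewer update steps on a new terrain than training from scratch. The one-dimensional gait parameter and terrain targets below are made up for illustration and bear no relation to the paper's actual method.

```python
def train(theta, target, lr=0.2, tol=0.01):
    """Nudge a single gait parameter theta toward a terrain's ideal
    value; return the steps taken (a stand-in for environment samples)."""
    steps = 0
    while abs(theta - target) > tol:
        theta += lr * (target - theta)
        steps += 1
    return theta, steps

# Train from scratch on flat ground (hypothetical ideal parameter 1.0).
theta_flat, steps_scratch = train(theta=0.0, target=1.0)

# A new terrain needs a slightly different gait (hypothetical target 1.2).
# Fine-tuning from the flat-ground policy...
_, steps_finetune = train(theta=theta_flat, target=1.2)
# ...versus learning the new terrain from zero.
_, steps_new_scratch = train(theta=0.0, target=1.2)

print(steps_finetune < steps_new_scratch)  # True
```

The fine-tuned run starts close to its goal, so it converges in far fewer steps – the same intuition, at toy scale, behind reusing learnt locomotion skills on new terrain.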

While the team has noted that considerable human intervention and hard-coding were involved over the course of the training, the algorithm's ability to adapt and walk across new terrains, based on its refined approach to movement, is a testament to the power of reinforcement learning, and an insight into the future of artificial intelligence.

Paper: https://arxiv.org/pdf/2002.08550.pdf