Dr David Howard on evolutionary machine learning in robotics
My name is David Howard. I’m a senior research scientist in the robotics and autonomous systems group at DATA61 in Brisbane. DATA61 is part of CSIRO, and we focus on basically merging the digital and physical worlds.
We do some deep learning here, but the thing I focus on is evolutionary learning, and evolutionary learning is quite a bit different.
So with deep learning, you need lots and lots of data, and basically what you’re doing is learning from examples, so that when a new example comes in, it can get classified as one thing or another. What I do is based on genetics, it’s based on Darwinian evolution, and it’s based on competition. So for evolution to occur, you need three things: you need selection, you need variation, and you need heredity. And this is true for natural creatures, it’s true for computer programs, it’s true for robots.
We know that evolution is the most powerful creative force in the world, and the evidence is all around us, right? Especially here in Brisbane: the evolutionary miracle that is a kangaroo is something that no one could really have conceived of without seeing it first. And in fact, when some of the first explorers came back to England, everyone thought they were making it up. And especially with a platypus and things like that, people just didn’t think it could exist. The point is that evolution is really, really powerful. It’s creative, it creates novelty, diversity, all of these things that are really useful when you think about them in the context of learning.
Well, what we do is, instead of having a single solution which we optimize and optimize and optimize, we have a population, okay. We take inspiration from nature, we have a population of different solutions, and initially, these are random. So if I was a robot, I might have a different number of wheels, or they might be in a different location on my body, or my body might be a different length or width or whatever. We have a population of different robots, and they’re all random. What we then do is test them on the problem that we want them to solve.
A really simple example in this case would be going as far as you can from your initial point within five seconds: the further you go from that point, the higher the score you’re going to get. And we call that score “fitness”, as in survival of the fittest; we want the fittest to survive.
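The distance-based score described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `fitness`, the robot names, and the end positions are hypothetical, and in practice the end position would come from a simulated or real five-second trial.

```python
import math

def fitness(start_xy, end_xy):
    """Score a trial by the straight-line distance between start and end points."""
    return math.dist(start_xy, end_xy)

# Hypothetical end positions (in metres) after a five-second trial,
# with every robot starting at the origin.
start = (0.0, 0.0)
end_positions = {
    "robot_a": (0.3, 0.4),
    "robot_b": (1.5, 2.0),
    "robot_c": (0.0, 0.1),
}

# Higher distance means higher fitness, so sort from best to worst.
scores = {name: fitness(start, end) for name, end in end_positions.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` lists the robots from fittest to least fit, which is the ordering that selection works from.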
These solutions are initially rubbish. They’re random, they’re not going to be that good, but some of them are going to be better than others. And what we do is we take the ones that are better, and we say that they can be “parents”, so they have the opportunity to reproduce, again in the Darwinian sense, to create children that are a little bit like them. And then if you think about how we might encode a robot: what is the genome of a robot? It’s just a load of numbers, right? So it might have a length, which is a number (10 centimeters, eight centimeters), a width, similarly, and a number of wheels, which would just be an integer (two, four, six, however many you want), and we can treat that as a genome.
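As a minimal sketch of the "load of numbers" idea, a robot genome with a length, a width, and a wheel count could be written as a small Python data class, together with a random initial population. The class name `RobotGenome`, the field names, the value ranges, and the population size are all assumptions made for illustration; the interview does not specify them.

```python
import random
from dataclasses import dataclass

@dataclass
class RobotGenome:
    length_cm: float   # body length, a real number
    width_cm: float    # body width, a real number
    num_wheels: int    # wheel count, an integer

def random_genome(rng):
    """Draw one random robot description (the ranges are illustrative assumptions)."""
    return RobotGenome(
        length_cm=rng.uniform(5.0, 30.0),
        width_cm=rng.uniform(5.0, 30.0),
        num_wheels=rng.choice([2, 4, 6]),
    )

# A seeded generator keeps the random initial population reproducible.
rng = random.Random(0)
population = [random_genome(rng) for _ in range(10)]
```

Every entry in `population` is a different random robot, which matches the starting point described above: nothing has been optimized yet.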
Okay, so what we do is we treat that as a genome, and we apply mutations to it, so we might randomly change it so that, instead of having four wheels, it’s got five wheels when the genome passes from the parent to the child. Similarly, we can cross over: we can have two parents, and we can pick a point in both of their genomes and cross them, so that the child has a little bit of parent A and a little bit of parent B.
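The two operators just described, mutation and crossover at a chosen point, might look like the following sketch. It assumes a genome stored as a three-element list `[length_cm, width_cm, num_wheels]`; the function names, the mutation rate, and the step sizes are hypothetical choices, not values from the interview.

```python
import random

def mutate(genome, rng, rate=0.2):
    """Return a changed copy: each gene is perturbed with probability `rate`."""
    length, width, wheels = genome
    if rng.random() < rate:
        length = max(1.0, length + rng.gauss(0.0, 1.0))   # small change in length
    if rng.random() < rate:
        width = max(1.0, width + rng.gauss(0.0, 1.0))     # small change in width
    if rng.random() < rate:
        wheels = max(1, wheels + rng.choice([-1, 1]))     # e.g. four wheels become five
    return [length, width, wheels]

def crossover(parent_a, parent_b, rng):
    """One-point crossover: genes before the cut come from A, the rest from B."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:]

rng = random.Random(1)
parent_a = [10.0, 8.0, 4]
parent_b = [20.0, 12.0, 6]
child = mutate(crossover(parent_a, parent_b, rng), rng)
```

Because the cut always falls strictly inside the genome, the child is guaranteed to carry at least one gene from each parent before mutation is applied.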
The idea is that we are selecting based on the fitness score, picking the things that are good, and we’re creating our new population from things that are good, meaning that they’re going to get better, on average, from iteration to iteration, which we call generation to generation. As we proceed through these generations, all we need is this genome, this numerical description of a thing (a robot in this case), some operators that work on it, and some selection pressure, which we get from the fitness, and that’s it.
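Putting the pieces together, one possible generational loop is sketched below. Several things here are assumptions rather than anything stated in the interview: `toy_fitness` is a made-up stand-in for a real five-second driving trial, the better half of each generation is kept as parents, and those parents are carried over next to their children (one common variant among many).

```python
import random

def toy_fitness(genome):
    # Stand-in for a real trial: rewards designs near an arbitrary target
    # (20 cm long, 10 cm wide, 4 wheels). Purely illustrative.
    length, width, wheels = genome
    return -(abs(length - 20.0) + abs(width - 10.0) + abs(wheels - 4))

def random_genome(rng):
    return [rng.uniform(5.0, 30.0), rng.uniform(5.0, 30.0), rng.randint(1, 8)]

def make_child(parent_a, parent_b, rng):
    # Heredity: one-point crossover gives the child genes from both parents.
    cut = rng.randint(1, len(parent_a) - 1)
    child = parent_a[:cut] + parent_b[cut:]
    # Variation: mutate one randomly chosen gene.
    gene = rng.randrange(len(child))
    if gene == 2:
        child[gene] = max(1, child[gene] + rng.choice([-1, 1]))
    else:
        child[gene] = max(1.0, child[gene] + rng.gauss(0.0, 1.0))
    return child

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Selection: rank by fitness and keep the better half as parents.
        ranked = sorted(population, key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(ranked[0]))
        parents = ranked[: pop_size // 2]
        children = [make_child(*rng.sample(parents, 2), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=toy_fitness), best_per_generation

best, history = evolve()
```

Since the parents survive unchanged into the next generation in this variant, the best score recorded in `history` never gets worse from one generation to the next.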