Researchers pitted this algorithm against Atari's 'Pitfall' and, where previous systems were unable to score a single point, it already outperforms humans
Can an algorithm explore complex environments on its own, or does everything have to be spoon-fed to it? In other words, can we train artificial intelligence systems to make decisions by exploring and understanding complicated environments, and to learn to collect rewards in an optimal way? That is the question Adrien Ecoffet, Joost Huizinga and their colleagues have been trying to answer for years, and the truth is that it is a hard one.
Fortunately, we have video games.
Algorithms vs video games
If we think about it for a moment, video games are a fantastic framework for training artificial intelligence in this kind of decision-making (and for testing which methodology works best): they offer everything needed to learn in successively more complex environments, they make it possible to define rewards based on reaching a specific location or completing a level and, in fact, they pose a challenge even for humans themselves.
Ecoffet and his team work with reinforcement learning algorithms and decided to test their new approach on classic Atari video games, specifically 'Montezuma's Revenge' and 'Pitfall'. It is not just a fit of nostalgia: Atari 2600 games have become a 'gold standard' for this kind of system. In fact, until now, algorithms achieved only modest scores in the first and failed miserably in the second: they did not score a single point.
The family of algorithms developed by Ecoffet's team, called Go-Explore, changes things, as they have just published in the journal Nature. The idea is that Go-Explore agents explore environments in depth while building an archive that helps them remember where they have been, making sure they never forget the route to a promising intermediate state or a successful outcome.
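That "remember, return, then explore" loop can be sketched in a few lines. The following is a minimal toy illustration of the core idea, not the paper's actual implementation: it assumes a tiny deterministic one-dimensional environment (a 5-cell corridor with a reward at the far end), and all names (`ToyEnv`, `go_explore`) are illustrative.

```python
import random

class ToyEnv:
    """Toy deterministic corridor: cells 0..4, reward 1.0 at cell 4."""
    def __init__(self):
        self.pos = 0
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):  # action: -1 (left) or +1 (right)
        self.pos = max(0, min(4, self.pos + action))
        reward = 1.0 if self.pos == 4 else 0.0
        return self.pos, reward

def go_explore(iterations=300, seed=0):
    rng = random.Random(seed)
    env = ToyEnv()
    # Archive: state -> shortest known trajectory (action list) reaching it.
    archive = {env.reset(): []}
    best_reward, best_traj = 0.0, []
    for _ in range(iterations):
        # 1. Select a previously visited state from the archive.
        state = rng.choice(list(archive))
        # 2. Return: replay its stored trajectory (environment is deterministic).
        env.reset()
        traj = list(archive[state])
        for a in traj:
            env.step(a)
        # 3. Explore: take a few random actions from that state.
        for _ in range(3):
            a = rng.choice([-1, 1])
            s, r = env.step(a)
            traj.append(a)
            # 4. Archive new states; keep the shorter route to known ones.
            if s not in archive or len(traj) < len(archive[s]):
                archive[s] = list(traj)
            if r > best_reward:
                best_reward, best_traj = r, list(traj)
    return best_reward, best_traj

reward, traj = go_explore()
print(reward)  # 1.0 once the goal cell has been found
```

The key design point this sketch captures is that exploration always resumes from the frontier of previously reached states rather than from scratch, which is what lets Go-Explore reach deep, sparsely rewarded states that random exploration almost never stumbles into.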
And with those tools, Go-Explore algorithms quadruple previous scores in 'Montezuma's Revenge' and exceed average human performance in 'Pitfall' (where, as noted above, previous algorithms failed to get any points).
After this success, and according to the Nature paper, the researchers applied the same algorithms to robotic tasks that simulate picking up objects with a robotic arm and placing them in hard-to-reach locations, such as behind latched doors. And that is good news: although there is still a long way to go before an AI beats us at Fortnite, the mere fact that they can do it in games from 1982 is a sign that they soon will (and that this has interesting practical applications).
Picture | Atari – Vijoy Rao