DeepMind is now getting superhuman performance on certain Atari games with Q-learning-based reinforcement learning. Their new architecture outperforms all previous approaches on six of the seven games they tested and surpasses a human expert on three of them.
This is much more impressive than the 2011 results where Joel Veness programmed MC-AIXI to play Pacman. First, the Pacman demo only played at sub-human levels. And second, the input data MC-AIXI got about what was happening in the game was heavily preprocessed. So it can be argued (and I would argue) that a lot of the structure of that game was effectively digested and hand-coded into MC-AIXI's game-playing harness by Joel when he wrote the preprocessor.
However, these new results from DeepMind are achieved with no game-specific preprocessing and no per-game tuning of parameters. They just feed the raw pixels from the Atari games into an opaque, black-box reinforcement learner that combines Q-learning with deep neural nets. Their "experience replay mechanism" appears to be one of the more valuable, novel innovations behind this high level of performance. It's described in more detail in the full paper.
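To make the idea concrete, here is a minimal sketch of Q-learning with experience replay on a toy 5-state chain MDP. This is not DeepMind's code: the tabular Q stands in for their deep network, and all names, sizes, and hyperparameters here are illustrative assumptions. The key point is that updates are drawn from a random minibatch of stored past transitions rather than only the most recent one.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of past (state, action, reward, next_state, done) transitions."""
    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Random minibatch breaks the correlation between consecutive transitions.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def env_step(state, action):
    """Toy chain of 5 states; action 1 moves right, 0 moves left. Reward 1 at the right end."""
    next_state = max(0, min(4, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4

random.seed(0)
Q = [[0.0, 0.0] for _ in range(5)]   # tabular Q-values stand in for the deep net
buffer = ReplayBuffer()
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # illustrative hyperparameters

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = env_step(state, action)
        buffer.add((state, action, reward, next_state, done))
        # Learn from a replayed minibatch, not just the latest transition.
        for s, a, r, s2, d in buffer.sample(8):
            target = r if d else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
        state = next_state

# After training, moving right should look better than moving left near the goal.
print([round(max(q), 3) for q in Q])
```

In the full system the minibatch would be used to take a gradient step on the network's Q-value estimates, but the replay logic itself is the same: store experience, then train on randomly sampled slices of it.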