
What Lessons Can We Learn from DeepMind vs the…

What Happened?

Recently (in December, although the results were only revealed on January 24), a team of artificial intelligence agents created by DeepMind, the same group behind the AIs that mastered Chess and Go, challenged two Team Liquid StarCraft II pro players (TLO and MaNa) to 5-game matches playing as the Protoss race, and defeated each player 5-0. A week later, MaNa defeated a new agent 1-0 for humanity's only win.

What Are the Details?

DeepMind's self-training strategy: human examples to begin with, then competition against other AI agents to improve.

DeepMind is a research organization, owned by Google's parent company Alphabet, that creates self-learning artificial intelligence programs. They're best known for creating the AI agents that mastered Chess and Go at a superhuman level.

Since then, they've turned their attention to StarCraft II. Not only is the e-sport an incredibly fast-paced game with a large number of things for the player to consider at any given moment but, more importantly, it features 'imperfect information', more commonly known as the Fog of War. In chess and Go, the players can see the entire board and judge their next moves using complete information, including the up-to-date positions of the opponent's pieces; in StarCraft II, the only information a player has is what their units can 'see' on the map at any given moment.

DeepMind began by training an initial AI agent on human replays from all leagues. They then used this agent to spawn new agents, which trained further by playing each other in purely AI leagues, where they learned to develop and counter their own strategies, as sketched below.
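A minimal sketch of that two-phase setup, assuming toy stand-ins for the networks and the league. All class names, numbers, and update rules here are illustrative, not DeepMind's actual code; the point is only the shape of the pipeline: imitate humans first, then fork agents and let them train against each other.

```python
# Hypothetical sketch: supervised imitation on human replays, then a
# self-play "league" in which agents are periodically forked and trained
# against each other. Purely illustrative names and numbers.
import copy
import random


class Agent:
    """Stand-in for a policy network; 'skill' is a toy proxy for strength."""

    def __init__(self, skill=0.0):
        self.skill = skill

    def imitate(self, human_replays):
        # Phase 1: supervised learning from human games of all skill levels.
        self.skill = sum(r["mmr"] for r in human_replays) / len(human_replays) / 1000

    def train_against(self, opponent):
        # Phase 2: improve by playing another agent; the outcome here is
        # random, weighted by relative skill, purely for illustration.
        p_win = self.skill / (self.skill + opponent.skill + 1e-9)
        if random.random() < p_win:
            self.skill += 0.05  # reinforce strategies that win
        else:
            self.skill += 0.01  # learn counters from losses


def train_league(human_replays, generations=5, matches_per_gen=100):
    seed = Agent()
    seed.imitate(human_replays)
    league = [seed]
    for _ in range(generations):
        # Fork a new agent from an existing one, so the league keeps old
        # strategies around for newer agents to counter.
        league.append(copy.deepcopy(random.choice(league)))
        for _ in range(matches_per_gen):
            a, b = random.sample(league, 2)
            a.train_against(b)
    return league


if __name__ == "__main__":
    replays = [{"mmr": random.randint(3000, 6500)} for _ in range(1000)]
    league = train_league(replays)
    print([round(agent.skill, 2) for agent in league])
```

Keeping older agents in the league is the important design choice: a single agent training only against its latest self can forget how to beat strategies it no longer uses.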

Once ready, the DeepMind team reached out to Blizzard for a pro player to test their AI agents on, and TLO was the name given. The Little One is a German pro known for his ability to play all races, and for being an all-round nice guy. They chose a Protoss vs Protoss match-up to reduce the complexity for the AI agents. A 5-game match was agreed upon and played in early December. TLO lost 5-0.

However, it was only revealed after the match that TLO had been playing five different agents, each with its own preferred strategies. His standard method of adapting to an opponent over the course of a series didn't work, because he was effectively facing five different opponents.

Given that TLO does not play Protoss as his main race (he's a Zerg player), DeepMind then reached out to his teammate MaNa, who was also beaten 5-0, this time by a set of agents that had trained for a week longer. Some interesting revelations occurred during that match, which led DeepMind to train a new agent, which MaNa managed to defeat 1-0.

What Does it Mean?

To start with, it's important to note that the single human win was, arguably, the fairest game of the challenge, as it used a new agent trained over one week to act only on what was showing in the camera view of the screen, just as a human player would. It appears this was done because it came to light in the 9th game (the 4th game against MaNa) that the AI had perfect micro (individual unit control) while managing three squads of units in vastly different parts of the map, something no human player could do. Presumably, while the original AIs were handicapped to human actions per minute (APM), they primarily used the mini-map to manage units, giving them a massive advantage over humans, who cannot do that (at least partly because the mini-map occupies a tiny corner of the screen).
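To make the camera restriction concrete, here is a small, hypothetical illustration of what 'only acting on what is in the camera view' means: the agent's observation is filtered to units inside the camera rectangle, so reacting to anything elsewhere first costs a camera-move action. The function names and camera dimensions are assumptions for illustration, not the actual AlphaStar interface.

```python
# Hypothetical illustration of a camera-restricted observation: the agent
# only "sees" units whose positions fall inside the current camera
# rectangle, rather than everything visible on the mini-map.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Unit:
    owner: str
    pos: Tuple[float, float]


def visible_to_agent(units: List[Unit],
                     camera_center: Tuple[float, float],
                     camera_size: Tuple[float, float] = (24.0, 16.0)) -> List[Unit]:
    """Return only the units inside the camera rectangle (assumed size)."""
    cx, cy = camera_center
    w, h = camera_size
    return [
        u for u in units
        if abs(u.pos[0] - cx) <= w / 2 and abs(u.pos[1] - cy) <= h / 2
    ]


# The earlier agents could effectively act on the whole observable map at
# once; a camera-restricted agent must spend actions moving the camera
# before it can react to units elsewhere, much like a human player.
units = [Unit("enemy", (10, 10)), Unit("enemy", (100, 80)), Unit("self", (12, 14))]
print(visible_to_agent(units, camera_center=(12, 12)))
```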

The other point worth noting is that the human players commented that they could not adjust to the AI's tactics from match to match. It later came to light that this was because there were, in fact, five different AI agents, each with its own preferred strategies, rather than one. This also put the human players at a vast disadvantage: it was equivalent to playing a team of five pros while knowing nothing about their opponents' tactics, something that would rarely happen in human pro matches.

On the positive side, there were some interesting strategies that the AIs used repeatedly. While the human-like AI walled off its base, many of the other AIs didn't. All of the AIs, however, over-saturated their gas mining in the early game, choosing to expand later than their human opponents. It became clear that this small difference in income produced a stronger economy, something the human community of Protoss players had underestimated. Of course, these strategies have only been tested in one match-up, so it remains to be seen what other interesting strategies the AI will come up with.

Conclusions

This was a very interesting challenge. The main difference between StarCraft II and board games, and the reason the DeepMind team chose this game, is imperfect information: they wanted to see whether their AIs could perform at human-like levels in an environment where they cannot see everything. Anyone who's played StarCraft knows the Fog of War, the segments of the map you can't see because you don't have units there. Entire battles in WWII were won through creative use of the fog of war, so this is not an idle challenge.

Interestingly, and perhaps a bit worryingly, the DeepMind agents, once trained, can run, fully functional, on a laptop. That's right: while the training phase takes a vast amount of resources (equivalent to 50 GPUs for each agent), the finished agent can then function on a device that any person can buy. Scarier still, an agent can be trained from novice to pro level in a week, amounting to the human equivalent of dozens, or even hundreds, of years of play.

While the real-world consequences of this technology are obviously frightening, in-game I'm looking forward to the pros training with AIs to develop new and interesting strategies, perhaps more feints, double feints, and other bits of misdirection. One strategy the last AI showed itself vulnerable to, and confused by, was oracle harassment of its mineral line. A human would have dealt with that easily, while the AI was confused time and time again. Presumably, future agents trained from this one will not be fooled, but it will be interesting to see how man and machine relearn the game together. And I'm looking forward to seeing a single AI agent enter a pro tournament.
