Making Sense of Our Reaction to AlphaStar

Many have discussed the event and analysed how AlphaStar played, so I am not going to repeat what others have already said. Rather, I want to build on others’ reactions and discuss some aspects that the community seems to overlook.

As a social science researcher, I find people’s reaction toward the outcome more interesting than the performance itself. I write this article with the assumption that you have watched the AlphaStar demonstration.

Background

In 2016, after DeepMind’s AlphaGo defeated Lee Sedol in a historic go match, DeepMind signaled their intention to make Starcraft their next challenge. Fast forward to 2019: AlphaStar, the code name given to DeepMind’s Starcraft AI, defeated professional players TLO and MaNa (Team Liquid’s article sums up the demonstration). All the matches were Protoss versus Protoss.

You can learn more about AlphaStar in DeepMind’s blog post.

Expectation

Before the demonstration, the Starcraft community appeared to expect the AI to emerge victorious (reddit threads before the event: 1 and 2). Here are some comments:

  • “The AI must be pretty competent at this point, otherwise I’d doubt they’d show it off. I honestly don’t think it’ll be long before it starts taking out the top pros. Technology advances so quickly these days.”
  • “Considering the amazing work they did with AlphaGo, I can’t wait to see what they do with Starcraft.”

I shared the same sentiment. Before the demonstration, I thought that if DeepMind followed their approach with go, they must be confident of defeating top pros; otherwise they would not show it. They indeed followed a comparable path by first testing against a high-level player who is not among the very best (Fan Hui in go). TLO, who is a Zerg player, plays Protoss at a very high but not top-tier level. That was a pretest for the main study, which was to play against a top Protoss pro like MaNa.

However, the atmosphere was drastically different from what I observed three years ago, when AlphaGo convincingly defeated Lee Sedol. In an interview, the legendary BoxeR was very confident that AI could not defeat Starcraft progamers. In another interview, Flash also said that he could defeat the AI when the day comes. I wrote in a 2016 article that they were showing the common sign of “unwarranted” confidence displayed throughout history by many experts who were challenged by computers. The go community was equally confident before the AlphaGo event, as if losing to AI in a game that humans could not truly master after so many years were simply unthinkable.

Notably, both BoxeR and Flash highlighted dealing with imperfect information and adapting to real-time changes as the reasons they believed AI could not match a human player. Put another way, they believe humans are superior to AI in real-time strategic and tactical thinking. This fundamental belief is at the core of the key debates we see after the event.

Mechanics and strategies

There is no doubt that a computer (bot or AI) is superior to humans in mechanics (i.e., macro and micro). What a computer can achieve “mechanically” is well demonstrated in the two videos below. Hence, it is logical to place our focus on AlphaStar’s ability to “think” strategically and tactically.

When I was watching the event live, I was very impressed by how AlphaStar played like a human. If you had not told me it was a computer, I would have thought it was a human. Those who have been following AI and machine learning development using Starcraft as the context would notice that many papers limit the AI to specific tasks (i.e., mini games). It is easier for computer scientists to manage and improve the AI when there are fewer factors involved. The challenge is to take the next step and implement it in a full-scale Starcraft game [1; academic references are listed at the end of the article]. It is hard for humans to learn multiple things within Starcraft simultaneously as well, and we learn better and quicker when we break things down and focus on specific tasks. Thus, the fact that AlphaStar played like a human in a full game is an accomplishment.

I am particularly interested in how AlphaStar could adapt to different situations with different strategies and builds, because this is identified as one of the biggest challenges for AI in Starcraft [2]. This is something that we may not see, because DeepMind trains different agents to use different strategies.

Interestingly, however, the Starcraft community places more emphasis on AlphaStar’s great mechanics than on its ability to deal with imperfect information and adapt in real time. I see many comments along the lines that AlphaStar did not really defeat Starcraft progamers. The arguments behind this assertion are generally based on the perception that AlphaStar won by mechanics and not strategy. Indeed, AlphaStar’s macro is close to perfect, and its micro is better than that of progamers, or simply inhuman. The vod below shows AlphaStar out-microing MaNa in a Stalker battle. From the community’s perspective, what is the point of having an AI that beats a human with mechanics when many bots out there can do just that? Isn’t AlphaStar the chosen one to beat humans strategically (and tactically), not mechanically?

That is a fair point, but should we, and can we, separate strategy and mechanics? These two are not independent entities to begin with. Your strategic options are limited by your mechanics. For example, you are not going to hit a two-base timing attack if you cannot even macro on two bases. What the community really wants is to have AlphaStar “play fair” mechanically and compete strategically. DeepMind can restrict AlphaStar’s command input to be as close to a human’s as possible, and this is what they actually did, or at least intended to do. DeepMind highlighted that AlphaStar’s mean and max APM are in fact comparable to the progamers’ (see image below). However, input frequency and speed alone do not make up “mechanical skill”, as a big share of it is inputting the right command as efficiently as possible. The DeepMind team are no fools, so they surely understand this and must have thought about it more than anyone. Then the question is, is there a realistic way to make AlphaStar’s mechanics human-like? Put another way, how can we force AlphaStar to be as inefficient with command input as a human player? From DeepMind’s perspective, going down this path to ensure AlphaStar is mechanically human diverges from its main goals.

DeepMind’s perspective

The Starcraft community largely discusses AlphaStar within the boundary of the game of Starcraft, and we may have overlooked DeepMind’s main goals with AlphaStar. Three years ago, when Demis Hassabis (co-founder of DeepMind) was asked about his interest in beating Starcraft, he said:

“Maybe. We’re only interested in things to the extent that they are on the main track of our research program. So the aim of DeepMind is not just to beat games, fun and exciting though that is.”

It is clear that DeepMind’s main goal with AlphaStar was never to prove and convince us that they can develop an AI which can defeat Starcraft progamers on our terms. Starcraft is merely a context for making progress in the field of AI. It was never about Starcraft or go. These games are selected as contexts because of what they represent in terms of complexity for AI development (see image below). Thus, in line with their goals, DeepMind set an input boundary that is “reasonably human-like”, and then tackled the game features they are truly interested in (e.g., imperfect information).

If we want to debate the legitimacy of AlphaStar’s “victory” over Starcraft progamers due to its inhuman mechanics, we should also consider other seemingly inhuman moves it showed. For example, AlphaStar always “looked” at the right place, while human players compete under an “economy of attention” (i.e., attention is a finite resource) and must direct it to the right place at the right time. The game is designed and balanced with this in mind, and players take advantage of it by stretching the opponent’s attention across multiple locations. In a way, one can argue that AlphaStar is not playing at an inhuman level, but rather showing us what a “theoretically perfect” human is capable of.

In fact, we play the game knowing that our human opponents are not perfect. In the first game against MaNa, AlphaStar did a proxy multi-Gateway rush, and it decisively took the game by attacking up the main ramp. MaNa’s post-game comment was:

“AlphaStar is not scared about the ramp. If I am playing against a human player right there, nobody is going up that ramp. Especially after I had scouted the four gate like twice.”

This kind of assumption is inseparable from our strategic planning. In this article, in which I discuss sOs’s strategic reaction to a Terran proxy, sOs used two Stalkers and a Shield Battery to defend the main ramp against two Cyclones. The Terran did not know what was on top of the ramp, so they turned back. Holding the ramp at that time was a big part of sOs’s strategy. With hindsight, what sOs had may not have been sufficient to deal with the two Cyclones.

Our assumptions about an opponent’s human tendencies are not limited to strategy. The vod below is a good example of how we make short-term tactical choices with that in mind. INnoVation pushed his units out to secure his third, and Trap moved his Stalkers down the left ramp. Knowing where the Stalkers went, INnoVation sieged up his two Tanks (and placed a Marine for vision) to deter Trap’s units from coming from the left, while he moved his bio down the right ramp to defend the third. In INnoVation’s mind, Trap would blink his Stalkers away immediately once the two Siege Tanks shot at them as they came up the left ramp. To his surprise (to which he responded by typing “?”), Trap did not instinctively blink back and instead blinked forward to kill the two Tanks.

While DeepMind’s main goal is not to convincingly defeat Starcraft pros strategically, I believe there is room for improvement in how they demonstrate that AlphaStar excels in strategic thinking. Unlike in the ten games that AlphaStar won, it could no longer see the whole map at once in the only game it lost. One can attribute the non-intuitive unit movements in that specific game to this difference. Brownbear explained this pretty well (see vod below).

I hasten to add that appreciating where DeepMind comes from academically and critiquing the mechanical setup of AlphaStar are not two opposing sides. Rather, I believe what we observe is the common misunderstanding that the general public has of academic research. Academics are mostly interested in broad phenomena that are generalisable. To study them, they have to pick certain contexts which they perceive as suitable for examining the phenomena of interest. The contexts are not the focus most of the time, and from a conceptual standpoint, the established findings should be applicable to other comparable contexts that the academics did not test. However, the general public often focuses on the context and misunderstands the true value of the studies.

The “bat and ball problem” is a good and interesting example:

A bat and a ball cost $1.10 in total. The bat costs $1 more than the ball. How much does the ball cost?

This question was introduced by Kahneman (a Nobel Laureate) and Frederick [3] as a way to demonstrate that we take mental shortcuts and come to conclusions intuitively. Many answered $0.10, but the correct answer is $0.05. Many students from top universities (e.g., Harvard and MIT) answered $0.10. A key purpose of using students from top universities is to show that the cognitive tendency to rely on intuition is not simply due to IQ. However, this question got famous not because of the underlying reason, but because most Harvard students got it wrong. My soon-to-be father-in-law once asked me this question, and he emphasized its difficulty by highlighting that most Harvard students got it wrong.
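
For anyone who wants to check the arithmetic, here is a quick sketch. The symbol b, standing for the price of the ball in dollars, is my own notation for illustration and is not taken from [3]. The bat then costs b + 1.00, and the two prices must add up to 1.10:

```latex
\begin{align*}
b + (b + 1.00) &= 1.10 \\
2b &= 0.10 \\
b &= 0.05
\end{align*}
```

The intuitive answer fails the same check: with a $0.10 ball, the bat would cost $1.10 and the total would be $1.20 rather than $1.10.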

I face a similar challenge when I try to explain the experiments I conducted to laymen. The point I’m trying to highlight is that academics look at a problem differently because they start from a distinct premise. By focusing on the context (i.e., Starcraft), we overlook what DeepMind really intends to do.

Public reception

Unsurprisingly, this demonstration was reported by the mainstream media. Most did a good job. I find it funny and interesting that many avoided using the word “race”. For example, CNN replaced “race” with “group”, saying that “players can take on the roles of three different galactic groups (Terran, Zerg, or Protoss) and fight to control the galaxy.” I doubt the journalists are unaware that “race” is the default term, so it shows they want to tiptoe around the word and its implications. If I were to say that Starcraft players are racists, we would laugh, but the general public would probably raise their eyebrows and reach for the pitchforks. I was even joking recently that only Protoss can talk about balance nowadays, and Terran do not have that racial privilege. I don’t think that would go well if it were taken out of context. lol.

The demonstration was a little disappointing overall. The AlphaGo event was a big deal, and the go community took full advantage of the buzz to market the game to the world. In contrast, the AlphaStar event looked lackluster and forgettable. Three years ago, I wrote that “simply by suggesting that Starcraft may be the next project, DeepMind is already contributing to the popularity of Starcraft. If DeepMind formally challenges Starcraft one day, Starcraft will be talked about by everyone everywhere.” That definitely did not turn out to be the case, and it is an opportunity missed.

I believe we as a community did not do our part to make the best of this opportunity to introduce the game to the public in a welcoming manner. We took a very defensive stance toward the claim that AlphaStar defeated Starcraft progamers. While I understand that DeepMind made a disputable claim that “AlphaStar’s success against MaNa and TLO was in fact due to superior macro and micro-strategic decision-making, rather than superior click-rate, faster reaction times, or the raw interface”, we must also understand that the general public has no idea what actually happened! Many may have heard about Starcraft for the first time. This should have been a great opportunity to educate people and introduce them to the game with open arms, but we picked the wrong fight with the wrong people instead (see tweet below). We definitely could do better in representing this beautiful game.

To learn how to handle the situation, we need look no further than the recent incident involving Stephen Curry and NASA. Stephen Curry, who is an NBA superstar, commented on Winging It (a basketball podcast) that he did not believe we had been to the moon. NASA replied maturely and constructively by inviting Curry to tour the lunar lab at Johnson Space Center in Houston. They made the best of the situation by riding on Curry’s influence to educate the public.

I did my part three years ago. Let’s do our part.


Academic references

[1] Vinyals, O., Ewalds, T., Bartunov, S., Georgiev, P., Vezhnevets, A. S., Yeo, M., … & Quan, J. (2017). StarCraft II: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782.

[2] Cho, H. C., Kim, K. J., & Cho, S. B. (2013, August). Replay-based strategy prediction and build order adaptation for StarCraft AI bots. In Computational Intelligence in Games (CIG), 2013 IEEE Conference on (pp. 1-7). IEEE.

[3] Kahneman, D., & Frederick, S. (2001). Representativeness revisited: Attribute substitution in intuitive judgment. Heuristics and biases: The psychology of intuitive judgment, 49, 81.



9 thoughts on “Making Sense of Our Reaction to AlphaStar”

  1. I was looking forward to see your take on the whole ‘alphastar situation’. Not disappointed in the slightest. Thanks.

    ps/unrelated. Yeah, there is some kind of ‘terran balance whine’ shaming, so there is race elitism (X is THE HARDEST race), and stuff like ‘Z is so easy’, or ‘protoss f2 a-move race’. So you could say that starcraft community as a whole is ‘racist’ af. Which I personally find rather amusing.

    1. Thanks! I actually was drafting an article with a more “typical” analysis focusing on AlphaStar’s play, but brownbear posted a video that shares many similar points I want to make. He did a great job! So I scrapped the whole article and wrote a new one with a completely different angle. That took some time.

    2. It’s funny because I’m the kind of player who looks for improvement. But sometimes I have the feeling that Protoss need fewer actions to beat me. At my level, I know that it’s relative. You can beat most players on ladder with good macro, but sometimes it’s easier to criticize the race than ourselves. I’m Terran and I’m very comfortable at my level in TvP in a macro game, because if I place my composition in a good spot I win without double drops and intense micro. Except when I see DTs, I’m out ^^ I just wanted to point out that a match-up is relative to everyone. It’s not that the opponent has the best options against you. It’s up to you to find how the opponent can be weak when you put him in a bad situation with a good composition of your own. So stop whining about the race; nobody is perfect. We just have to improve ourselves.

      PS: sorry for this long post.

  2. If you want to beat an AI in Starcraft, the way is to drop and drop again. I played against an AI in the past and I didn’t have the level to compete against it. But as a Terran player against a Zerg who has better income than me, I just drop his main again and again until he loses. Because no bot can anticipate your drop movement, which can be anywhere on the map, for now it’s the main problem for the DeepMind team.

  3. Gwern makes a great point here (https://www.reddit.com/r/reinforcementlearning/comments/ajeg5m/deepminds_alphastar_starcraft_2_demonstration/?st=jrqknmrk&sh=6a314dd1)

    “From my point of view, people are vastly overrating the small absolute differences between individual human players and underrating the immense amount of work which goes into providing an approach which works at all to reach human level, after which it takes a lot less work to surpass human level… I suppose this is an example of “the narcissism of small differences” – to a sheep, other sheep look very distinct.”

    1. Hey David, it’s been some time! Indeed that’s a good point. This shows how people of different background and knowledge look at the event and understand it differently.
