From CS294-10 Visualization Sp11
Domain and Initial Questions
One of my hobbies is playing Starcraft II. Blizzard, the company that made the game, periodically posts the Top 200 players in North America, and these datasets are quite popular (at least among those who play the game). The datasets, labeled "Starcraft II Top 200," are posted here. Here is an example of one of them.
These datasets in table format already answer some of my questions, such as who is the current top player, and I can quickly search through the table for specific people by using ctrl-F to see their rank, wins, losses, and race.
However, I have some other burning questions related to the datasets. What is the most played race among these top players? What are the wins and losses associated with these races? Have these changed over time?
I wrote a Python script to scrape the site, going through all its pages to get all of the Top 200 posts (except the very first few, which didn't have the complete data). Besides getting the data shown in the tables, I added a "date" column containing the date of the blog post, since I wanted to track the time aspect. The datasets I scraped run from September 2010 to the present (February 2011). All graphs are generated using Tableau.
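The scraper itself is not reproduced here. As a rough sketch of the parsing step only, the following assumes each post contains an HTML table with five columns (rank, player name, race, wins, losses) and that the post date is already known. The class and function names, the column order, and the sample HTML are all hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows built from <th> stay empty
                self.rows.append(self._row)
            self._row = None


def parse_post(html, post_date):
    """Turn one post's table into records, adding the "date" column."""
    parser = TableRowParser()
    parser.feed(html)
    records = []
    for row in parser.rows:
        if len(row) != 5:  # assumed layout: rank, name, race, wins, losses
            continue
        rank, name, race, wins, losses = row
        records.append({
            "date": post_date,
            "rank": int(rank),
            "name": name,
            "race": race,
            "wins": int(wins),
            "losses": int(losses),
        })
    return records


# Made-up sample, not real ladder data.
SAMPLE_HTML = """
<table>
<tr><th>Rank</th><th>Player</th><th>Race</th><th>Wins</th><th>Losses</th></tr>
<tr><td>1</td><td>PlayerA</td><td>Terran</td><td>120</td><td>80</td></tr>
<tr><td>2</td><td>PlayerB</td><td>Zerg</td><td>95</td><td>70</td></tr>
</table>
"""

records = parse_post(SAMPLE_HTML, "2011-02-01")
for record in records:
    print(record)
```

Storing the date as an ISO string (YYYY-MM-DD) keeps the records sortable by time without any extra parsing.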
Data Assessment and Iterations
To determine the most popular race, I graphed the number of players and their races.
We can see that Terran is clearly the most popular race, and Random is the least. This makes sense, as professional/top players tend to pick one race and stick with it, so Random is rarer.
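The graph was built in Tableau, but the same tally can be sketched in a few lines of Python. The records below are made up for illustration, and the field names are assumptions rather than the actual column names in my data.

```python
from collections import Counter

# Made-up records for illustration; one row per player per Top 200 post.
sample_records = [
    {"name": "PlayerA", "race": "Terran"},
    {"name": "PlayerB", "race": "Terran"},
    {"name": "PlayerC", "race": "Terran"},
    {"name": "PlayerD", "race": "Protoss"},
    {"name": "PlayerE", "race": "Protoss"},
    {"name": "PlayerF", "race": "Zerg"},
    {"name": "PlayerG", "race": "Zerg"},
    {"name": "PlayerH", "race": "Random"},
]


def players_per_race(records):
    """Count how many records list each race."""
    return Counter(record["race"] for record in records)


counts = players_per_race(sample_records)
for race, count in counts.most_common():
    print(race, count)
```

Note that counting rows across several posts counts the same player once per post, which is fine for comparing races but is not a count of distinct players.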
Now, what are the win and loss scores for each of these races?
From this it does look like each race wins more than it loses, but do players of a particular race do "better," as in have higher win/loss ratios?
Here we have the average win/loss ratios of all players for each race. From this we see that top players generally perform about the same no matter what race they play, though the Terrans do seem to do slightly better.
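For reference, here is a minimal sketch of one way to compute this average: take each player's wins divided by losses, then average those ratios within each race. This is an assumption about the aggregation (Tableau could also be set up to divide total wins by total losses, which gives a slightly different number), and the records are made up.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration.
sample_records = [
    {"race": "Terran", "wins": 120, "losses": 80},
    {"race": "Terran", "wins": 90, "losses": 60},
    {"race": "Zerg", "wins": 70, "losses": 50},
]


def average_ratio_by_race(records):
    """Average the per-player win/loss ratios within each race."""
    ratios = defaultdict(list)
    for record in records:
        if record["losses"] > 0:  # skip undefined ratios
            ratios[record["race"]].append(record["wins"] / record["losses"])
    return {race: mean(values) for race, values in ratios.items()}


averages = average_ratio_by_race(sample_records)
for race, ratio in sorted(averages.items()):
    print(f"{race}: {ratio:.2f}")
```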
Now that I can see how these races perform relative to each other, I'm interested in seeing if the statistics have been changing over time.
We can see that Terrans have been the most popular and Random the least throughout this time period. Interestingly, the popularity of Protoss started out higher than Zerg, dipped below them, then rose above them again. What could have caused this change? Unfortunately, the datasets provided cannot explain these changes.
As new strategies develop for each race, players of one race may dominate for some time, until people figure out ways to counter them. Can this be shown in the changes in win/loss ratios over time? As a follower of professional Starcraft (though I'm not a professional myself), I think that Terrans dominated early on, and then Zerg and Protoss started rising as they found ways to counter Terrans and developed their own dominant strategies.
I graphed using a logarithmic scale to see the differences between the races more clearly.
Interestingly, players who chose Random had the highest win/loss ratio at first, then dropped, probably as players started specializing in particular races. As I expected, Terrans dominated the early scene and were then overtaken by Protoss and Zerg for a little while. Many players thought that Terran was overpowered, as perhaps shown by their early dominance, but the other races started developing strategies that could counter the Terrans. Interestingly, the Protoss ratio spiked toward the end, for reasons that cannot be explained by the data. The ratios of the races started to converge towards the present time, as one might expect. During the early stages of a new real-time strategy game such as Starcraft II, people have yet to figure out how to play their races, so there is much volatility in wins and losses. However, as players master the game mechanics, they counter each other's strategies more effectively and perform about equally well regardless of the race they have chosen, assuming that the races are balanced. Win/loss ratios have been falling in general over time, most likely because more and more players have joined the game and thus there is more competition among top players, lowering the ratios.
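The time series behind a graph like this can be sketched by grouping records by race and post date before averaging. The numbers below are made up, and the field names and ISO date strings are assumptions for the sake of the example.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration; "date" is the date of the Top 200 post.
sample_records = [
    {"date": "2010-09-15", "race": "Terran", "wins": 150, "losses": 100},
    {"date": "2011-02-01", "race": "Terran", "wins": 130, "losses": 100},
    {"date": "2010-09-15", "race": "Zerg", "wins": 120, "losses": 100},
]


def ratio_series(records):
    """Return {race: [(date, average win/loss ratio), ...]} in date order."""
    buckets = defaultdict(list)
    for record in records:
        if record["losses"] > 0:  # skip undefined ratios
            key = (record["race"], record["date"])
            buckets[key].append(record["wins"] / record["losses"])

    series = defaultdict(list)
    # ISO date strings sort chronologically, so sorting the keys is enough.
    for race, date in sorted(buckets):
        series[race].append((date, mean(buckets[(race, date)])))
    return dict(series)


series = ratio_series(sample_records)
for race, points in series.items():
    print(race, points)
```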
One question that popped into my mind was, do higher ranked players tend to have higher win/loss ratios?
We can see that although there is a general increase toward the higher ranks, it is quite volatile. This shows the matchmaking system at work. Players who beat relatively lower ranked players get fewer points than those who beat higher ranked players. Those with high ratios may not be as highly ranked if they beat a lot of lower ranked players. This graph still shows that higher ranks tend to correlate with higher win/loss ratios.
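I only judged this relationship by eye from the graph. One way to put a number on it would be a Pearson correlation between rank and win/loss ratio, sketched below with made-up values. Because rank 1 is the best rank, a negative coefficient here means that better-ranked players tend to have higher ratios.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up values for illustration: rank 1 is the top of the ladder.
ranks = [1, 2, 3, 4, 5]
ratios = [1.8, 1.5, 1.6, 1.3, 1.2]

r = pearson(ranks, ratios)
print(f"correlation between rank number and win/loss ratio: {r:.2f}")
```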
Now I'm interested in seeing the best of the best players in North America.
Impressively, SeleCT has been sitting in the top 3 this whole time, and, judging by the relative thickness of his line, he has a relatively high win/loss ratio.
Again I'm interested in win/loss ratios, this time with the top 5 players and their races.
From this we see that a lot of Terrans in the top 5 have relatively high win/loss ratios, while the Zerg players do not. Interestingly, PiQLiQ switched races or uses 2 races and still managed to be in the top 5, with a higher ratio when playing Protoss.
I wanted to put another perspective to wins and losses:
I put average wins on the Y axis and average losses on the X axis. Colors and points depict the players (I chose to add color so that it isn't just a big blob on the bottom left), and sizes depict average rank. As shown by the regression line, the points tend to lie above the 45-degree line (where wins equal losses), meaning that average win/loss ratios in the Top 200 are above 1, which is supposedly the overall average. This makes sense since these players are above average. The visualization also shows that there is some deviation in the win/loss ratios, but not too much. Interestingly, the player Zark has played a ton of games, which is not seen in the other images. A question immediately comes to mind: does playing more games help get you a higher rank and higher win/loss ratio?
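The 45-degree reasoning can be checked directly: a point lies above the diagonal exactly when wins exceed losses, that is, when the win/loss ratio is above 1. The sketch below uses made-up (losses, wins) pairs and an ordinary least-squares fit, which I assume is close to what Tableau's trend line does by default.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def share_above_diagonal(losses, wins):
    """Fraction of points with wins > losses (win/loss ratio above 1)."""
    above = sum(1 for l, w in zip(losses, wins) if w > l)
    return above / len(losses)


# Made-up average losses (X axis) and average wins (Y axis) per player.
avg_losses = [80, 60, 100, 50]
avg_wins = [120, 90, 130, 45]

slope, intercept = fit_line(avg_losses, avg_wins)
share = share_above_diagonal(avg_losses, avg_wins)
print(f"fit: wins = {slope:.2f} * losses + {intercept:.2f}")
print(f"share of players above the diagonal: {share:.2f}")
```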
Here I plotted the total number of games played against average rank. Each point is a player, size is win/loss ratio, and color is race. The regression line shows that there is indeed some correlation between rank and number of games played. However, the points are very scattered, so this cannot be an absolute conclusion. There are some people who play a lot and are at the lower ranks, while some play relatively little and have higher ranks and win/loss ratios. One thing I would like to do is get a dataset with more players, perhaps the top 1000, to test this question better. Unfortunately this data is not easily available. Again, we see the player Zark, who has played a lot of games.
I believe this final visualization addresses my initial questions most clearly. The average win/loss ratio is on the Y axis and date (September to February) is on the X axis. The color depicts race and size depicts number of players. We see that Terran (green) has the thickest line, so it has the most players, followed by Protoss, then Zerg, then Random. The sizes stay about the same over time, meaning that popularity has been fairly stable. The analysis matches that of the Average Win/Loss Ratios over Time graph: Terrans have had the highest win/loss ratio most of the time. Their dominance was briefly broken by periods of Protoss and Zerg reign, and the ratios started to converge to relatively lower values towards the end. The ratios seem to be rising at present for Terran and Protoss while dipping for Zerg. The ratios for Random players have been dropping throughout this time period. We see that the win/loss ratios are between 1.2 and 1.6, meaning these players win more than they lose. This makes sense since these are the top 200 players.
Saung Li - Feb 11, 2011 03:26:03 am
I've posted this work on the Starcraft II forums here, and there has been some great feedback.