A2-WittonChou
From CS294-10 Visualization Fa08
Data Domain
Everybody likes movies. Well... I do, at least, and I'm sure a lot of other people do as well. So naturally, I wanted to know which movies topped the charts this year. Thus, I set out to find data regarding how much each movie earned.
Finding the Data
I stumbled onto an excellent site with that information and more: The Numbers, which has tables of the Top Grossing Movies of 2008. The data included in this table are:
- Rank
- Movie Name
- Release Date
- Distributor
- Genre
- MPAA Rating (G, PG, PG-13, R, Not Rated)
- 2008 Gross
- Tickets Sold
- Inflation-Adjusted Gross
I was able to copy the table directly into Excel with minimal issues.
Initial Visualization (via Tableau)
Having never used Tableau before and having seen only basic demonstrations, I figured I should first check whether I could import my data from Excel and see a basic comparison of each movie's gross. I decided to use the Inflation-Adjusted Gross to account for any inflation between the release date and the present day, although since we're still in 2008, the adjustments are minimal.
Immediately, I could see that The Dark Knight was, by far, the movie that generated the greatest gross revenue this year (although by no means was this a surprise). What did surprise me, though, was how close the Indiana Jones movie was to Iron Man in gross revenue.
Logically, the next step was to sort the data by gross revenue.
This view makes it easy to compare gross revenue across movies, but not much else.
Intermediate Adventures
Then I wondered how the companies that released these top 100 films did. The result is movies grouped by Distributor and sorted by gross.
Here we get a sense of how many movies made it into the top 100 from each Distributor by the height of the major cell divisions (assisted by the highest gross generating movie's bar). We also see the distribution of gross revenue between the movies from a particular Distributor.
It seemed like Warner Bros put most of their effort into making The Dark Knight the blockbuster hit of the year, with less emphasis on their other movies. By contrast, Paramount Pictures had two well-performing movies in Iron Man and Indiana Jones and the Kingdom of the Crystal Skull (Kung Fu Panda is also a notable mention).
So how did this play out for these two high-performing Distributors? Well, Paramount managed to edge out Warner Bros and their knight in... well, not-so-shiny armor, as we can see with the Distributor vs Gross bar charts here:
Total:
Average Per Movie in the Top 100:
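The two views above (total gross and average gross per movie, by Distributor) were built in Tableau. As a rough sketch of the same aggregation outside Tableau, the Python below groups rows by distributor and computes both numbers. The rows, the names `Studio X` and `Studio Y`, and the function `distributor_summary` are hypothetical placeholders, not figures from The Numbers.

```python
from collections import defaultdict

# Hypothetical placeholder rows, NOT real data from The Numbers:
# (movie, distributor, inflation-adjusted gross)
rows = [
    ("Movie A", "Studio X", 300),
    ("Movie B", "Studio X", 100),
    ("Movie C", "Studio Y", 250),
    ("Movie D", "Studio Y", 200),
    ("Movie E", "Studio Y", 30),
]


def distributor_summary(rows):
    """Return {distributor: (total_gross, average_gross_per_movie)}."""
    grosses = defaultdict(list)
    for _movie, distributor, gross in rows:
        grosses[distributor].append(gross)
    return {d: (sum(g), sum(g) / len(g)) for d, g in grosses.items()}


summary = distributor_summary(rows)

# Print distributors ordered by total gross, highest first.
for distributor, (total, average) in sorted(
    summary.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{distributor}: total={total}, average={average:.1f}")
```

With these placeholder numbers, Studio Y has the larger total while Studio X has the larger average per movie, which is why it is worth looking at both charts rather than just one.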
Getting Warmer
Then I began wondering, "What about the genre of the movie?" What kinds of movies do people like to watch, and how does the type of movie affect gross revenue? As a result, I made this:
However, more questions popped into my head as I realized many distributors like to save their best movies for the summer run.
Here, we can tell that the summer months of May, June, and July (with a slight peak in December as well) are the major gross-generating months. But when I considered whether this statistic was interesting, I realized that the trend is expected: it is well known that Distributors like to release their blockbusters during vacation season.
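For anyone who wants to reproduce the month-by-month view without Tableau, here is a minimal sketch that sums gross by release month. It assumes the Release Date column has already been parsed into `datetime.date` values; the rows and the function `gross_by_month` are made up for illustration and are not the actual 2008 figures.

```python
from collections import Counter
from datetime import date

# Hypothetical placeholder rows, NOT real data from The Numbers:
# (movie, release date, gross)
releases = [
    ("Movie A", date(2008, 5, 2), 300),
    ("Movie B", date(2008, 5, 23), 100),
    ("Movie C", date(2008, 7, 18), 250),
    ("Movie D", date(2008, 12, 12), 50),
]


def gross_by_month(rows):
    """Return {month_number: total gross of movies released in that month}."""
    totals = Counter()
    for _movie, released, gross in rows:
        totals[released.month] += gross
    return dict(totals)


monthly = gross_by_month(releases)

# List months in calendar order with their summed gross.
for month in sorted(monthly):
    print(f"month {month:2d}: {monthly[month]}")
```

Note that this attributes a movie's entire gross to its release month, which is also an assumption; a movie released in late May earns much of its revenue in June.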
The Result
Throughout the process, I wanted to make sure to match nominal, ordinal, and quantitative data types with appropriate visual encodings, making sure, for example, that gross revenue was not encoded with color.
In exploring the functionality of Tableau, I discovered new ways to look at the movie data aside from just the movie name versus its gross revenue. In the resulting visualization, I wanted to keep movies grouped by their respective genres. The reasoning is that different people prefer different kinds of movies, so it's only fair to compare performance within a particular genre. Every now and then there are movies that stand out above all else, and those can easily be seen regardless of their grouping.
I was also interested in whether particular genres would lend themselves to particular ratings, and whether or not the MPAA Ratings affect how many people see the movie.
I also realized that 100 movies is a little excessive and doesn't come close to fitting on one screen, making it hard to compare many of the movies, so I trimmed the data set down to the top 40 movies of 2008.
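The trimming and grouping described above can be sketched in a few lines of Python: keep the top N rows by gross, then bucket them by genre. The rows and the function `top_n_by_genre` are hypothetical placeholders (with N = 3 rather than 40 to keep the example short), not the actual data or the Tableau steps I used.

```python
# Hypothetical placeholder rows, NOT real data from The Numbers:
# (movie, genre, gross)
movies = [
    ("Movie A", "Action", 300),
    ("Movie B", "Comedy", 250),
    ("Movie C", "Action", 200),
    ("Movie D", "Adventure", 150),
    ("Movie E", "Comedy", 100),
]


def top_n_by_genre(rows, n):
    """Keep the n highest-grossing rows, then group them by genre.

    Because the rows are sorted by gross before grouping, each genre's
    list is already ordered from highest to lowest gross.
    """
    top = sorted(rows, key=lambda row: row[2], reverse=True)[:n]
    groups = {}
    for movie, genre, gross in top:
        groups.setdefault(genre, []).append((movie, gross))
    return groups


grouped = top_n_by_genre(movies, 3)

for genre, entries in grouped.items():
    print(genre, entries)
```

A side effect of trimming first is that a genre with no movie in the top N disappears entirely (Adventure in this toy example), which is worth keeping in mind when reading the grouped chart.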
In the resulting visualization, we can see the high grossing films of each genre. We also notice that the films are predominantly rated PG-13 with the adventure movie genre tending towards the younger audience. Comedy seems to cover both ends of the rating spectrum with the animated film Wall-E topping the genre. We can also deduce that many of the popular films in the comedy genre have crude humor and other content inappropriate for the younger audience.
Through the genre categorization, we also notice that action, adventure, and comedy are the three main movie genres people like to watch. This can be deduced from the height of the genre cells/rows as well as the lengths of the gross revenue bars.
Reflection
I had a lot of fun learning, playing around with, and producing a useful visualization with Tableau. I spent much more time with this than I had intended to because of how much fun it was to easily manipulate the data and how much information I could extract from showing the data in various ways. I ended up not having much time to play around with Spotfire, but from the little contact I had with it, I'm sure to explore more of its features in the near future.
