From CS 294-10 Visualization Sp10


Dataset 1

The first data domain I explored was auto sales. My initial question was:

"What have the trends in automobile sales been (increase or decrease) for different companies and are they correlated to whether the automobile is domestic or foreign?"

I was able to find data to answer this question, but only for January 2009 and January 2010. That data would let me compare automobile sales from last year to this year and show each company's share of the market. However, I felt that two months of data was too narrow to show a clear trend in auto sales over a long period of time.

File:Auto Data.pdf

Source: Wall Street Journal.com [1]

Dataset 2

Iteration 1

Initial Question

Instead of automobile sales, I decided to look at the life expectancies of different countries. My initial question for this domain was:

"How have the life expectancies of different countries changed over the past several years?"

I found the data to answer this question at the site NationMaster.com ([2]).

I decided I wanted to see the trends in life expectancy for each of the countries listed, for each of the years listed. The first visualization I created was fairly simple, showing each country's life expectancy trend-line. The image below shows a portion of this first visualization:

Data Vis 1

The image above shows only the first several countries, because there are too many to fit in a single image. That alone made this method of visualization ineffective.

Question Evolution

Instead of graphing each country's trend-line separately, it made more sense to compare the different countries' life expectancy trend-lines directly. This led to the question:

"How have the life expectancies of different countries changed over the past several years, in comparison with other countries?"

This visualization is below:

Data Vis 2

The image above shows all the countries' trend-lines on one axis. As can be seen, the chart is very cluttered with so many countries. To reduce the clutter, my idea was to prune out any trend-lines that did not show a significant increase or decrease over the years displayed.

Semi-arbitrarily, I decided that any country for which MAX(life expectancy, 2003-2008) - MIN(life expectancy, 2003-2008) < 1.5 years would be excluded from the data. I felt such countries' trend-lines would show up as nearly horizontal in the visualization and would therefore be uninteresting for this comparison.
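
As a rough sketch of this pruning rule outside of Tableau, assuming the data is held as a mapping from country name to a list of yearly life expectancies: the function name `prune_flat_trends`, the country names, and the sample values below are all hypothetical and are not taken from the dataset.

```python
# A minimal sketch of the MAX - MIN < 1.5 pruning rule.
# The sample values are made up for illustration only.

def prune_flat_trends(series_by_country, threshold=1.5):
    """Keep only countries whose (max - min) life expectancy is >= threshold."""
    kept = {}
    for country, values in series_by_country.items():
        spread = max(values) - min(values)
        if spread >= threshold:
            kept[country] = values
    return kept


# One value per year, 2003 through 2008 (hypothetical numbers).
sample = {
    "Country A": [70.0, 70.2, 70.3, 70.5, 70.6, 70.8],  # nearly flat
    "Country B": [50.0, 50.9, 51.7, 52.4, 53.0, 53.5],  # clear increase
}

pruned = prune_flat_trends(sample)
print(sorted(pruned))
```

Because the rule only looks at the spread between the highest and lowest values, it keeps countries with a large drop as well as those with a large rise, which matches the goal of keeping any significant increase or decrease.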

Before I actually pruned out the data, however, the question I was trying to answer evolved a bit more.

Question Evolution

Graphing only the countries' life expectancies over the years did not explain much about why the trends looked the way they did. To try to explain them, the question now became:

"How have the life expectancies of different countries changed over the past several years, taking GDP or population into account?"

To do this, I wanted to encode the GDP or population of a country in a given year as the width of its trend-line at that year. If a line gets thicker as it rises, that would suggest that GDP and life expectancy increased together for that country. As with the original visualization, it would also be easy to compare different countries' life expectancies and GDPs.

However, perhaps through a lack of expertise on my part, I was not able to get Tableau to include this encoding. Such an encoding, with multiple dimensions (GDP for each country, not just for one year, but for six), seemed to be too complex for Tableau to handle easily.

Therefore, although I felt that this question was very interesting, I decided to try to answer it a little differently, without so many different dimensions.

Iteration 2

Initial Question

Instead of trying to compare the trend in life expectancies of different countries over several years, I decided to stick to a single year and try to show how the life expectancies of different countries were affected by another factor, in this case GDP. This is the question:

"Is there a correlation between life expectancy and GDP per person of different countries in one year?"

This visualization is below:

Data Vis 3

The image above shows life expectancy versus GDP per person for different countries. It also encodes population density in the size of the circle representing each country, and the color of each circle represents the continent that country is on.

This visualization shows a positive correlation between a country's GDP per person and its life expectancy. I felt that most people would assume this correlation even without the visualization. Therefore, the question changed again.

Question Evolution

Instead of trying to see the relation between GDP and life expectancy, I thought it would be interesting to ask:

"Is there a correlation between life expectancy and population density of different countries in one year?"

The reasoning behind this question is: if a country has many people per square unit of area, could that mean that more people are able to survive in that country? Another way of looking at it is: if a country is very densely populated, did it get that way because its life expectancy is so high?

The visualization for this is below:

Data Vis 4

In the above image, life expectancy is graphed versus population density of different countries. The size of each country's circle represents that country's GDP per person. As before, the color of the circle represents which continent that country is on.

In this visualization, there is a slight upward slope to the cluster of points. This suggests a weak positive correlation between a country's population density and its life expectancy.
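
One way to put a number on a slope like this, rather than judging it by eye, is the Pearson correlation coefficient. The sketch below is only an illustration: the function name `pearson_r` and the density and life expectancy values are hypothetical, not the NationMaster data, and this calculation was not part of the Tableau work above.

```python
import math

# A minimal sketch of the Pearson correlation coefficient.
# The sample values are made up for illustration only.

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical population densities (people per square km) and
# life expectancies (years) for six countries.
density = [15.0, 40.0, 85.0, 120.0, 230.0, 340.0]
life_expectancy = [62.0, 71.0, 58.0, 74.0, 69.0, 78.0]

r = pearson_r(density, life_expectancy)
print(round(r, 2))
```

A value near 0 means little linear relationship, while values near +1 or -1 mean a strong one, so a "slight upward slope" would correspond to a small positive value. As with the visualization itself, a positive value would not say which of the two quantities, if either, is driving the other.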

Final Visualization

In the process of analyzing this visualization, I also noticed interesting patterns within each continent. It is not entirely clear why some continents' points are clustered along the life expectancy axis or the population density axis while others are not. It seemed that continents whose countries had a relatively high GDP per person tended to cluster together (for example, Europe), while continents with lower GDPs per person were more spread apart (for example, Africa).

I found this interesting, so I compiled a dashboard highlighting the points of specific continents for easy viewing. In addition, there is a map with a point on each country, whose size represents life expectancy in that country.


Dashboard 1


In the end, it was pretty interesting to go through the process of finding a dataset, looking for something interesting in it, and trying to visualize the dataset so as to show that "something interesting".

While I don't know if my visualization is strong enough to support my final idea that a country's life expectancy is correlated with its population density, I think the general upward slope of the points leans in that direction.
