A2-JohnsonNguyen

From CS294-10 Visualization Fa07


Step 1: Domain

  • When I was looking for colleges, rankings were among the most important data sets I consulted. Rankings are usually compiled for a particular area, with each row representing a college and the statistics that contribute to its rank. Unfortunately, it is very difficult to find an electronic copy of such rankings, since the detailed ones require fees. I also browsed the recommended online data sets for a while, but found that they did not provide enough angles from which to look at the data. Each was very specific about one characteristic and sometimes spanned several years, which would not have made for an interesting visualization. After a few days of continual searching, I finally stumbled upon an Excel worksheet that ranked graduate schools.
  • Difficulty: The Excel file did not go smoothly into Spotfire because it contained extra columns that showed the ranking for each particular feature. For example, one column ranked the colleges by how much funding they have, and another ranked them by SAT scores. I had to remove these to allow for my own editing. There were also formatting issues I had to take care of before the software could read the file. It does not seem to handle Excel formatting such as alignment, vertical alignment, bolding, or even italicization. Once the formatting was gone, the information was read in reasonably well.
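
The column cleanup described above could be sketched roughly as follows. This is only an illustration, not the steps actually performed in Excel: it assumes the worksheet has been exported to CSV and that every per-feature ranking column ends in " Rank". The column names and values are made-up placeholders, not the real data.

```python
import csv
import io

# Hypothetical CSV export of the worksheet. Column names and values are
# placeholders for illustration only, not taken from the real file.
RAW = """School,Total Research,Total Research Rank,Median SAT,Median SAT Rank
School A,500,1,1400,2
School B,300,2,1450,1
"""

def drop_rank_columns(rows, suffix=" Rank"):
    """Return copies of the rows without the per-feature ranking columns."""
    return [
        {key: value for key, value in row.items() if not key.endswith(suffix)}
        for row in rows
    ]

rows = list(csv.DictReader(io.StringIO(RAW)))
cleaned = drop_rank_columns(rows)
print(list(cleaned[0].keys()))
```

Exporting to CSV also discards cell formatting such as bolding and alignment, which is one plausible way to sidestep the formatting problem.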

Step 2: Question

  • The immediate question that comes to mind is: which is the best college? However, I knew before the analysis that there would be room to critique this initial question.
  • Since the rankings are based on several different characteristics, each individual viewing the data has his or her own set of weights to assign to them, and hence would generate a new ranking. I'm not sure whether the software can reorder the already ranked data, but it's clear that we will be able to see outliers if we have many different views of the data.
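
To make the idea of viewer-specific weights concrete, here is a minimal sketch of re-ranking by a weighted sum. It is not something Spotfire did for me. The function name `rerank`, the field names, and the three schools are all hypothetical, and min-max scaling is just one assumed way to make the characteristics comparable before weighting them.

```python
def rerank(schools, weights):
    """Order schools by a weighted sum of min-max scaled features (higher is better)."""
    features = list(weights)
    lo = {f: min(s[f] for s in schools) for f in features}
    hi = {f: max(s[f] for s in schools) for f in features}

    def score(school):
        total = 0.0
        for f in features:
            span = hi[f] - lo[f]
            # Scale each feature to [0, 1]; a constant feature contributes nothing.
            scaled = (school[f] - lo[f]) / span if span else 0.0
            total += weights[f] * scaled
        return total

    return sorted(schools, key=score, reverse=True)

# Made-up placeholder values, not from the real worksheet.
schools = [
    {"name": "School A", "research": 500, "sat": 1300},
    {"name": "School B", "research": 300, "sat": 1500},
    {"name": "School C", "research": 400, "sat": 1400},
]

research_first = [s["name"] for s in rerank(schools, {"research": 0.8, "sat": 0.2})]
sat_first = [s["name"] for s in rerank(schools, {"research": 0.2, "sat": 0.8})]
print(research_first)
print(sat_first)
```

With these placeholder numbers, the two viewers put different schools in first place, which is exactly why "which is the best college" is open to critique.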

Step 3: Visualizations

In this first visualization, Spotfire simply took the data it received and generated a scatter plot of each school's ranking versus its overall total research. Unfortunately, this is a poor representation of the data, as many other fields to consider, such as faculty, endowment, and SAT scores, are not captured in the graph. The legend on the right lists the school names, which I think could easily have been used as the X-axis instead. Overall, however, this chart is quite decent for a default graph generated automatically upon receipt of the data. We can see a clear trend of total research correlating with the rank of the school; higher ranked schools do more overall research. The Y-axis is in the thousands.

Image:JNVis1.jpg
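
The trend read off the scatter plot could also be checked numerically. The sketch below computes a plain Pearson correlation by hand. The numbers are invented placeholders, not the real worksheet values, so the result only illustrates the sign convention: because rank 1 is the best rank, "higher ranked schools do more research" shows up as a negative correlation between rank number and research.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Placeholder data: rank 1 is best; research is in arbitrary units.
ranks = [1, 2, 3, 4, 5]
research = [900, 700, 650, 400, 300]

r = pearson(ranks, research)
print(round(r, 3))
```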

These next two graphs are rather simple, but they let me answer a personal question about the difference in funding between public and private schools. The pie chart tells us exactly how many of the 50 schools are public and how many are private, and from it we can see that the majority of the ranked schools are public. Looking at the bar graph, though, we observe that the funding is concentrated in the private schools. This is an uneven distribution of funds between the two types of schools, and it amazes me how public schools are able to keep up despite such a difference in financial magnitude.

Image:JNVis2.jpg Image:JNVis3.jpg
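
The two charts above boil down to a count per school type (the pie chart) and a funding total per school type (the bar graph). A minimal sketch of that aggregation, assuming hypothetical `type` and `funding` fields and made-up values rather than the real 50 schools:

```python
from collections import defaultdict

def summarize_by_type(schools):
    """Return (count per type, total funding per type) for a list of school records."""
    counts = defaultdict(int)
    funding = defaultdict(float)
    for school in schools:
        counts[school["type"]] += 1
        funding[school["type"]] += school["funding"]
    return dict(counts), dict(funding)

# Placeholder records for illustration only.
schools = [
    {"name": "School A", "type": "public", "funding": 100},
    {"name": "School B", "type": "public", "funding": 100},
    {"name": "School C", "type": "public", "funding": 100},
    {"name": "School D", "type": "private", "funding": 400},
    {"name": "School E", "type": "private", "funding": 400},
]

counts, funding = summarize_by_type(schools)
per_school = {t: funding[t] / counts[t] for t in counts}
print(counts, funding, per_school)
```

Dividing the total by the count gives funding per school, which is a fairer comparison than raw totals when one type has more schools than the other.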

Finally, we have a visualization that I find quite interesting. It's a parallel coordinates graph, which is essentially a line graph of all the colleges. Each vertical axis represents a different data field of the college. As I click on a line, I'm able to follow it and see very clearly that college's level of funding, its type, and even its rank. There are always outliers, such as a very low-ranked school with an incredible amount of funding. I think one of the other reasons I find this chart so informative is that it combines several dimensions into one plot. It is cluttered, but it does manage to deliver almost all of the data in one go.

Image:JNVis4.jpg
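
For readers unfamiliar with parallel coordinates, the core step is scaling every field onto its own axis so that each college becomes one polyline. The sketch below shows that scaling only. It is not Spotfire's implementation, and the field names and values are hypothetical placeholders.

```python
def normalize_axes(rows, fields):
    """Scale each field to [0, 1] so every record becomes one polyline across the axes."""
    lo = {f: min(r[f] for r in rows) for f in fields}
    hi = {f: max(r[f] for r in rows) for f in fields}
    lines = []
    for r in rows:
        lines.append([
            # A constant field has no spread, so place it mid-axis.
            (r[f] - lo[f]) / (hi[f] - lo[f]) if hi[f] != lo[f] else 0.5
            for f in fields
        ])
    return lines

# Placeholder records; the second is a low-ranked school with high funding.
rows = [
    {"rank": 1, "funding": 200, "sat": 1500},
    {"rank": 50, "funding": 900, "sat": 1200},
    {"rank": 25, "funding": 550, "sat": 1350},
]

lines = normalize_axes(rows, ["rank", "funding", "sat"])
for line in lines:
    print(line)
```

Each inner list holds the heights at which one college's line crosses the axes, so an outlier like the second record shows up as a line that sits at one extreme on the rank axis and at the opposite extreme on the SAT axis.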


