A1-NealFultz
From CS294-10 Visualization Fa07
From Baseball Hacks, by Joseph Adler
This is a scatter plot of where David Ortiz hits. It looks nice, and it has a diamond to help with orientation. You can easily pick out which hits went foul and which were home runs, and you can see a large clump by first base.
Deconstruction:
Here the data is the x and y coordinates of where a hit was fielded. Both of these are (Q), and since the zero point is arbitrary, they're interval variables. Because the plot presents just two variables, it only needs to use position. Size and value aren't used, but the clumps of points sort of make it look that way.
The axis labels and tick marks are totally gratuitous; they should probably be turned off.
If you just wanted to look at the distribution of hits, this is all you need.
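The interval-variable point above can be checked with a small sketch. The coordinates and the `shift` and `distance` helpers are made up for illustration and are not taken from the book's data. Moving the origin leaves the distance between two hits unchanged, but it changes the ratio of their raw coordinates, which is why only differences are meaningful here.

```python
# Hypothetical hit locations (x, y); the numbers are invented for illustration.
hits = [(120.0, 80.0), (150.0, 140.0)]

def shift(points, dx, dy):
    """Move the origin: every coordinate changes by the same offset."""
    return [(x + dx, y + dy) for x, y in points]

def distance(p, q):
    """Straight-line distance between two points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

shifted = shift(hits, -100.0, -50.0)

# Distances between hits survive a change of origin...
d_before = distance(hits[0], hits[1])
d_after = distance(shifted[0], shifted[1])

# ...but ratios of raw coordinates do not.
ratio_before = hits[1][0] / hits[0][0]
ratio_after = shifted[1][0] / shifted[0][0]

print(d_before, d_after)
print(ratio_before, ratio_after)
```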
From Year 2000 In a Nutshell by Norman Shakespeare
I don't really know what to make of this graph. It plots two ordered variables against each other (OK...) but then it has a funky scale on both axes. And WTF are the diamonds supposed to be? What are the labels for?
Deconstruction:
According to the text, the process involved ranking each system in order of difficulty to fix (~age, I suppose), then assigning a value to each of them (not ranking them). Therefore, difficulty is (O) and value is (Q).
One problem with the original chart is that the scales don't make any sense for this type of data. The scale makes points in the lower left corner seem farther apart and points in the upper right seem closer together, and the motive for this isn't very clear. It's possible it's just the Excel default.
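The stretching described above is what a logarithmic axis would do, although that is only my guess at what the original chart used. As a rough sketch under that assumption, with a hypothetical axis range of 1 to 10, the gap between ranks 1 and 2 takes up far more of the axis than the gap between ranks 9 and 10, while a linear axis draws both gaps the same size.

```python
import math

def log_position(value, lo=1.0, hi=10.0):
    """Fraction of the axis length at which `value` is drawn on a log axis."""
    return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))

def linear_position(value, lo=1.0, hi=10.0):
    """Fraction of the axis length at which `value` is drawn on a linear axis."""
    return (value - lo) / (hi - lo)

# Assumed axis range of 1..10; the real chart's range is not known.
gap_low_log = log_position(2) - log_position(1)       # ranks 1 and 2
gap_high_log = log_position(10) - log_position(9)     # ranks 9 and 10
gap_low_lin = linear_position(2) - linear_position(1)
gap_high_lin = linear_position(10) - linear_position(9)

print(gap_low_log, gap_high_log)
print(gap_low_lin, gap_high_lin)
```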
Another thing that confused me is that both diamonds and triangles are used, but only triangles appeared in the legend. Diamonds were the actual data, triangles were the company average. The labels next to the diamonds are the point's difficulty rank. He's also using both shape and color to represent the same thing.
To fix this, I switched to a bar plot with a normal axis. The text said a system with a cost rank of three is not necessarily three times more expensive than the one with rank one, but a system with a value of three *is* three times more valuable. So instead of using position to represent value, I went with size. I left the company averages in because they might be useful to someone, even though they all fall on y=x. On the other hand, the graphic is still kind of confusing, so this might just be a case of garbage in/garbage out.
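The encoding in the redesign can be sketched in a few lines. The system names and numbers below are hypothetical, not data from the book, and the sketch leaves out the company averages. Order along the axis carries the difficulty rank (O), and bar length carries value (Q), so a bar three times as long really does mean three times the value.

```python
# Hypothetical systems: (name, difficulty rank, value). Not data from the book.
systems = [("Payroll", 3, 6), ("Billing", 1, 2), ("Inventory", 2, 4)]

def bar_chart(rows):
    """One bar per system, ordered by difficulty rank; bar length encodes value."""
    lines = []
    for name, rank, value in sorted(rows, key=lambda r: r[1]):
        lines.append(f"{rank} {name:<10}{'#' * value}")
    return lines

chart = bar_chart(systems)
print("\n".join(chart))
```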



