A1-CharlotteWickham

From CS294-10 Visualization Fa07

Bad example - U.S. height

Source: Time Magazine, vol 170, no 4, July 30th 2007, Europe edition, pg12.

Image:badex_usheight.jpg

This visualization is intended to illustrate the heights of U.S. men in relation to those of men in other countries. Each of seven countries is represented by a flagpole whose height depicts the average height of men in that country. In addition, the average height is quoted above each flagpole.

There is also a small figure showing the change over time in the heights of black and white males in the U.S., superimposed on the flagpole figure.

This visualization is successful in one respect: it is clear the U.S. no longer has the tallest people. However, I think it has a number of flaws.

Note that the flagpoles have no visible baseline. This gives the impression that Filipinos are about two thirds the height of the Dutch when in reality they are more like nine tenths the height. The country names also alternate in their vertical positions. This further confuses the viewer as to where these poles actually start.
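The distortion described above is simple arithmetic. The following is a minimal sketch, assuming hypothetical heights of 163 cm and 181 cm and an axis that silently starts at 127 cm. None of these numbers are read from the Time graphic; they are chosen only to reproduce the two-thirds versus nine-tenths impression.

```python
def apparent_ratio(shorter_cm, taller_cm, axis_start_cm=0.0):
    """Ratio of the drawn bar lengths when the axis starts at axis_start_cm."""
    return (shorter_cm - axis_start_cm) / (taller_cm - axis_start_cm)


# Hypothetical values for illustration only, not taken from the graphic.
SHORTER_CM = 163.0
TALLER_CM = 181.0
HIDDEN_AXIS_START_CM = 127.0

# With a zero baseline the bars show the true ratio of the heights.
true_ratio = apparent_ratio(SHORTER_CM, TALLER_CM)
# With a hidden, non-zero axis start the drawn ratio is much smaller.
drawn_ratio = apparent_ratio(SHORTER_CM, TALLER_CM, HIDDEN_AXIS_START_CM)

print(f"true ratio:  {true_ratio:.2f}")
print(f"drawn ratio: {drawn_ratio:.2f}")
```

Under these assumed numbers the true ratio is about 0.90 while the drawn ratio is about 0.67, which is the kind of gap a missing baseline hides from the viewer.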

What purpose do the flags themselves serve? They may aid country identification but the countries are listed anyway. And how many people could identify the flag of the Philippines? They also obstruct matching the heights at the top of each flagpole to the country names at the bottom. Their varying angles and areas also seem to distort our perception. Do the Germany and Norway flagpoles look the same height? Is this an optical illusion (or the result of the page not quite being flat in this copy)?

Deconstruction

There are two separate graphics in the visualization so I will treat them separately.

Firstly, the flagpole diagram. The data model here consists of tuples of country name (nominal) and average height of males in that country (quantitative). The order from tallest to shortest is important here too so we could consider the rank of the country (1 being shortest to 7 being tallest) as also being an element of each tuple. With only 7 countries this is a very small dataset.

The image model maps each tuple to a bar. Average height is mapped to the length of the bar (measured against a y-axis with no visible baseline) and rank to the x-position of the bar. The actual numerical heights and country names serve as labels above and below the corresponding bar respectively. The other graphical elements (the flags and pole shading) are pure chart junk and provide no additional information.
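The data model and image model of the flagpole diagram can be written down as two small record types and a mapping between them. This is only a sketch: the class names, field names, placeholder countries, heights, and the axis start of 127 cm are all hypothetical and are not taken from the graphic.

```python
from dataclasses import dataclass


@dataclass
class CountryHeight:
    """One tuple of the data model."""
    country: str       # nominal
    height_cm: float   # quantitative


@dataclass
class Bar:
    """One mark of the image model."""
    x_position: int    # rank, 1 = shortest
    length: float      # drawn length above the (hidden) axis start
    top_label: str     # numerical height quoted above the bar
    bottom_label: str  # country name below the bar


def to_bars(rows, axis_start_cm):
    """Map data tuples to bars, deriving the rank by sorting on height."""
    ordered = sorted(rows, key=lambda row: row.height_cm)
    return [
        Bar(rank, row.height_cm - axis_start_cm,
            f"{row.height_cm:.1f} cm", row.country)
        for rank, row in enumerate(ordered, start=1)
    ]


# Placeholder rows: names and heights are hypothetical.
rows = [
    CountryHeight("Country A", 178.0),
    CountryHeight("Country B", 165.0),
    CountryHeight("Country C", 181.0),
]
bars = to_bars(rows, axis_start_cm=127.0)
for bar in bars:
    print(bar)
```

Nothing in the Bar record corresponds to the flags or the pole shading, which is one way of seeing that they carry no data.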

Secondly, the small time series plot. Here the data model consists of tuples of year, average height and race. Height is mapped to the y-axis and year to the x-axis. Lines connect the tuples of the same race, and race is mapped to colour (red for white and black for black).

Redesign

The purpose of the visualization is to explain that the U.S. no longer has the tallest males. This is essentially a temporal comparison between the countries, but the visualization only lets us compare the countries at one time point. The small time series plot is irrelevant without reference to the time evolution of the other countries. Ideally we would have the average heights for these countries 20 years ago; a plot similar to that on page 158 of "The Visual Display of Quantitative Information" would then work well. I could not find the relevant data to create this plot, but an idealization of it is below. Without the additional data a redesign would consist of only the 2007 column.


Image:timeheight.jpg


Some points considered in the redesign:

  • Emphasizing the rank of the countries
  • Keeping the height and country label close together for ease of identification
  • Removing the junk (flagpole metaphor)
  • Highlighting the U.S.
  • Improving relative height comparisons (attempted but not resolved; see below).


A note on the baseline problem:

Below is the same visualization as above with the addition of a 0cm baseline. We can see that with the baseline included the graphic becomes very long and narrow, and close inspection is difficult. With limited space on a page it is impractical to include this baseline. One alternative might be to place the visualization at the top of the page and fill the large gap with other articles, leaving the baseline at the base of the page for those who wish to make comparisons.
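How long the graphic becomes can be estimated from the axis range alone. The following is a rough sketch, assuming a hypothetical truncated axis that starts at 127 cm and a hypothetical tallest average of 181 cm; neither value is measured from the figures here.

```python
def stretch_factor(axis_start_cm, tallest_cm):
    """Factor by which the plot grows when a truncated axis is
    extended down to 0 cm at the same vertical scale."""
    return tallest_cm / (tallest_cm - axis_start_cm)


# Hypothetical values for illustration only.
factor = stretch_factor(axis_start_cm=127.0, tallest_cm=181.0)
print(f"{factor:.2f}x taller")
```

Under these assumptions the plot would need to be more than three times taller to keep the same resolution between countries, which is why the version with a baseline is so hard to inspect closely.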

Image:baselineheight.jpg


Good example - Mutant crops

Source: New York Times August 28, 2007 "Useful Mutants, Bred With Radiation". Also online here.

Image:goodex_crops.jpg

This visualization provides an illustration of the distribution of mutant plant varieties both geographically and over crop types. Here, the term mutant refers to new varieties that are the result of mutations due to natural variation or irradiation as opposed to GM crops. The area of the circles on the map is proportional to the number of mutant varieties registered in the corresponding geographical region. These are annotated both with the number of varieties and the region name.

The chart on the left illustrates how the mutant varieties are distributed among crop types. The inner ring is divided among three broad categories, and the outer one provides further division within these categories. These are also annotated with the exact numbers and coloured to emphasize the hierarchy of classification.

I find this visualization successful. There are no obstructions to quickly pulling the information you are interested in from the plot. For example, Asia has the most varieties registered, cereals make up about half of all varieties, and of all the plants rice has the most mutant varieties registered.

Deconstruction

This visualization presents two largely independent datasets. The first consists of tuples of the name of a geographical area (nominal) and the number of mutant varieties (quantitative). There is also an implied data structure that maps the geographical areas to longitudes and latitudes and stores the map data. The number of mutant varieties is mapped to the area of a circle centered at the longitude and latitude of the geographical area.
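Mapping a count to the area of a circle, rather than to its radius, means the radius must grow with the square root of the count. The following is a minimal sketch; the region names, counts, and maximum radius are hypothetical and are not taken from the New York Times graphic.

```python
import math


def radius_for_count(count, max_count, max_radius):
    """Radius that makes the circle's area proportional to count."""
    return max_radius * math.sqrt(count / max_count)


# Hypothetical counts for illustration only.
counts = {"Region A": 400, "Region B": 100}
max_count = max(counts.values())
radii = {
    name: radius_for_count(n, max_count, max_radius=20.0)
    for name, n in counts.items()
}

for name, radius in radii.items():
    area = math.pi * radius ** 2
    print(f"{name}: radius {radius:.1f}, area {area:.0f}")
```

Here a region with four times as many varieties gets twice the radius and therefore four times the area. Scaling the radius linearly with the count would instead exaggerate the larger regions.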

The second dataset describes the crop types of the mutant varieties. Originally we could imagine the data took the form of a list of mutant varieties and the category each belonged to. The data needed for the chart on the right would be an aggregate over this list at two different levels: first over the broad categories and then over the subsets within these categories. The result would be two tables:

Category            Number of varieties
Cereals             1199
Noncereal crops     717
Ornamental Plants   625

Type      Category          Number of varieties
Rice      Cereals           525
Barley    Cereals           303
...       ...               ...
Legumes   Noncereal crops   355
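The two-level aggregation that produces tables of this shape can be sketched as follows. The raw list is imagined, as in the text above, so the variety names and the handful of records are hypothetical placeholders, not the real registry.

```python
from collections import Counter

# Hypothetical raw list: one (variety, type, category) record
# per registered mutant variety.
varieties = [
    ("variety-1", "Rice", "Cereals"),
    ("variety-2", "Rice", "Cereals"),
    ("variety-3", "Barley", "Cereals"),
    ("variety-4", "Legumes", "Noncereal crops"),
]

# Level 1: count varieties per broad category (inner ring).
by_category = Counter(category for _, _, category in varieties)

# Level 2: count varieties per type within each category (outer ring).
by_type = Counter(
    (crop_type, category) for _, crop_type, category in varieties
)

print(by_category)
print(by_type)
```

Both tables are derived from the same list, so the counts for the types within a category always sum to that category's count.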

For the first table, the number of varieties (quantitative) is mapped to angle in a donut-type graph. For the second table, the number of varieties is also mapped to angle, in a donut with a slightly larger inner radius. The category (nominal) is mapped to colour. The outer donut has slight variations in the intensity of the colours but this carries no information.
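The mapping from number of varieties to angle can be made concrete with the category totals quoted in the first table. This is a sketch of the usual proportional encoding, which I assume is what the graphic uses; the function name is my own.

```python
# Category totals as quoted in the first table.
category_counts = {
    "Cereals": 1199,
    "Noncereal crops": 717,
    "Ornamental Plants": 625,
}


def counts_to_angles(counts):
    """Give each category a share of 360 degrees proportional to its count."""
    total = sum(counts.values())
    return {name: 360.0 * n / total for name, n in counts.items()}


angles = counts_to_angles(category_counts)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Cereals come out at a little under 170 degrees, or roughly 47% of the ring, which matches the impression that cereals make up about half of all varieties.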


