From CS 294-10 Visualization Sp10




NOTE: I read the instructions wrong and initially included just implementation details, not rationale. Here are the design decisions, minus the hairy bits. (Pauloppenheim 00:02, 9 February 2010 (UTC))

The data from the Guardian is primarily presented as a summary of the largest contributors, with all other contributors rolled together into simple buckets. Unfortunately the "other" and "private" buckets dominate the summary numbers. Hence, I decided to first see how they bucketed the data, and then re-bucket it myself. The primary interest is the total contribution, and after that how much of it has yet to be met with cold hard cash.
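As a rough illustration of the re-bucketing idea, here is a minimal Python sketch. The donor kinds, bucket names, and amounts are all hypothetical placeholders, not the actual mapping or figures from the parsing script; the point is only that each row is assigned to a bucket, funded and unfunded amounts are summed separately, and buckets are sorted by total contribution.

```python
from collections import defaultdict

# Hypothetical mapping from a donor "kind" to a display bucket.
# The real script's categories are not shown in this write-up.
BUCKETS = {
    "government": "Countries",
    "multi-government": "Multi-government organizations",
    "ngo": "NGOs",
    "company": "Companies",
    "individual": "Individuals",
}


def rebucket(rows):
    """Sum funded and unfunded amounts per bucket.

    rows: iterable of (donor_kind, funded_usd, unfunded_usd).
    Returns a list of (bucket, funded, unfunded), largest total first.
    Unknown donor kinds fall into an "Other" bucket.
    """
    totals = defaultdict(lambda: [0, 0])
    for kind, funded, unfunded in rows:
        bucket = BUCKETS.get(kind, "Other")
        totals[bucket][0] += funded
        totals[bucket][1] += unfunded
    return sorted(
        ((name, f, u) for name, (f, u) in totals.items()),
        key=lambda row: row[1] + row[2],  # total = funded + unfunded
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    sample = [
        ("government", 100, 50),
        ("company", 30, 0),
        ("government", 20, 10),
        ("individual", 40, 5),
    ]
    for name, funded, unfunded in rebucket(sample):
        print(name, funded + unfunded, unfunded)
```

Keeping funded and unfunded as separate sums means the total can drive the sort order while the unfunded share stays available for a secondary encoding.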

The design makes the most obvious, most accurately perceived display (main bar size) show the total contribution. As a secondary encoding, a lighter shade of the bar color represents how much of the money is unfunded, paralleling the idea of being "light" on payment. The hue of the bars encodes the nominal bucketing of country or group (a group may still be a multi-government organization). Each bar is also labeled with the bucket it represents, and there is a 2x2 key for the graph that explains the dark / light shades and red / blue hues.
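The encoding described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the original drawing script: the hex colors, the assignment of red to governments and blue to non-governments, the function names, and the output file name are all made up, and the drawing part assumes a reasonably current matplotlib rather than the 0.98 release used at the time.

```python
# Hue encodes the kind of donor; shade encodes funded (dark) vs. unfunded (light).
# These hex values and the red/blue assignment are illustrative assumptions.
PALETTE = {
    "government": {"funded": "#b2182b", "unfunded": "#f4a582"},
    "non-government": {"funded": "#2166ac", "unfunded": "#92c5de"},
}


def bar_segments(rows):
    """Turn (label, kind, funded, unfunded) rows into drawable segments.

    Bars are ordered by total contribution (funded + unfunded), largest first.
    """
    ordered = sorted(rows, key=lambda r: r[2] + r[3], reverse=True)
    return [
        {
            "label": label,
            "funded": funded,
            "unfunded": unfunded,
            "dark": PALETTE[kind]["funded"],
            "light": PALETTE[kind]["unfunded"],
        }
        for label, kind, funded, unfunded in ordered
    ]


def draw(rows, path="contributions.svg"):
    """Draw one horizontal bar per bucket: dark funded part, light unfunded part."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    segs = bar_segments(rows)
    ys = list(range(len(segs)))
    funded = [s["funded"] for s in segs]
    unfunded = [s["unfunded"] for s in segs]

    fig, ax = plt.subplots()
    ax.barh(ys, funded, color=[s["dark"] for s in segs])
    # The unfunded part starts where the funded part ends, so the
    # full bar length still shows the total contribution.
    ax.barh(ys, unfunded, left=funded, color=[s["light"] for s in segs])
    ax.set_yticks(ys)
    ax.set_yticklabels([s["label"] for s in segs])
    ax.invert_yaxis()  # largest bar at the top
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)  # drop non-data ink
    fig.savefig(path)
```

Calling `draw(rows)` with a handful of rows writes an SVG that could then be finished by hand in a vector editor.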

I made a very conscious effort to remove as much chartjunk as possible, and increase my data-ink ratio. I think I did fairly well, minus the language on the labeling. That is likely the largest weakness of the graph - those labels are LONG! Hopefully the graph works well.

Implementation details

This was intense.

  1. The Guardian data did not add up. (Guardian post, Guardian data on Google Spreadsheets)
  2. I asked permission to use the ReliefWeb data, which at least agreed between the two spreadsheet files. (ReliefWeb data)
  3. The ReliefWeb data did not clearly add up. The "summary" XLS added up, but the split of "Personal" and "Other" did not completely add up; I have a ~$11K discrepancy between the ReliefWeb "Funding" amount and the one I calculated. The "Uncommitted" totals did add up. I spent many, many hours trying to figure out how to make this work before giving up and fudging the "Other" bucket.
  4. After several iterations on my python parsing script trying to make the numbers match, I realized that my breakdown is much simpler and more holistic than the ReliefWeb summary - Countries, Intra-Gov't orgs, NGOs, companies, individuals. Fortunately several of those were only significant as a single number, so "individuals" and "companies" as single numbers made sense, and made for a much simpler graph.
  5. I contemplated the split between "funded" and "uncommitted / unfunded" - I really wanted to use a single number, and realized that I could sort by the total but fade the value (lightness) down on the unfunded part, keeping a single number while giving more info upon further investigation.
  6. I made the final CSV file using my python parser script, and wrote a simple drawing script using matplotlib.
  7. Installing matplotlib was needlessly painful on Mac OS X 10.5.8. easy_install --prefix=~/local numpy==1.3 and easy_install --prefix=~/local matplotlib==0.98 finally worked, after trying about 6 versions of each.
  8. I turned the simple graph into an SVG file, and edited it the rest of the way in Inkscape.
  9. The second split was for countries and organizations, and being a difference of kind, I used color. Fortunately I only needed 2 - gov'ts and not-gov'ts.
  10. The font is Futura because Inkscape f'd up most of the rest of the good looking ones. Inkscape also seems to have many selection management and input focus bugs and design issues.
  11. In terms of the "data-ink ratio" savings, I only put numbers in for the extremes, and at the elbow of the curve.
  12. Python code, CSV files, and SVG files available upon request.
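
The reconciliation problem in step 3 can be illustrated with a small Python sketch. It shows one possible way to measure the gap between a reported total and the sum of the buckets, and to fold that gap into an "Other" bucket so the parts match the reported figure. The function names and the numbers are hypothetical; how the original script actually adjusted the "Other" bucket is not documented here.

```python
def discrepancy(parts, reported_total):
    """Return the reported total minus the sum of the parts (whole dollars)."""
    return reported_total - sum(parts)


def fold_into_other(buckets, reported_total, other="Other"):
    """Return a copy of buckets with the residual added to the `other` bucket.

    buckets: dict mapping bucket name to amount.
    After the adjustment, the bucket amounts sum to reported_total.
    """
    fixed = dict(buckets)
    gap = discrepancy(buckets.values(), reported_total)
    fixed[other] = fixed.get(other, 0) + gap
    return fixed


if __name__ == "__main__":
    # Made-up amounts, chosen only so the gap is easy to see.
    buckets = {"Countries": 500, "Individuals": 200, "Other": 89}
    reported = 800
    print("gap:", discrepancy(buckets.values(), reported))
    print(fold_into_other(buckets, reported))
```

Printing the gap before adjusting makes the size of the fudge explicit, which is worth recording alongside the final CSV.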

