Data and Image Models
From Visualization Sp06
Lecture on Jan 19, 2006
Readings
- The eyes have it, Shneiderman. (html)
- The structure of the information visualization design space. Card & Mackinlay. (ieee)
- Chapter 2: Graphical Integrity, In The Visual Display of Quantitative Information. Tufte.
- Chapter 3: Sources of Graphical Integrity, In The Visual Display of Quantitative Information. Tufte.
Optional Readings
- On the theory of scales of measurement. Stevens. (jstor)
Jingtaow - Jan 19, 2006 09:08:02 am
The graphical integrity problem raised by Tufte is pretty thought provoking. Here I would like to recommend another book - Darrell Huff, Irving Geis, "How to Lie With Statistics", W.W. Norton & Company (1954, 1982). In chapters 5 through 7 (pp. 60-86), the authors discuss almost the same graphical integrity problems in depth, with more negative examples. Unlike Tufte, who attributes the major causes to things like a lack of quantitative skills and/or well-meaning attempts to make statistical data "less boring", Huff believed that most of the mistakes he covered were intentional. Personally, I agree more with Huff, because it's hard to believe that mistakes pointed out more than 50 years ago, and still made again and again today, are caused by "lack of skill".
I'd like to add two more common mistakes related to graphical integrity. These mistakes are drawn from David A. Freedman - "Statistical Models - Theory and Practice", Cambridge University Press, 2005.
1. Confounding. It's common for news reporters to plot samples of two or more variables on the same chart, show an apparent correlation (linear, logistic, etc.), and claim or imply that the variables are causally related. In most situations that's not true, because of confounding effects (e.g. A causes B and A causes C, but there is no causal relationship between B & C). ASSOCIATION is not CAUSATION. Without randomized controlled experiments or really lucky natural experiments, most results from observational studies should be interpreted only as ASSOCIATION. E.g. recent research has rejected the common belief that there is a causal relationship between human body cholesterol level and blood pressure and found that almost all the drugs used to lower cholesterol level can do nothing but kill you faster.
2. Data Snooping. If you propose your model/interpretation AFTER looking at the data collected, it's very likely that the "statistically significant" results were caused simply by data snooping. E.g. if you have 100 50-dimensional random noise samples, you can always selectively keep 5 dimensions and get a statistically significant result at the 10% level.
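Both mistakes above can be reproduced with a small simulation. The sketch below is only an illustration under stated assumptions, not anything taken from Freedman's book: the function names, the noise levels, the random outcome variable added to the data snooping setup, and the approximate critical t value of 1.66 (two-sided 10% level, 98 degrees of freedom) are all choices made for this example. In the confounding part, A drives both B and C, so B and C look correlated until A is removed. In the data snooping part, pure noise features are tested against a pure noise outcome, and roughly 10% of them should pass a 10% significance test by chance.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def confounding_demo(n=5000, seed=0):
    """A causes B and A causes C; B and C never influence each other.

    Returns (raw, adjusted): the correlation of B with C, and the same
    correlation after subtracting A from both. Subtracting A directly is
    only possible because this simulation knows the true coefficient (1).
    """
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [ai + rng.gauss(0, 1) for ai in a]
    c = [ai + rng.gauss(0, 1) for ai in a]
    raw = pearson(b, c)
    adjusted = pearson(
        [bi - ai for ai, bi in zip(a, b)],
        [ci - ai for ai, ci in zip(a, c)],
    )
    return raw, adjusted


def count_spurious_hits(n_samples=100, n_dims=50, t_crit=1.66, rng=None):
    """Count noise features that look 'significant' against a noise outcome.

    t_crit=1.66 is an assumed approximation of the two-sided 10% critical
    value for n_samples - 2 = 98 degrees of freedom.
    """
    rng = rng or random.Random()
    # Convert the critical t value into a critical correlation.
    r_crit = t_crit / math.sqrt(n_samples - 2 + t_crit ** 2)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_dims):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(feature, outcome)) > r_crit:
            hits += 1
    return hits


def average_spurious_hits(trials=50, seed=0):
    """Average number of spurious hits over several independent trials."""
    rng = random.Random(seed)
    return sum(count_spurious_hits(rng=rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    raw, adjusted = confounding_demo()
    print("corr(B, C):", round(raw, 3))
    print("corr(B, C) with A removed:", round(adjusted, 3))
    print("average 'significant' noise features out of 50:",
          average_spurious_hits())
```

With these settings the raw correlation between B and C should come out near 0.5 and drop to about zero once A is removed, and on average about 5 of the 50 noise dimensions should pass the test. Keeping only those dimensions after looking at the data is exactly the selective step described in item 2.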
Bryan - Jan 19, 2006 03:49:19 pm
To add to what Jingtaow said--I have read Innumeracy several times and recommend it as a discussion of what your average person lacks in competence at interpreting quantitative information. For me, the greatest challenge in visualization has been guessing the level of sophistication of my audience so as to best communicate my point while neither insulting nor losing anyone.
Noaa - Jan 19, 2006 05:56:18 pm
I think it's pretty ironic that the view of graphical information in the US is that it's for the dimmer part of the population, and thus informative relational graphs are largely excluded from publications geared toward the more educated public. It's ironic because while the US views them as largely for the unsophisticated, the very graphs that are looked down upon in the US are used by foreign countries that the US itself considers to be intellectual countries. One would think that the reverence for the Japanese grasp of statistics would make for an eagerness to adopt their graphical methods, but in fact the opposite goes on. Pretty strange...
Cynbot - Jan 23, 2006 06:23:25 pm
About "The Eyes Have It": The study described here: http://www.eurekalert.org/pub_releases/2005-11/uoo-dds111805.php seems to confirm that filtering is indeed a crucial aspect of cognition. I agree that the computer is the right tool to aid us in navigating to our goal, and the design of the algorithms used to sort, analyze, and interpret the data will ultimately prove to deepen our consumption of the abundance of data in our lives.
Brien - Jan 23, 2006 07:13:16 pm
That people warmed to the water flow diagram mentioned in "The Eyes Have It" is not surprising, given their dismal alternative of text. However, I think the water flow approach does not fully use its visual potential. Although conceptually simple, it is probably the lowest-level construction mentioned. Like its analog in the signals and systems field (the flow diagram), it is an assembly language of sorts. Even though filters can be composed into higher level filters, the system will always retain its sequential, logical nature. It's been my experience that most people aren't for this ... I'd expect the more visual approaches (like the Venn diagrams that deal with overlapping areas, and possibly the cool looking InfoCrystal) to be more intuitive to most. Maybe they're perceived as too simple, or maybe they are just too hard to execute.
Ryanaip - Jan 23, 2006 10:03:36 pm
The Card & Mackinlay reading, like many other visualization readings I've seen, references Film Finder. However, I've never actually seen a working version, and I'm not convinced that I'd find the system useful for actually browsing for movies. For example, when I'm in the video store looking for a movie, I find it best to wander and look at a few random movies. Often something will catch my eye and either look interesting or at least send me off in a new direction. It doesn't seem that Film Finder provides the right kind of interface for such associative browsing, since it only allows filtering on broad genres of movies or on specific actors/directors/years. Does anyone know of a version online anywhere?
Raymond - Jan 24, 2006 12:59:44 am
Looking at the visualizations in chapter 1 of Tufte's book, I realized the ones that appealed to me the most were not the bars, lines and graphs, but those that showed pictorially how time progresses (much like today's time-lapse photos and movies). I also liked the one that showed the relationship of the prices, wages and reigns of British kings and queens, because not only was it visually appealing, but it was also precise and simple. Many of the great visualizations out there display a wealth of information in very little space while maintaining efficiency, and that's what Tufte considers the "Principles of Graphical Excellence".
AaronHoover - Jan 26, 2006 11:28:34 am
In his conclusion to chapter 3 on Sources of Integrity and Sophistication, Tufte claims that "much of the world these days is observed and assessed quantitatively." Initially, I simply accepted it, but as I thought more about it, I questioned whether that's actually true. Doubtless, we are assaulted with a much larger volume of information on a daily basis, but I'm not sure if I agree that any more of it is quantitative today than it was in the past. Certainly science and engineering have become more quantitatively oriented (genomics, climatology, computational fluid dynamics, etc), but for the average person, is it really true that "much of the world" is observed quantitatively? I think I'm inclined to argue that most of the information we use to observe and perceive the world is basically qualitative and requires a certain amount of processing or analysis to extract quantitative observations or assessments.
Yi-Tao - Jan 26, 2006 01:52:08 pm
As I understood it, Tufte's point wasn't that there is more information now than there was before, it's that we have an easier time finding information. Thus, there is so much accessible information that it has become harder for us to discern the important information from the trivial. In essence, this is why we are relying more on visual aids. The size of data is daunting and we simply don't have the time to go through all of it.
Maneesh - Jan 29, 2006 10:48:26 pm
Jingtaow - Huff and Geis' book is a classic. I haven't seen Freedman's book, but the problems of confounding and data snooping definitely happen all the time.
Cynbot - Filtering is a huge part of cognition. The key is to design visualizations that help people focus on the important data.
Ryanaip - Unfortunately I don't think there is an online version of Film Finder. But the Spotfire software essentially grew out of the ideas presented in the Film Finder papers.
Nchentan - Jan 30 2006, 1:46 am
In "The Structure of the Information Visualization Design Space", I notice that in tables 6 and 7 the V column is not marked, while I would think that the programs should allow the user to fly by the virtual space, rotate the view around, or zoom in. All these actions seem to be a superset of zooming and panning the view in the scatter plot cases such as Film Finder and World Within World. What makes them different in the classification?
