The Purpose of Visualization

From CS294-10 Visualization Fa08

Lecture on Aug 27, 2008

Slides

Readings

  • Chapter 1: Information Visualization, In Readings in Information Visualization. Card, et al. (pdf)
  • Decision to launch the Challenger, In Visual Explanations. Tufte. (pdf)

Maneesh Agrawala - Aug 26, 2008 11:46:35 pm

Tufte explains how the decision to launch the Challenger could have been averted, had the Morton Thiokol scientists done a better job of presenting the information about the impact of temperature on O-ring damage.

More recently Tufte has analyzed the Columbia disaster. He found that once again a few scientists were aware that the shuttle may have been damaged during liftoff, but were unable to present their information to the key decision-makers in a way that would affect their decision.

Ketrina Yim - Aug 27, 2008 05:13:02 pm

I was rather surprised by the fact that the Challenger disaster could have been prevented with a more effective visualization of data. It certainly proves that miscommunication can have deadly results.

Upon taking a longer and closer look at one of the charts Morton Thiokol gave to NASA ("History of O-Ring Damage in Field Joints"), I thought about why I had difficulty decoding it. The reasons turned out to match most of the problems pointed out by both readings. First, though appropriate for the subject at hand, the little booster shapes were distracting and packed too closely together, making it difficult to tell each pair of boosters apart. It demonstrates the tradeoff between making a visualization appealing to audiences and making it clearly readable. In this case, too much effort was put into visually interesting shapes so the meaning was lost among all the black lines.

Second, splitting the chart into two sections meant that only the first portion had the visual code, so the variously shaded boxes were initially incomprehensible to me since I saw the second part first. It wasn't until I looked at the first section of the chart that I understood the cryptic labeling of the shaded squares inside each booster shape.

And it certainly wasn't comforting to see "Cannot be considered complete without the oral discussion" fine-printed at the bottom of the chart. Even if a chart is meant to accompany a presentation, it ought to be able to convey its information independently. If a visualization has to be verbally explained to get its message across, I don't think it can be called a visualization.

Scott Murray - Aug 28, 2008 09:48:14 pm

At my local bookstore today, I discovered The Ghost Map, an account of the 1854 cholera outbreak in London that was finally stemmed by John Snow's cartographic revelation. Several chapters of the book are all about Snow's process of first identifying the means of transmission and then making the case for it. His great insight was to use a visualization—the map—to convince others of what he believed to be true. Fortunately, despite the lack of visual thinking skills among the 19th century public, his map was persuasive enough to overcome the mistaken, yet popular, belief that cholera was transmitted via an infectious miasma.

On another note, I want to mention that I'll be teaching a workshop on Processing, a free, open-source programming environment that's great for data visualization. You may find it useful if you don't have any programming experience but would like to dip your toe in the water and start exploring possibilities for your final project.

Nicholas Kong - Aug 29, 2008 04:12:21 pm

I thought the Card et al. reading was a good introduction to the fundamentals of information visualization. The uses of visualization and how they support understanding were well-codified and I could not think offhand of an example visualization which fell outside their categories.

The Cost-of-Knowledge function seems an interesting and effective way to judge the efficacy of a visualization; however, the measurement of this function in some instances could pose obstacles. For example, it may be difficult to determine and/or quantify how much information one could glean from a stock trace in a fixed amount of time.

Tufte's work was a joy to read, not only due to his well-formed arguments and excellent prose. I found the layout of his text and images to be so unified that it was a sort of "visualization" as a whole. The footnotes and annotations were helpfully placed very near their reference points; the images were again adjacent to the text which they supported. In this way the layout supported a sort of memory extension, as I found I could quickly assimilate the relevant supporting information while not having to skip back and forth.

Ketrina Yim - Aug 30, 2008 12:22:55 pm

Indeed, textbook publishers could learn a thing or two from Tufte. Some textbooks are designed to be little more than vessels for information, and reading them becomes tedious work that requires re-reading to retain the facts.

James Hamlin - Aug 30, 2008 06:48:43 pm

I'm curious if the visualization community has had any interaction with George Lakoff's work on metaphor theory. Engaging problems of visualization in explicitly Lakovian terms seems like it might be fruitful. Visualization is in its essence tied to human psychology, and the key thesis of Lakoff's and his colleagues' work is that almost all thought and reasoning, especially higher level abstract thought, relies on basic metaphors. A simple example in visualization: a line graph showing change in quantity over time, where more is up, less is down, and time is represented in space. Conceived in such terms, the task of a visualization is to provide a schema for thought - a metaphor - that leverages schemas with which the viewer is familiar, tying them to novel information but allowing the already existing models of inference related to that schematic structure to be used in understanding and reasoning about the new content. Thus the task of visualization becomes identifying and constructing metaphors that lend the data an apt semantic structure (for some definition of apt).

Michael So - Aug 31, 2008 01:11:42 pm

I am curious to know what the engineers at Thiokol were thinking when they made their 13 charts. Why did they only compare a sample of 2 cases and leave out 22 other cases? Why didn't they put a big focus on temperature being the main culprit for O-Rings losing their resiliency? I guess it's because the charts were made in just 3 hours; the 3 hours may have been too short a time to make an effective argument that would have convinced NASA to postpone the launch. Perhaps if they had more time, a better visualization could have been made that expressed a clear link between temperature and O-Ring damage. Or they at least might have realized how wrong their 13 charts were at displaying what they wanted to prove.

I found it interesting in the Card et al. reading how visualizations can make us smarter (and also stupider). It all depends on how the visualization works. A visualization can expand our memory and reveal relationships and information that would be difficult to notice without some sort of visual aid. The distinction between a Data Table and a table presentation was sort of confusing and new to me. It seems that "data tables" are tables structured in terms of cases and variables, where one is represented by columns and the other by rows, or vice versa. As for the example of the table presentation in Table 1.8 in the reading, I guess it is not a "data table" because both the rows and columns represent the same set. I am still a bit confused about the distinction.
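The cases-and-variables structure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the reading: the names `cases`, `variables`, `to_data_table`, and `transpose` are hypothetical, and every value is made up.

```python
# Hypothetical cases: every name and value below is made up for illustration.
cases = [
    {"case_id": "c1", "category": "red", "year": 1998, "size": 12.5},
    {"case_id": "c2", "category": "blue", "year": 2003, "size": 7.0},
    {"case_id": "c3", "category": "red", "year": 2007, "size": 9.25},
]
variables = ["category", "year", "size"]


def to_data_table(cases, variables):
    """Return one row per variable, with one entry per case in each row."""
    return {name: [case[name] for case in cases] for name in variables}


def transpose(table):
    """Flip the table back: one record per case, keyed by variable name."""
    names = list(table)
    case_count = len(table[names[0]])
    return [{name: table[name][i] for name in names} for i in range(case_count)]


data_table = to_data_table(cases, variables)
for name, row in data_table.items():
    print(f"{name:>8}: {row}")
```

Whether variables run along rows or columns is only a matter of orientation, which is why `transpose` loses nothing. What makes this a data table in the sense described above is that one axis is a set of cases and the other is a set of variables.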

James Hamlin - Sep 01, 2008 06:38:30 am

And of course, upon doing the reading, I find metaphor mentioned in the first paragraph, though there it is not with the meaning I had in mind. The authors suggest that there is a special relationship between thought and vision, citing the ubiquity of visual metaphors for cognitive processes as evidence. Indeed, vision has a special place in thought. But so does manual activity - understanding is also 'grasping.' If one stops thinking about something one might 'set it aside.' The brain uses the 'logic' of our embodied lives to perform higher level cognitive tasks. A visualization can suggest such a metaphor, bringing out certain relationships among data, or becoming an external representation of it, extending memory or becoming an apparatus for 'external cognition' - external computation, inference, etc. The authors define visualization as "the use of computer-supported, interactive, visual representations of data to amplify cognition." This is analogous to how I imagine a Lakovian conception of thought might be expressed: the use of _embodied_ representations to _enable_ cognition. There is significant affinity here.

Anyway, those are the theoretical/foundational concepts that came to mind while doing the reading. Also interesting was the formalization of the series of 'projections' that bring raw data to a visualization. Data tables (pretty much identical to those found in a DBMS) naturally serve as the rock bottom representation above the 'raw' data. The section on 'spatial substrate' I found most interesting. It is worth noting, I think, the perhaps unexpected richness of the substrate. At its simplest, a visualization has space and color to work with, since at the lowest level a visualization is a discrete 2D grid of color values. But given the realities of the human perceptual system, more interesting features can become 'basic' (in the sense that they are immediately and automatically interpreted by the visual system). Shape, proximity, lighting, hardness of edges, and more can become base features used alongside position and color to represent information.

Simon Tan - Sep 02, 2008 03:03:44 am

The "View of Sites on the World Wide Web" on page 13 (Figure 1.17a) of Readings in Information Visualization reminds me of the infamous XKCD "Map of the Internet" (http://xkcd.com/195/) or the "Map of the Blogosphere" (http://datamining.typepad.com/gallery/blog-map-gallery.html).

These kinds of visualizations do fit in with Card's description of information visualizations as tools for amplifying cognition. However, I can't help but feel that there is no clear *productive* purpose to these particular visualizations as there is with the other visualizations on that page. People make and view visualizations like Figure 1.17a primarily for their "interest" factor, it seems, and not really to aid in navigation or understanding of the Internet or blogosphere.

So to me, it seems that there is a space for visualizations as a sort of art form in addition to all the practical reasons for their existence. In either case, these visualizations do live up to their definition to amplify cognition; it is just that their purpose beyond that differs.

Nivay Anandarajah - Sep 02, 2008 03:21:36 pm

One little tangent off the reading: I found their description of the Navigation Chart rather revealing of a core consideration in developing visualization. Navigation Charts, as they note, sacrifice accurate projection of some physical property in order to cater to their true purpose of supporting specific calculations. We all know data can be presented in an infinite number of ways. By reading this and viewing other examples, I'm starting to tie the school of visualization design thought to the school of basic human-centered design thought (coming from a background of product design). Specifically, that means understanding the full needs of the user and the usability of the user's experience with the data, and developing a point of view that caters to a chosen purpose.

Yes, there are technical considerations to amplify cognition which will allow the user to have this perceptual inference towards this point of view. But I feel choosing this point of view in the first place is the backbone. As for the Challenger, it seemed the initial set of data was presented with almost complete indifference to a purpose or point of view, besides being technically unclear and misleading. If your data is presented for the purpose of predicting what could happen, then it's crucial that you realize you can't predict anything without showing trends or potential extrapolation/interpolation, which they clearly did not attend to. Anyway, I'm just making predictions as to the core principles in designing visualization. Guess I'll wait and see.

Razvan Carbunescu - Sep 03, 2008 12:06:20 am

I would want to agree with Michael So's argument about Thiokol's 13 charts (if they can really be called that, seeing that only a few actually provide data whereas some just provide conclusions/worries). While I agree that the argument could have been presented better to NASA, I doubt that there was enough time to check all 24 previous launches and previous tests and put in all the data that was required.

I also think that the article against the use of PowerPoint as a presentation tool instead of technical reports is a bit misleading, because it focuses only on the one part of PowerPoint that deals with presenting information in a succinct text format. I agree that the PowerPoint slides presented for Columbia's launch were badly designed, with excessive hierarchies, but that doesn't necessarily mean that the tool is wrong. In particular, the example he gives of the emails sent between engineers, and how they were using 90% sentences organized into paragraphs, goes against what he said in the earlier Challenger text about the need to present information in a way understandable to an outsider.

David Poll - Sep 03, 2008 02:52:45 am

Having taken the Challenger decision as a case study in a business class (masquerading as a racecar that breaks down on cold days, and the potential for losing sponsorship if the car breaks during a crucial race), it becomes abundantly clear how vital visualizations are. It seems to me that it's extremely easy for engineers to get so caught up in presenting data that they lose perspective on trying to communicate something with that data. In the case of the Challenger disaster, Thiokol's original recommendation to postpone the launch came from intuition and understanding of the problem by knowledgeable and qualified engineers, but the events that followed might have been prevented if those engineers had presented relevant, "crystallized" data (as the other reading suggests) that packaged the information in order to make their point. The first table says it all: there's far more information than necessary to convey the point that O-ring failures are related to temperature. For me, Tufte's eventual graph of failures per temperature would have sealed the deal in an instant, because the data is so clear when presented in that way.

As for the reading on Visualization, I was particularly interested in the various classifications of data and how they can be mapped appropriately. Nominal, Ordinal, and Quantitative data each lend themselves to different types of visualizations (and can be transformed). I think I knew some of this innately, but to spell it out and formalize some of the concepts makes communicating why data should be arranged in certain ways so much simpler. My favorite observation here, though, is that this classification is, in and of itself, a transformation from nominal data (the infinite set of potential variable types out there) into ordinal data. This brought about a realization for me that virtually all communication of information is either the raw presentation or a transformation into one of these types. The thing I hope to get better at is knowing which transformation is appropriate for conveying particular points, since I think it will allow for more effective communication.
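As a rough illustration of the three variable types and of transforming one into another, here is a minimal Python sketch. It assumes the usual measurement-scale view (nominal values can only be tested for equality, ordinal values can also be ordered, quantitative values also support arithmetic). The names `MEANINGFUL_OPS` and `to_ordinal`, the cut points, and the sample readings are all hypothetical.

```python
import bisect

# Which comparisons are meaningful on each scale (an assumed, simplified view).
MEANINGFUL_OPS = {
    "nominal": {"=="},
    "ordinal": {"==", "<"},
    "quantitative": {"==", "<", "-"},
}


def to_ordinal(values, cut_points, labels):
    """Bin quantitative values into ordered classes.

    cut_points must be sorted ascending; a value equal to a cut point
    falls into the class above it.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect.bisect_right(cut_points, v)] for v in values]


# Made-up quantitative readings, binned into three ordered classes.
readings = [5, 12, 20, 31]
classes = to_ordinal(readings, cut_points=[10, 20], labels=["low", "medium", "high"])
print(classes)
```

The transformation deliberately throws information away: after binning, the classes can still be ordered, but differences between them are no longer meaningful. Choosing the transformation is therefore a choice about which point to make.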

Chris - Sep 03, 2008 01:06:35 am

I'd agree with Razvan's remark about time being a significant factor. Tufte's article discusses all of this taking place the day before the launch is to happen. Ideally the engineers who designed the rocket would have addressed the question "what range of temperatures are safe for launch" long before a launch was suggested for a suspiciously cold day. In particular, a theoretical analysis should have been done during design of the shuttle, and this model should have been re-evaluated after each launch incorporating data from the additional launches. This way when a dangerous launch was suggested, an unequivocal answer could have been presented up the chain.

Witton Chou - Sep 03, 2008 03:54:52 am

I have to agree with David on the three basic variable types and how well they work. I'm sure we've all prepared some sort of visualization of data, whether in the form of a graph or other diagram, by mentally going through various options and deciding which ones might be worth trying out (or clicking on the previews in Excel to see what best represents your data). But to be able to classify these basic types is not something I would have pondered prior to this class.

I also found the study of human perception, the ideas of controlled and automatic processing, and exploiting focus and periphery perception to be very interesting. I look forward to exploring how the human brain processes visual and textual information together, as we have now seen some potential effects bad visualizations can lead to.

A question the Challenger chart visualization issue poses for me is how bad a visualization has to be for the creator to realize that the visualization is not suitable. As the creator, he would already understand what case he is trying to make and thus any rough depiction of the situation would, in his mind, convey his objective. Thus, without being able to explain all the relevant information face to face, how could one gauge the effectiveness of his visualization without testing on people? And how many people, and in what fashion, would one need to test the visualization on to gauge its effectiveness?

Karl Rohe - Sep 03, 2008 12:51:51 pm

At the extreme, some science fiction writers might like to predict that computers will take control of humans. At the less extreme, machine learning folks like to make black boxes which spit out answers. Their solutions are mostly devoid of humans, except for researchers and data entry people. However, it seems that the future is in humans using computers to expand their abilities.

The reading speaks about how visualizations aid our understanding. The interactive process between users and computers to make visualizations can be extremely useful in knowledge discovery, making decisions, and convincing people of the conclusions. This is strong evidence of the gestalt between users and computers. It is a reason for optimism!

Maxwell Pretzlav - Sep 03, 2008 01:15:33 pm

@Simon Tan: Visualization has indeed been incorporated into different sorts of art. You may be interested in the book You Are Here which collects a number of different sorts of maps and diagrams created for artistic, rather than intellectual or academic, purposes. Additionally a number of "new media" artists have incorporated scientific visualization into their art, a well known example being John Maeda who has created art-visualizations of things like the number of springs left in your life and tax dollars spent on different agencies.

Regarding the reading, I found the most important overall point to be how visualizations can amplify or impede our inherent cognitive abilities. The Card, et al. reading showed how best to organize and manipulate data to create visualizations that best assist human cognition. Tufte's eloquent account showed how badly organized and visualized data can disrupt and deter people from reaching correct conclusions. While the Card reading achieved this by showing some systematic data organization techniques and myriad examples of different visualization styles, I would be interested to know if there have been studies done using different well-established visualizations on common data which demonstrate how some styles of visualization aid cognition more than others.

HeatherDolan - Sep 03, 2008 02:09:26 pm

Here are some interesting visualizations of statistics by a photographer. He's taking statistics related to American consumption and putting them in a visual form we would not usually see in reality. Their primary intention or purpose probably falls under the category of inspiration.

http://www.chrisjordan.com/ See the Running the Numbers set.

Jeff Bowman - Sep 03, 2008 02:45:23 pm

(actually submitted at 12:45pm; the clock seems to be in EDT)

The idea of "knowledge crystallization" is a particularly good one for my view of Visualization and information design in general. It is relatively easy to see the connection, from "showing your work" in seventh-grade math classes, to giving presentations in the real world, to people who claim that they had a dream and woke up with a vision of how something could work. Perhaps inventing the flux capacitor is a different notion than this, but the point stands: Visualizations capture thought.

I was also amazed at the idea that having a surplus of information steals attention, but also how fluid that correlation is. Richard Feynman's demonstration was not scientific, but it was visual and clear; the data in the investigation was scientific, but not clear at all.

I'm also led to wonder whether artificial intelligence could ever progress to the point of recognizing data better than a human with an appropriate visualization. The second reading (Card et al) noted that humans are faster at picking up patterns, when there is the right amount of data, but that too little data could make the connection difficult, and too much could make the data difficult to find. Is there any way that the science of visualization and information design—and the qualitative decisions about what makes data "unusual" or "worthy of note"—could contribute to pattern recognition to determine when something is out of an acceptable range? Does AI pattern recognition have a place in the automatic creation of visual displays?

Calvin Ardi - Sep 03, 2008 02:48:17 pm

One could perhaps argue that visualization is certainly an art within itself. Although some visualizations specifically only "convert" numbers into something easier to examine, there are a lot of artistic aspects that get involved with creating a well-drawn visualization. Some of the most interesting visualizations I've seen are the ones created by The New York Times. They are drawn rather simply (color-wise and perhaps in the complexity of the details), but the relationships are clearly seen. Even better are the ones that can be interacted with on the online edition, which Karl described a bit above (about interactions).

Sarah vanwart - Sep 03, 2008 05:06:06 pm

I wanted to comment on a few things in the Card et al. article:

First, one of the points that I found fascinating was the concept that humans have a terrible short term memory, which I really hadn't thought much about. A picture or figure actually does allow me to compensate for my own information retention deficiencies. If I were to simply pick four or five data points out of an unordered set and try to arrange them or analyze them without a pen and paper, I really don't think that I could! Without the luxury of "hold[ing] partial results outside the mind," my ability to make sense of even simple data would be crippled.

Second, I liked the formal analysis of how "knowledge crystallization" occurs. Though I acquire knowledge from information every day, it's so informal and automatic that I seldom think about the process itself. Identifying how one specifically goes about synthesizing information by (1) collecting it, (2) determining ways in which it should be analyzed (creating a schema), (3) validating the information against the schema, (4) narrowing down relevant observations, (5) further narrowing the framework for analysis, and (6) "packaging" the results will certainly help me to construct effective visualizations.

Finally, regarding the visualization gallery: I was able to browse through some (but not all) of the visualizations, and was blown away by the "Myths of the Developing World" video (http://video.google.com/videoplay?docid=4237353244338529080&q=genre%3Aeducational+myth). What an inspiring visualization -- finding the meaning and patterns in millions of records of multi-variate data, and displaying it all so simply and elegantly for the world. That was really something, and really brought home to me how powerful a visualization can be.

Ljuba - Sep 03, 2008 8:32:32 pm

I wanted to comment on Tufte's criticism of Feynman's public experiment with the ice water. While I completely agree that the demonstration lacked any controls and could thus never be used to make any definitive scientific statements, I believe Feynman's approach to be justified for the purposes of media consumption. By that point in the investigation, temperature had already been established as the cause of the O-ring failure. Feynman's "experiment" was really more of a demonstration of the effect of cold on O-ring resiliency. Ultimately, he needed to illustrate his point in as simple a manner as possible so that the media would pick it up and relay it accurately to the public. A properly controlled, definitive scientific experiment wouldn't have made the point as effectively. I agree with Tufte that this was a lost opportunity to teach the public about the scientific process, but on the whole I believe Feynman made his point well.

Seth Horrigan - Sep 05, 2008 01:19:29 pm

"Visualization: the use of computer-supported, interactive, visual representations of data to amplify cognition". I find it odd that none of my classmates have questioned Card et al.'s definition. A visualization may incorporate those aspects, but while they may be sufficient, they are not necessary. The graphics that we saw from Minard and Playfair are excellent visualizations of data that do "amplify cognition", but they are not interactive or computer-supported. Likewise, visualizations that we will create on a computer may be "computer-supported" without being interactive. Although his definition describes one class of visualization, I do not think it right to define all "visualization" in this manner.

I do find Card's classification of visualizations into Information Visualization and Scientific Visualization interesting and useful. Also, I like his emphasis that different types of data require different visualizations. There is no one magic visualization that is best for all one-dimensional or all two-dimensional data.

Matt Gedigian - Sep 08, 2008 02:38:22 am

I found the section on Information Chromatography interesting. The claim is that a human-machine symbiosis is necessary because of the adaptability that humans provide (Card, p. 10). Unfortunately humans are only able to cope with low dimensionality representations. When dealing with datasets that might have thousands or millions of dimensions it's amazing that you don't obliterate the relevant information when doing such extreme dimensionality reduction (p. 17). I'm curious what type of 'processing' humans are doing in these scenarios. Are they employing some extensive domain knowledge (e.g. understanding psychology of phone scammers)? Or is it more of a visual perceptive process? If it's just discerning some pattern in the data then I wonder when computers will be able to do it. Either by standard statistical approaches (e.g. teh google's classifiers)? Or maybe some funky statistical topology (e.g. Topological Methods in Scientific Computing)?


