Data and Image Models

From CS294-10 Visualization Fa08

Lecture on Sep 3, 2008

Slides

Readings

  • The eyes have it, Shneiderman. (html)
  • The structure of the information visualization design space. Card & Mackinlay. (ieee)
  • Chapter 1: Graphical Excellence, In The Visual Display of Quantitative Information. Tufte.
  • Chapter 2: Graphical Integrity, In The Visual Display of Quantitative Information. Tufte.
  • Chapter 3: Sources of Graphical Integrity, In The Visual Display of Quantitative Information. Tufte.

Optional Readings

  • On the theory of scales of measurement. Stevens. (jstor)

Scott Murray - Sep 02, 2008 10:07:20 pm

I'm trying to access the IEEE reading from off-campus. Does anyone have any ideas on how to do that? (If you're able to download the PDF, maybe you could email it to me.) Thanks!

Contact me via email here.

Nicholas Kong - Sep 03, 2008 03:33:43 am

I think there are two ways to access library resources from off-campus: proxy server and VPN. Hope that helps!

Seth Horrigan - Sep 03, 2008 11:51:37 am

I enjoyed Shneiderman's mantra: "Overview first, zoom and filter, then details-on-demand". While computers may be used to produce advanced static visualizations, the aspect that really interests me is the interactivity component. The ability to explode a diagram, drill down on a statistic, or adjust the categories displayed is an invaluable tool that could not exist without modern electronics. Hopefully in the future, these aspects will be integrated into our everyday "paper" (a concept that is already becoming a reality with e-ink and its ilk), but for the present the desktop or mobile computing device is the amazing tool that allows us to dynamically visualize (or provide "details-on-demand") as the individual requires.

Also, Tufte's book is great. It is the grown-up's version of a picture book. Of course the examples of graphical excellence he provides are excellent, but the items that really caught my eye were more the small details he elaborates. For example, in Graphical Integrity he points out how the disingenuous artist in one graph uses parallelepipeds placed in front of others in an optical plane to give the impression that they "tower" over the earlier budget. When contrasted with a straight bar graph, the visual effect becomes very apparent, but until then it just seems like a slightly 3D image that illustrates a recent massive growth. I look forward to identifying dishonest illustrations in the future. I especially enjoyed Tufte's observation that if the barrel volume was measured in one of the illustrations, a 454% increase was indicated as a 27,000% increase for a lie factor of 59.4 - whee!
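
The lie factor arithmetic can be checked with a small sketch. It assumes Tufte's definition (size of the effect shown in the graphic divided by size of the effect in the data); the function name `lie_factor` is only an illustrative choice, and the two percentages are the ones quoted in the comment above.

```python
def lie_factor(effect_in_graphic: float, effect_in_data: float) -> float:
    """Ratio of the effect shown in the graphic to the effect in the data.

    A value near 1.0 means the graphic represents the data faithfully;
    values far above 1.0 mean the graphic exaggerates the change.
    """
    if effect_in_data == 0:
        raise ValueError("effect in data must be nonzero")
    return effect_in_graphic / effect_in_data


# Percent increases quoted above for the barrel illustration.
barrel = lie_factor(27_000, 454)
print(f"{barrel:.2f}")  # prints 59.47
```

The unrounded ratio is roughly 59.47, which matches the 59.4 figure once truncated to one decimal place.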

Seth Horrigan - Sep 03, 2008 10:31:39 pm

In response to a question fielded today, this is an image depicting the geography of Napoleon's Russian Campaign.

As you can see, Minard's illustration depicts the geography of the march fairly well.

Also, Wikipedia gives nice details on the invasion: http://en.wikipedia.org/wiki/French_invasion_of_Russia

Ketrina Yim - Sep 04, 2008 11:27:42 am

I, too, enjoyed reading Tufte's book. I'm still surprised that this relatively short text contains so much information and advice regarding effective graph and chart design. Especially useful are the short sentences that condense the main ideas of a chapter's sections, such as the reminder never to use more dimensions in the chart than there are dimensions in the data.

Also, I never realized that in facilitating the chart-making process, the computer ended up being the source of bad visualizations for quantitative data. In the age of the one-click chart, people can easily ignore the importance of effective and efficient data presentation in the quest to make their charts visually attractive. Tufte makes this notion evident in the text and his disapproval of PowerPoint. Yet at the same time, as a graphic artist, I feel there is a need to fight visual boredom in charts; a chart is no good if no one wants to look at it. That is less of an issue in papers and scientific journals, but the visual fluff is sometimes what attracts my eye to a chart that I might ignore otherwise in a magazine or newspaper. Is it at all possible to add aesthetic elements to a chart without distracting from the data and reducing a chart's effectiveness?

Scott Murray - Sep 04, 2008 04:10:51 pm

Since I'm approaching this subject from a design (and not computer science) background, it was helpful for me to learn about Nominal, Ordinal, and Quantitative data types. While it's possible to create visualizations without an explicit understanding of the differences between these types, I think my future work will be improved, having now learned N, O, and Q.
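
One way to make the N, O, Q distinction concrete is to list which comparisons are meaningful for each type. The following is only a rough sketch: the names `DataType`, `OPERATIONS`, and `supports` are hypothetical, and the operation sets follow the usual reading of Stevens's scales rather than anything specific to the lecture.

```python
from enum import Enum


class DataType(Enum):
    NOMINAL = "N"       # labels, e.g. fruit names
    ORDINAL = "O"       # ordered labels, e.g. letter grades
    QUANTITATIVE = "Q"  # measured numbers, e.g. lengths


# Operations that are meaningful for each type. Each level adds to the
# previous one. Note: ratios ("/") only make sense for quantitative data
# with a true zero (ratio scales), not for interval data such as dates.
OPERATIONS = {
    DataType.NOMINAL: {"==", "!="},
    DataType.ORDINAL: {"==", "!=", "<", ">"},
    DataType.QUANTITATIVE: {"==", "!=", "<", ">", "-", "/"},
}


def supports(dtype: DataType, op: str) -> bool:
    """Return True if the comparison or arithmetic op is meaningful."""
    return op in OPERATIONS[dtype]


print(supports(DataType.NOMINAL, "<"))       # False: apples are not "less than" oranges
print(supports(DataType.ORDINAL, "<"))       # True: grades can be ranked
print(supports(DataType.QUANTITATIVE, "-"))  # True: lengths can be subtracted
```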

For example, one visualization faux pas that I've made in the past (and continue to make) is to use circles to represent one-dimensional, quantitative data. After yesterday's lecture and reading, it's now clear to me why this is a bad idea. Yes, quantitative data may be expressed as size, but, as Tufte outlines:

The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.

So, 2D circles are inappropriate for visualizing 1D data, since the two dimensions of the circle are not easily interpreted by the eye. (Does the circle's radius reflect the data value, its diameter, or the total area?)

I suppose a 1D point would be ideal, but even Tufte seems to concede that a 2D rectangle, as in a bar chart, may be appropriate, as long as the bars are uniform in width and that the 1D data is expressed in only one direction (usually height).

Source: Page 77, The Visual Display of Quantitative Information
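
The ambiguity between radius and area can be shown with a little arithmetic. This is a minimal sketch, assuming a simple linear `scale` factor; both function names are made up for illustration.

```python
import math


def area_when_radius_encodes(value: float, scale: float = 1.0) -> float:
    """Circle area when the RADIUS is made proportional to the data value."""
    radius = scale * value
    return math.pi * radius ** 2


def radius_when_area_encodes(value: float, scale: float = 1.0) -> float:
    """Circle radius when the AREA is made proportional to the data value."""
    area = scale * value
    return math.sqrt(area / math.pi)


# Doubling the data value under each encoding:
area_ratio = area_when_radius_encodes(2) / area_when_radius_encodes(1)
radius_ratio = radius_when_area_encodes(2) / radius_when_area_encodes(1)

print(round(area_ratio, 3))    # 4.0: the circle looks four times as large
print(round(radius_ratio, 3))  # 1.414: the circle is only about 41% wider
```

Under either choice, one of the circle's visible properties changes by a different factor than the data does, which is why the reader cannot tell which property to read.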

Ljuba - Sep 05, 2008 09:57:06 pm

"If the statistics are boring, then you've got the wrong numbers." - Tufte

While reading Tufte, I found myself casually agreeing with all of the points he made in chapters 1 and 2. This one sentence at the beginning of chapter 3 made me put the book down, stop reading, and think. It would seem that the notion that statistics and numbers are boring is so ingrained in our culture that even I (a scientist by training and someone who likes numbers) found this pithy statement bordering on subversive. Why do people think the numbers are boring? Could it really be as simple as associated negative experiences with math in school?

I don't have an answer to this question or a solution to this problem, but I thought this sentence was so insightful and so practically useful to those of us who want to visualize data. The first question we should ask ourselves when starting a new project is, "are these data interesting"?

Ketrina Yim - Sep 06, 2008 12:32:26 pm

Now my question is, if it really is about negative experiences with math in school, why do so many have these negative experiences? Surely there are just as many, if not more, who had trouble with other subjects, yet the complaints about math seem loudest. It may be that math is abstract, while the world people live in is concrete. This would serve to reinforce the value of quantitative visualizations. Putting the abstract into a format that can be perceived can help all members of the audience understand the statistics presented.

Michael So - Sep 07, 2008 02:58:05 pm

I am sort of unclear on what Tufte means when he writes "If the statistics are boring, then you've got the wrong numbers". What are "wrong numbers" and "right numbers"?

I did find it interesting, though, that for much of the twentieth century graphics were deemed to be for the unintelligent, and that the reason given for lousy graphics (graphics that lack integrity and sophistication) is that their creators are trained only in the arts and so lack data and quantitative analysis skills. I guess I can relate that kind of thought about graphics being for the unsophisticated to the general attitude toward television and books: that television does not challenge or exercise the mind as much as reading. So I guess I can see how graphics can be perceived as something made only for the simple-minded. However, that is a narrow-minded perception. Sure, graphics could be just shallow and decorative. But they also have the useful power to reveal data and can be used to effectively explore and analyze that data.

As was pointed out in Shneiderman's reading, visual displays make use of our human perception abilities. This basically means graphics have the power to quickly and effectively present information and relationships, encouraging our attention to important areas of interest in an attempt to show a revelation and amplify our cognition.

Simon Tan - Sep 07, 2008 11:39:23 pm

Tufte is one of the more delightfully blunt authors I have read; he doesn't hesitate to point out what he feels is wrong with the world's perception of information visualization, and does it while being frugal with his words and generous with his examples.

I found his analysis of "graphical sophistication" in our press versus our school textbooks to be an eye-opener. I have always felt there was something odd about the graphs and charts in newspapers and magazines that seem to always point out a trend in such cartoonish and exaggerated fashion; now I can question them knowing the tactics used to undermine graphical integrity. Tufte's creation of the "Lie Factor" was particularly humorous, and it is great fun to point at a bad visualization and note its Lie Factor.

While all of his commentary on the dumbed-down nature of the graphics in the press made sense, I felt that the attacks on the artistic community were particularly harsh. Is there really an 'artistic bureaucracy' that drives the downwards spiral of graphic sophistication? Perhaps, but I am sure there are artists and designers who use their skills while also respecting the nature of the data they work with.

Calvin Ardi - Sep 08, 2008 12:39:09 am

@Michael: I think what he's trying to say is that if the statistics one gets from the numbers (simple ones, e.g., mean, mode, variance, etc.) aren't that interesting, then perhaps the data being collected is somewhat trivial and may not convey information that is useful or can be used to back an argument in some way. If the data isn't interesting, then getting statistics about the data may not reveal anything interesting, either.

@Ljuba: In addition to questioning whether the data is interesting, I'd argue that it'd be useful to ask what can the data (or visualization of) do for us? As we discussed for a bit during the first lecture, we can use visualizations for a number of different purposes. Thus, the data viewed superficially may not be interesting, but the visualization of it may provide for some usefulness.

Regarding the readings, I found Tufte's readings to be pleasantly succinct and riddled with examples throughout the years. Reading the preface gave a bit of context for Tufte's typesetting and organization of the chapters, especially about inserting images inline with the text whenever possible (I find most textbooks refer to figures or diagrams that may be pages away from the actual reference), yet still making good use of the margins on the side for references or more figures.

More notably, the concepts are conveyed throughout each chapter and summarized in bullet-point format at the end, and continue to make an appearance in the later chapters. The concept of "graphical excellence is nearly always multivariate" (p.51) leads to the next chapter's concepts of misleading graphics ("Lie Factor", one dimensional data in multiple dimensions).

Interestingly, Tufte criticizes those trained solely in the graphic arts who create graphic arts with data as being inept with statistics and more focused on the presentation, but solutions don't seem to be offered as to how to remedy this.

Card and Mackinlay's paper assigns more technical terms and definitions to graphics and visualization, something that Tufte minimizes (perhaps in order to reach a broad audience and realize the notion that visualization is more of an art than a set of rigid rules and definitions). Nonetheless, it is another perspective on images and how to classify or structure information visualization (e.g., making distinctions on data values).

Nicholas Kong - Sep 08, 2008 06:02:58 am

I'd like to echo Calvin's explanation of what Tufte means by "if the statistics are boring, you've got the wrong numbers." This goes hand-in-hand with Tufte's contempt for chartjunk and ornate, inappropriate decorations: if you need to turn your data into gaudy graphics to draw attention, the data is not worth reporting because the conclusion that you are presenting from your data is not sufficiently noteworthy.

I really took to heart Tufte's principle of not using more graphical dimensions than data dimensions. The "lying" graphics he presents are on the whole astoundingly difficult to read precisely because of these games with perspective, area, and volume. I found Tufte's sarcastic decorated graph on pg. 59 quite funny, although his florid frame detracts from retention of the data; not that he was seriously suggesting such a presentation.

Sarah Van Wart - Sep 08, 2008 12:40:33 pm

In Wednesday's class, the point was raised that although many variables could technically be expressed in a single visualization, attention must certainly be paid to the ability of the user to consume a visualization and understand the information that the author is attempting to convey. I think that the magic of the visualization is not only the ability for the engineer to construct a smart and accurate visualization, but also for the audience to be able to easily (or at least eventually) understand it.

I just wanted to comment briefly on the subjectiveness of making sense of visualizations. Though we can employ a variety of techniques in combination to assist the viewer in understanding particular concepts, it's important to be aware that each member of an audience has a different way of processing the visualization. For example, for me, I found the "Map of the Market" visualization (http://www.smartmoney.com/marketmap/) to be very abstract, and it took me a while to understand what all of the boxes and colors meant, though I did eventually get it. There was also a pretty abstract visualization in Tufte's "Envisioning Information" book on p. 25 that attempted to give a graphical representation of all of the train routes in Java. I'm sure if I studied it for a while, I'd eventually understand it, but it effectively made me blind. And the stereogram was also lost on me -- merging two images side by side into one with my mind? Hmmm. On the other hand, the Napoleon visualization, the map of Manhattan, and some of the other bar charts and ordered spreadsheets made sense to me.

One final example of audience subjectivity in visualization interpretation: for assignment 1a, something I thought to be a "good visualization" turned out to be Ljuba's "bad visualization." The colors, groupings, and proportions really made sense to me, but perhaps it just appealed to the way my mind works -- maybe Ljuba likes stereograms and market maps :). Perhaps I overlooked some absolute or fundamental concepts, but I think what's really going on is simply two different brains at work, each making sense of visual representations differently.

Karl Rohe - Sep 08, 2008 02:40:08 pm

The thing that most struck me about the Tufte reading was how far back he goes to find interesting examples (p21 and p28). The graphic that I like most is the star fish rolling over on p36. In words, this would take several sentences to describe. However, the process is not so complicated, as the graphic shows with clarity.

In the next 2 chapters he appears to pick on the New York Times for their graphics' lack of integrity. I have not seen any bad graphics in the New York Times in my life of reading it. It appears he has gotten through to them. However, he still needs to reach the Economist. Although theirs are rarely dishonest (See my assignment 1a for an example), they are horribly uncomfortable to decode. Has anyone had a similar experience?

Yuta Morimoto - Sep 08, 2008 02:45:12 pm

While reading Tufte, I came to understand the comment "If the statistics are boring, then you've got the wrong numbers." From my experience, I partly agree with it. Many statistics show only meaningless or intractable information, so we remember only our negative experiences with math in school. However, I think this problem depends on the visualization, since some statistical information can be presented in an appealing form. With such visualizations, I think people will rarely be bored.

Matt Gedigian - Sep 09, 2008 01:51:43 am

As others have mentioned, Tufte's statistics about the use of relational graphics in different publications were interesting. I find it surprising that the publications not only have bad visualizations, but that they have so many of them. Magazines like Entertainment Weekly, for instance, have tons of silly graphics which have information content on par with the "9 out of 10 employees eat in the cafeteria" graphic from page 81. I'm not sure I understand their motive for using visualizations at all. Is it just an excuse to add something graphical (besides an ad)? When I was looking for visualizations I flipped through some issues of the New Yorker and didn't find any (good or bad). Are their dozens of tiny cartoons filling the role that other magazines have misguidedly turned to infoviz for?

I was surprised to see that the Wall Street Journal had zero relational designs in his sample. Tufte will no doubt be pleased to learn that they've moved on to data rich and completely misleading figures.

Maxwell Pretzlav - Sep 09, 2008 10:17:16 pm

While Tufte, as always, excels at to-the-point writing that's a pleasure to read, informative, and well-illustrated, I'd like to comment on the other two readings assigned here. I agree with Seth that Shneiderman's mantra "Overview first, zoom and filter, then details-on-demand" rings home. As I think about it, all the well-designed information-intensive software I love to use follows that mantra: Google (Search results, drill-down, details by clicking a link), Adobe Lightroom (quite literally; Lightroom has a grid overview and filter mode, and zoom features which display details about individual photos), iTunes (overview in a list, search bar to filter, get info for more details), etc. It's good to see software designs have caught onto this idea.

I must confess I had major issues decoding the deconstruction-tables from the Card and Mackinlay reading. While the prose and images helped me understand the different techniques employed in the many visualizations (although the low quality grayscale xerox did not), the grid-tables used to break down the different techniques were very hard for me to read. Not only did they have many different possible values for each cell, which were only described in prose once (and no easy visual key given) so I had to continuously flip back to the explanation to understand them, but it seemed the authors kept having to make up new variable values or new odd distribution rules to satisfy unusual techniques used in different visualizations. It struck me that this visualization technique itself lacked some polish and attention to make it better fit the domain it was being applied to.

Chris - Sep 10, 2008 01:31:04 am

@Karl: I definitely noticed a trend of Tufte "picking on" Time Magazine and the New York Times in the selection of bad examples. While the examples from Time Magazine didn't surprise me, I was surprised to see the examples from the New York Times, since I do occasionally read their webpage and don't remember seeing anything objectionable. I re-visited the New York Times online and, when tuned to it, found there to be a lack of statistical graphs. I did find one graph [1] which could have some complaints leveled at it (the distracting backdrop, perhaps the scale starting at $100). That said, I did not find any cartoon-ish graphics (like the ones from Tufte).

On a related topic, the cultural implications of the comparison of various popular publications on page 83 are somewhat disingenuous IMO. Take for example, the Economist, which is said to represent Britain. Though it is published in Britain, only 14% of its sales are in the UK, as compared with 54% of its sales in the US (Wikipedia). Additionally, The Economist has a very specialized and data-centric focus (as compared with the New York Times, Time Magazine, etc). I would be curious to see where other (theoretically) specialized publications would lie on this plot (Scientific American, for example).

Regarding the remark "If the statistics are boring, then you've got the wrong numbers," I think the connotations of the word "boring" are being focused on too much. I think that all (or most) of what was intended by this remark is that "if the statistics don't add anything to the story, that is, they are irrelevant, then you have the wrong statistics," which is pretty tautological.

Matt Gedigian - Sep 10, 2008 02:21:55 am

On the issue of Tufte vs. New York Times, it's not all animosity. He compliments their infographics, and collects some of his favorites here.

David Poll - Sep 10, 2008 02:43:11 am

The first sentence of the Shneiderman article says it all to me: "Information exploration should be a joyous experience..."

I love the attitude toward visualization, but it seems like the discipline just isn't there, even though the technology to enable really amazing visualizations is.

I think getting the abstractions right is extremely valuable for making exceptional visualizations accessible to the average developer/designer. As such, I really appreciated the breakdown of the 7 tasks and the various data types. What was missing to me, however, was the notion that while the abstractions are important, it all comes down to getting the level of abstraction correct. It occurred to me that Overview, Zoom, Filter, Details-on-demand, etc., are all just special cases of pivots and filters. But such an observation's not all that useful, since it doesn't guide visualization creation at all. I wonder if these 7 tasks are really specific enough (or perhaps not general enough). For example, where are the "composite" tasks, such as "Relationship-overview" or "Filter-relationships"? Maybe it would suffice to say that any pivot might have these tasks, and the "Relate" task is a means for accessing the various pivots.

@Sarah Van Wart I just wanted to second the comments about subjectivity of visualization quality. Just as we discussed that effectiveness of visual parameters (hue, value, size, etc.) is based on how the brain is able to perceive this visual information, the effectiveness of the visualization is probably related to how we process that information. Are we right-brained or left-brained? In my case, partial color-blindness makes certain visualizations either ineffective or confusing. What's to say that someone who is accustomed to (or even naturally/biologically inclined towards) different types of visualizations won't get more value out of them than others?

Witton Chou - Sep 09, 2008 05:20:23 pm

While it is very difficult to combine various data into one comprehensive visualization, one that is well done can prove to be very useful. Albeit some information may be lost in the process of encoding the data to a visual representation, the creator will have had a purpose in mind, which is ultimately what is expressed and the first attribute we notice.

I actually disagree with Sarah about Smartmoney's market map. I found it to be very intuitive and immediately gives the user a sense of how the market is doing and how much each company reflects the overall status of various sectors as a result of the use of hue to determine whether stocks went up or down and size of each company as a reflection of how their market cap influences various performance indices. It really gives the users a sense of how companies are performing without requiring the user to perform various analyses of the numbers to know how the market is doing - one glance will tell you most of what you want to know when you are limited on time.

Jeff Bowman - Sep 10, 2008 12:18:44 pm

@Ljuba - I think the quote "If the statistics are boring, then you've got the wrong numbers" speaks more to the deficit of attention than the quality of the visualization. While I see instances where the data can be the end in itself, most of the time it is the means in which to make a decision. I believe Tufte is making the point that visualizations are made to convey information worth conveying, and if there is nothing interesting to convey (or the visualization is poor) it only worsens the problem in deficit of attention.

In general, though, I think Tufte likes his examples, and dislikes nearly everything else. The New York Times gets hammered for its visualizations, but Tufte seems to dislike anything that doesn't resemble the complexity and elegance of that famous Napoleon/Russia image. That's not to say that his points are invalid, it's just to say that he seems very critical of visualizations that really aren't all that bad.

James Hamlin - Sep 14, 2008 03:22:50 pm

I just got Tufte's book, so I'm playing catch-up. The book itself is gorgeous, from the dust jacket to the page layout to the way examples are inserted in the middle of compound sentences in the prose. My only complaint with the first chapter is that it felt directionless, like we were just meandering through a field of good visualizations but without any framework with which to evaluate them. This was intentional, and I suppose it serves as a good introduction, so maybe reading the chapters late (after week three) had me expecting something a bit more meaty.

The chapter on graphical integrity was very good, showing not only crude tricks and oversights, but how facts about the human perceptual system can make certain mappings, even if proportional in some sense, produce dishonest visualizations.

Given the discussion that's taken place, I feel I should visit Tufte's claim that "if the statistics are boring, then you've got the wrong numbers." I find this to be right and wrong in different ways. It's right because clearly if the statistics are worth working with, they'd better be showing us something interesting and non-trivial. If they don't show us anything, then there's no reason to work with them. Taken perhaps differently, Tufte might be saying that if the numbers bore you, then they're the wrong numbers - as has been mentioned, this might be related to a learned aversion to maths. Kids might just be working with the wrong numbers. If this is the right interpretation, I would say Tufte is wrong. The focus on 'information-preserving' mappings to the visual space seems to hide some real creative, interpretive work that goes on in constructing the mappings to produce the visualizations. A visualization doesn't just present the numbers in an interesting way, it gives them form. It gives them meaning. To say that all of the information is already available in the raw data is like saying that Michelangelo's David is already in the solid block of marble. It's true in a sense, but it takes a lot of work to sculpt your own David.


