Discussion of Good and Bad Visualizations

From CS294-10 Visualization Fa08

Revision as of 19:53, 2 January 2009 by Christopher.cameron (Talk | contribs)

Class on Sep 8, 2008

Readings

  • Chapter 3: The Power of Representation, In Things That Make Us Smart. Norman. (pdf)
  • Chapter 4: Data-Ink and Graphical Redesign, In The Visual Display of Quantitative Information. Tufte.
  • Chapter 5: Chartjunk, In The Visual Display of Quantitative Information. Tufte.
  • Chapter 6: Data-Ink Maximization and Graphical Design, In The Visual Display of Quantitative Information.

Optional Readings

  • The representation of numbers. Zhang and Norman. (pdf)

Seth Horrigan - Sep 05, 2008 06:10:09 pm

Tufte admits that his changes in data-ink maximization may not always be "better" although they do fit the theory better. In many of his examples, I would say his changes make the graphs much worse. He seems quite pleased with his "improved" Tukey box plots. Although they do devote a larger share of their ink to data, and would thus be easier to construct by hand, they are nowhere near as readable. This is not merely a reaction against unconventional ideas; rather, picking out the limits of the quartiles and identifying the median goes from trivially simple to a difficult chore - especially for those with vision loss. Box plots are not elegant, but they are clean, clear, and effective. Tufte's versions are the exact opposite: elegant, but confused, unclear, and ineffective. They emphasize form over function, and I would choose Tukey's original functional model any day.
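For readers comparing the two designs, it may help to recall what both plots have to encode. The sketch below computes a five-number summary (minimum, lower quartile, median, upper quartile, maximum) with Python's standard library. The function name and the sample data are made up for illustration, and the whiskers are assumed to run to the extremes; Tukey's hinges and other quartile conventions can give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a non-empty list of numbers.

    Hypothetical helper for illustration; quartiles use the 'inclusive'
    method of statistics.quantiles, which is one convention among several.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])

# Made-up sample data, not taken from Tufte's book.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(sample))
```

Whichever design is used, a reader should be able to recover these five values at a glance; the disagreement in this thread is about how easily each design lets them do that.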

I do strongly endorse some of Tufte's suggestions though. Namely, adding a white grid visible across the bar graph is immensely useful and makes exact comparisons much easier. Likewise, in his discussions of "chartjunk" I found many useful gems - the heavy dark grid may be useful for plotting by hand, but the readability of the charts (especially those like Marey's train schedule) improves dramatically when the heavy black grid is replaced with an understated gray grid. In all the cases Tufte presents, the removal of the chartjunk is a great improvement. It is sad that programs like Excel, PowerPoint, and OpenOffice Impress perpetuate these terrible visual precedents by making it easy, and sometimes the default, to use techniques that lead to moiré vibrations.

Simon Tan - Sep 07, 2008 11:55:26 pm

At first glance at Tufte's applications of his data-ink principles, I was hesitant to accept the redesigned charts as 'better'. Like Seth, I felt that they made the graph into something more abstract than it needed to be - and in doing so, obscured the data from being easily read. For example, his insistence on eliminating wasteful bilateral symmetry results in the oddest alteration of the comparison-bar chart I could imagine (page 101). If we follow his example, are all comparison-bar charts destined to become half-bar charts? Was there no other way to increase the data-ink ratio that wouldn't result in such a foreign-looking chart? In the later redesign of the bar chart (page 126), I was more satisfied with the representation of the bar, but was uneasy with the "white grid" that seemed to turn the bar chart into a stacked-bar chart.

On the other hand, reading through Tufte's book so far has shown me that the art of information visualization has lots of room for improvement, and perhaps what we are seeing is the gradual evolution of the bar chart that is driven by Tufte himself, a known visionary. However, the fact that I haven't seen any bar charts like this outside of his book points towards the contrary. (He mentions this in his Conclusion of chapter 6, but it has been a while since this was published, right?)

One might imagine Tufte to be something of a radical visionary, who drives the field in a certain direction while only hoping at least some of his concepts take hold. I understand and do believe in his principles of data-ink, but perhaps I am not yet ready to see it applied so drastically.

Karl Rohe - Sep 08, 2008 02:23:50 pm

It has been said here that the improvements may not always be better. Maybe. However, the example on page 101 is outstanding. The upper graph is absolutely beautiful. Another example that I really enjoyed is on p116. It is the improvement from the train schedule graphic on the front cover. One of the more shocking things about the display of this graph is that you can put it on a cylinder for easier reading on the edges of the chart.

Has anyone seen this talk at TED?

The speaker, Johnny Lee, takes a Wii Remote and uses its infrared camera in crazy ways. The last of his examples allows users to explore 3D space on a 2D monitor (think virtual reality). Has anyone heard of using this for data visualization? I am not so great at coding. But, if anyone wants to explore this idea for one of the projects, I would be very excited to help.

NickDoty - Sep 08, 2008 02:33:50 pm

Though I like the general data-ink principle, I think Tufte takes it too far or too literally. I really liked his white grids since they made it much faster for me to compare the bars with the grid lines. But I don't think they worked because they were technically erasing a little ink from each bar. If anything, they added to the complexity of the chart and increased the busy quality of the graphic; it's just that the trade-off was worthwhile.
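For reference, the data-ink ratio that this thread keeps returning to can be written out as follows. This is a paraphrase of the definition in Tufte's chapter 4 rather than a quotation, and the labels are spelled out in words because Tufte does not give the quantities symbols.

```latex
\[
\text{data-ink ratio}
  = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
  = 1 - \bigl(\text{proportion of the graphic's ink that can be erased without loss of data-information}\bigr)
\]
```

Read this way, a white grid is an odd case: it removes ink from the bars, so the ratio arguably goes up, yet it adds visual structure, which is the tension described above.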

Ketrina Yim - Sep 08, 2008 08:02:18 pm

I agree with Nick on this one. Though I do partially concur with Tufte regarding the minimization, if not elimination, of distracting chartjunk, it doesn't seem entirely necessary to go to the lengths that Tufte had. Efficiency can sometimes lead to a loss of comprehension, as is often seen in acronyms, since only those who have learned to use the shorthand can understand it. Reducing the box plot to two lines with a slightly raised line in between may be efficient use of ink, but it is somewhat more difficult to pick out relevant values, particularly if the viewer had no prior experience with Tufte's design.

Also, the data-ink efficient, duck-free approach is not for everyone. In the world of magazines, newspapers, and advertisements, aesthetics are a necessary component in visualizations. They must grab the attention of audiences before they can present their information. I find it difficult to imagine the half-bar chart being used in media applications, at least until the general public enters a phase where this kind of ink minimalism is considered attractive. I know I am still more drawn to colorful diagrams and charts with a bit of duck in them than to line graphs. The ducks may not help convey the message, but they are eye-catching.

Matt Gedigian - Sep 10, 2008 02:24:52 am

@Karl Head-tracking for 3D visualization here.

Matt Gedigian - Sep 10, 2008 01:28:06 am

Tufte's pruning of graphics down to their essential elements reminded me of one of Luis von Ahn's "Games With a Purpose" called Peekaboom. This is a two-player game with two distinct roles. One player can see an image and an associated label; the other player sees a blank screen. The first player selectively reveals pieces of the image to their partner, who tries to guess the label. I would link to it, but I'm not sure it's around anymore (google search). Although the game is played for amusement, the byproduct is a collection of images with specific labelled content highlighted. Time pressure forces players to highlight the most informative areas. If the images were charts, I imagine players would reveal only a single line of a bar chart, omit bounding rectangles, etc.

With regard to comments about the extremes of Tufte's reductions... Visualizations are attempting to meet many goals simultaneously, so there are necessarily going to be trade-offs. I think what Tufte is trying to do is isolate and clarify one of those goals. There are times when an efficient, minimal chart will be unsuitable and I don't think Tufte would object to violations of 'data-ink maximization' in those cases. But he's trying to establish clear principles that can be factored into the design process.

Nicholas Kong - Sep 10, 2008 01:08:55 pm

I agree with Matt's statement that Tufte is trying to outline a general set of principles upon which one can develop effective statistical graphs, but he makes sure to emphasize that all his principles should be applied "within reason". One example that particularly struck me, as it did others, of non-data ink aiding the graph was the aforementioned train schedule by Marey with the gray grid.

It appears, though, that "within reason" has a wide range of interpretations. I agree with Seth that Tufte's final quartile plot borders on unreadable, the result of over-application of a principle. However, the quartile plots shown on page 125 are, I feel, an improvement over Tukey's.

Tufte also suggests eliminating grid lines for the less complicated graphics, such as the one on pg. 105. While I do believe the aesthetics of the chart have improved substantially, I'd be interested to know whether retaining a grid would aid in making more accurate quantitative comparisons. It would also be interesting to investigate a standard black grid versus gray and white grids.

Scott Murray - Sep 11, 2008 06:40:23 pm

Tufte has many valuable points, but I actually found Norman's re-drawings more interesting, particularly the redesigns of the flight schedules for different audiences. Norman considers our perceptual abilities (and challenges), while Tufte, more often, invents mathematical "laws" that support what he intuitively believes to be true. I am a big Tufte fan, but find his rhetoric overly grandiose. Norman advocates understanding and accommodating human abilities rather than aspiring to designs reduced and reconsidered with Tuftean precision.

Michael So - Sep 13, 2008 12:33:11 am

In the Norman reading, I really liked the example about medical prescriptions. Ruth Day's suggested representation of the medical prescriptions is far more user-friendly for people who are taking medication. Her chart is set up in a way that makes the frequency and the time to take each medication very clear. I think doctors, pharmacists, and anyone else who prescribes medication should adopt this chart; the list of medications that's usually given, as Norman points out, is inappropriate for the patient's task. It works for the pharmacist who is giving the medication because their task is to provide the medication. The frequency and time of day to take the medicine are not really of much concern to the pharmacist since they're not assigned the task of taking the medicine. It just goes to show that the representation of a task is a major factor in how efficiently and correctly that task is performed.

About the Tufte reading, I agree with the criticisms of the revised box plots. They are harder to understand. If I compare the old box plots to Tufte's revised ones, I prefer the old box plots because I find them more readable and comprehensible.

I do agree with Tufte about looking over one's graphic/visualization and finding which parts of the graphic are data-ink and which parts are not. I think it's good to analyze one's graphic because, just like editing an essay, you may find things that you want to change, whether by removing unnecessary things or replacing things for the sake of improvement.

Yuta Morimoto - Sep 14, 2008 11:40:06 am

I am sure that an appropriate visualization must make a problem easier and more tractable. However, I had not realized that a visualization that is good for a human is not always good for a computer. By mentioning a famous game, tic-tac-toe, Norman shows that people and computers perceive the same problem differently. It shows that the choice of representation changes the task and its difficulty, and that the choice of representation depends on the knowledge, system, and method applied to the problem. This fact is very interesting, because whether a visualization is good or not depends on the people who are going to see it. Thus, we cannot determine the best visualization in advance without studying the people who will use it.

Chris - Sep 14, 2008 08:52:12 pm

Regarding Norman's example of the isomorphism between the 15 game and tic-tac-toe, I found myself bridling at his description of tic-tac-toe as being the superior representation. While that particular example isn't objectionable, it can be dangerous to prefer experiential algorithms to reflective algorithms, because it can rob you of a deep understanding of the task being performed.
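The isomorphism itself is easy to check mechanically. In the 15 game, players alternately claim numbers from 1 to 9, and the first to hold three that sum to 15 wins. The sketch below, with function names chosen here for illustration, verifies that the winning triples are exactly the rows, columns, and diagonals of a 3x3 magic square, so every game of 15 is a game of tic-tac-toe on that square.

```python
from itertools import combinations

# A 3x3 magic square: every row, column, and diagonal sums to 15.
MAGIC_SQUARE = [
    [2, 7, 6],
    [9, 5, 1],
    [4, 3, 8],
]

def tic_tac_toe_lines(square):
    """Return the eight winning lines of the board as sets of cell values."""
    rows = [frozenset(row) for row in square]
    cols = [frozenset(col) for col in zip(*square)]
    diagonals = [
        frozenset(square[i][i] for i in range(3)),
        frozenset(square[i][2 - i] for i in range(3)),
    ]
    return set(rows + cols + diagonals)

def fifteen_game_wins():
    """Return every set of three distinct numbers from 1..9 that sums to 15."""
    return {
        frozenset(triple)
        for triple in combinations(range(1, 10), 3)
        if sum(triple) == 15
    }

# The two games have the same winning combinations.
print(tic_tac_toe_lines(MAGIC_SQUARE) == fifteen_game_wins())
```

The check confirms the equivalence but says nothing about which representation is better to learn from, which is the question raised here.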

An example of this from calculus is doing integration by parts by the method of "tabular integration." When I first learned integration by parts, I only learned about tabular integration, because it was the fastest way to the correct answer. It wasn't until some time later that I bothered to learn how to derive integration by parts from the product rule. From an educational perspective, this was putting the cart before the horse. Yes, I was able to do the problems given to me on an exam, but I lacked an understanding of why what I did produced the correct answer. I had learned how to be a (slow, error-prone) computer, rather than a mathematician.
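For anyone who has not seen it, the derivation in question is short. The symbols u and v stand for arbitrary differentiable functions of x, and the worked integral is just one illustrative choice.

```latex
\begin{align*}
\frac{d}{dx}\bigl(u\,v\bigr) &= u'\,v + u\,v'
  && \text{(product rule)} \\
u\,v &= \int u'\,v \,dx + \int u\,v' \,dx
  && \text{(integrate both sides)} \\
\int u\,v' \,dx &= u\,v - \int u'\,v \,dx
  && \text{(integration by parts)}
\end{align*}

% Tabular integration repeats this step with alternating signs, for example:
\[
\int x^{2} e^{x} \,dx = x^{2} e^{x} - 2x\,e^{x} + 2e^{x} + C
\]
```

The table produces the right-hand side of the last line mechanically; the three lines above it are what explain why the signs alternate.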

When it comes to teaching mathematics, there is a very strong argument for preferring reflective algorithms to experiential algorithms. This is something of a strawman, but it shows at least one area in which Norman's simplistic "reflective bad, experiential good" maxim can lead us astray. In fact, it can be something that makes us dumb (which is concerning, considering the title of the book is "Things That Make Us Smart").

Witton Chou - Sep 15, 2008 12:07:43 am

Although I agree with Tufte about reducing the data-ink and thus reducing the clutter on a visualization within reason, I'm not exactly clear as to what defines "within reason." I am assuming that this means something along the lines of making the visualization "easier to extract the important data but not too much that it takes away from the original purpose and understanding as used by its target audience." Oftentimes redundancy is important to being able to quickly extract information, and sometimes redundancy is a result of using traditional modes of visualization that people are familiar with, which is why we must still consider the targeted audience in determining what aspects of a particular visualization are extraneous.

Calvin Ardi - Sep 15, 2008 04:14:30 am

@Seth I agree with you there; Tufte seems to be taking what is already a simple tool and eliminating parts which he deems unnecessary. Although the idea is honorable, the last example of offsetting the middle half seems downright ridiculous: although I notice the offset and the minor erasure in the middle, I'm more inclined to think that it's a printing error than an actual offset.

@Karl That was an interesting talk that you posted; I watched a few key parts and it's a rather neat concept when you design a visualization with a user or user interaction in mind.

I find that many of Tufte's examples and comments are aimed in particular at the graphics generated by computers. Unfortunately, it seems that graphics (e.g., charts, graphs) are an afterthought in a presentation and aren't planned as well as they could be. Despite the advent of computers and the ease of creating graphics using built-in tools, the styles and techniques Tufte mentions aren't available or used (as far as I know) in widely-used software like Microsoft Excel, Mathematica, or even gnuplot (without extensive scripting or configuration).

Regarding the example on page 102 and the data-ink ratio about the periodicity of properties of chemical elements, I agree that the grid marks take away from the important parts of the graph, but I think a better solution would be to lighten the marks. If I wanted to do a quick reference of points, the plus signs make it easier to determine the range of values I'm at.

To me, it seems like Tufte is trying to convince (or rather, tell) the reader that the techniques presented can be applied to every visualization today. Even the box plot, although extreme, can be reduced and simplified even further. According to Tufte, we may not like his revised plots due to their lack of use and exposure. This certainly seems to be the case, but I doubt that the box plot will be replaced by what looks to be a printing typo. (If anything, his box plot revision depends highly on the printer and any reproduction being almost perfect. Photocopies of photocopies would eventually degrade the visualization to the point where everything would seem unclear.)

David Poll - Sep 15, 2008 05:41:28 am

It strikes me that, between the readings and the comments others have left, it's extremely easy to go in circles about the effectiveness of a visualization (or the effectiveness of a modification to a visualization). I get the impression that when Tufte refers to using these principles "within reason," it's highly dependent on the goal of the visualization. Sure, data can be made less obscure by removing clutter and junk, but it's only reasonable to do that if it helps further the goals of a particular visualization. It seems like the measure of "within reason" should be determined by answering the questions of "what do I expect people to do with this visualization?" and "how does changing/removing/cleaning up this element of the visualization impact peoples' ability to use the visualization?"

Maxwell Pretzlav - Sep 15, 2008 12:10:54 pm

The most striking thing to me about this reading was just how much it affected my approach to design. In creating my (very simple) redesign for Assignment 1b, I was constantly thinking of data density and looking for extraneous ink to remove from my design. I can't say I agreed with all of Tufte's points, however. His redesign of a complex data graphic on page 113 removed all scales and associated information, with no clear way shown to recover the removed data. Additionally, I agree with Seth and others that his redesign of the box plot is considerably more difficult to read than the original, mainly because it uses both positive and negative space to show relevant (and related) data. This runs counter to how the graphs we're used to seeing are made.

James Hamlin - Sep 16, 2008 03:11:35 pm

Quick note: on p. 132 of Tufte, there is nice use of the usually daft one-pixel offset box-plot in the frame of the scatterplot, where it adds some non-intrusive bonus information.

@David: I think you've hit the nail on the head. The data-ink part of Tufte's theory pays little attention to the realities of the human perceptual system or the intended use of a visualization beyond the unspecific notion of 'presenting' the data. What Tufte would call 'chartjunk' could easily be non-data-bearing ink that nonetheless bears some of the load of making the data and its interpretation more accessible to a human viewer.


