Introduction to Visualization Software
From CS294-10 Visualization Fa07
Lecture on Sep 10, 2007
- Polaris (extended paper), Stolte, Tang and Hanrahan (pdf)
- Dynamic queries, starfield displays, and the path to Spotfire. Shneiderman (html)
- Chapter 8: Data Density and Small Multiples, In The Visual Display of Quantitative Information. Tufte.
- Chapter 2: Macro/Micro Readings, In Envisioning Information. Tufte.
- Chapter 4: Small Multiples, In Envisioning Information. Tufte.
 Hazel Onsrud - Sep 09, 2007 11:26:09 pm
Although I thought Tufte's points were clear and concise (as usual), I took issue with the series of photographs of the calligraphic brush and their "magical quality." I found this image more confusing than enlightening at first glance. As other students have noted, this course makes one overly critical of many visualizations, but I felt this illustration of small multiples would have been better had the rows that share a camera angle been labeled as such. One can figure out the relationship from the color of the ink and the relative angles if one is paying attention, but it is unlike many of the other small multiple illustrations in the chapter, which, through their juxtaposition of static angles and explicit differences, do not require such careful examination. Perhaps simply outlining one photo series in one color and another in a different one would effectively highlight their differences. Or, quite likely, there were only two rows in this visualization to begin with, but they had to fit it into the book somehow and so squished the comparison into six. Ah well, every visualization has constraints... and I am being awfully picky.
 Danielarosner - Sep 10, 2007 08:11:40 am
Since Polaris's visual properties are based on Bertin's retinal variables, I'm now really curious whether there was any scientific reasoning behind Bertin's choices. It seems like each area needs to be broken down into more categories, or more refined choices need to be made, for the categories to be useful. The Polaris authors were careful to choose distinct properties that don't overlap within each property set. Color could vary by hue or value but not both. They also reserved red for highlighting, but that choice seems a bit more arbitrary. Red could be very visible using one color palette but distracting or hard to see when using another.
 Andrew McDiarmid - Sep 10, 2007 10:23:34 am
As smug as Tufte can be, I appreciated the distinction he made in discussing serif v. sans-serif fonts between ornamentation and differentiation. I think it can be easy to take his less is more mantra too far, and it's important to note that a line's purpose graphically might not be immediately apparent. Some 'unnecessary' ink might well be more necessary than assumed at first glance.
I was frustrated by the Shneiderman paper. I felt one of the most important points was the alphaslider, and he did not define it. What is it? Assigning many records to a single pixel on a slider seems worth explaining. Perhaps this paper was intended as a plenary piece addressing those who are familiar with the technology, but as a relative newcomer I got almost nothing substantial out of it.
 Jack Schonbrun - Sep 10, 2007 05:04:43 pm
I just want to encourage you all to ask me any questions you might have about Spotfire. Again, to cover most of the features in DXP normally takes about 3 full days of hands-on training. So we clearly did not cover everything today. I did want to emphasize a few key pointers:
Data Access. File -> Open (.txt, .csv, .xls), or Copy -> Paste. Your data should ideally be formatted with a single row of column names, and consistent types within each column.
Right Clicking. Right clicking gives context sensitive menus everywhere. Many features can be discovered this way.
Properties. This is where you can control all aspects of each visualization. Many of the parameters for making your visualizations effective will be accessed through this menu. You can get it by clicking on the check-mark in the upper right hand corner of each visualization. (Or Right-Click -> Properties...)
Details Visualizations. You get the "details-on-demand" table automatically. But it can be powerful to have other kinds of visualizations on demand, based on the records you have marked. The easiest way to set this up is by right-clicking within a visualization and selecting Create Details Visualization. This new visualization will only show the data you have marked.
Exporting. If you want to share your final visualizations using a tool other than Spotfire, you can export pages or individual visualizations to bitmaps, Windows Meta Format, and PowerPoint slides. To export a single visualization, make sure it is active, by clicking on it. The export menu is accessed from File -> Export.
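As a rough illustration of the "Data Access" pointer above, here is a minimal Python sketch that checks whether a CSV file has consistent types within each column before it is loaded. It is not part of Spotfire; the function names (cell_kind, column_kinds, mixed_columns) and the sample data are hypothetical, and the check only distinguishes numbers from text.

```python
import csv
import io

def cell_kind(value):
    """Classify one cell as 'empty', 'number', or 'text'."""
    text = value.strip()
    if text == "":
        return "empty"
    try:
        float(text)  # accepts integers and decimals alike
        return "number"
    except ValueError:
        return "text"

def column_kinds(csv_text):
    """Map each column name (taken from the single header row) to the kinds of cells found."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kinds = {name: set() for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            # Short rows yield None for missing cells; treat those as empty.
            kinds[name].add(cell_kind(row[name] or ""))
    return kinds

def mixed_columns(csv_text):
    """Return the sorted names of columns that mix numbers and text."""
    report = column_kinds(csv_text)
    return sorted(name for name, found in report.items()
                  if len(found - {"empty"}) > 1)

# Hypothetical sample: the "price" column mixes a number with the text "n/a".
sample = "city,price,beds\nBerkeley,650000,3\nOakland,n/a,2\n"
print(mixed_columns(sample))
```

Columns reported by mixed_columns are the ones worth cleaning up (for example, by replacing "n/a" with an empty cell) before opening the file.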
Again, don't hesitate to contact me with any questions. Email is probably best, because I travel a fair bit. But I am actually based here in Berkeley.
 Robin Held - Sep 11, 2007 01:25:15 pm
First off, I definitely agree with Andrew's complaint regarding the Shneiderman article. He seems to be writing for an audience of colleagues already familiar with all of the cited work. As a result, he doesn't offer adequate descriptions of many of the interfaces he mentions.
I particularly wanted to learn more about the organizations that were resistant to the new interfaces, including the real estate finder. Shneiderman states that the groups not only refused to share information, but were uninterested in learning about the new software. Why? Perhaps the real estate listings are proprietary, in that people who publish or otherwise access them must pay a licensing fee to the group that assembles all the entries. Shneiderman needs to give the reader a better idea of the organizations' reasoning for refusing to cooperate. Since all visualization software requires input data, the issue seems especially relevant to our discussions.
 Amanda Alvarez - Sep 16, 2007 10:01:28 pm
Tufte's Shrinking Principle produces small multiple narratives that possess great data-ink efficiency. But I really don't think the resulting small multiples lend themselves to 'overview, zoom, filter, details-on-demand'; that sort of interactivity requires more non-data-ink than Tufte could handle. For static printed charts, small multiples are great. Density can be misleading though, especially if the data itself do not mirror this density: "The representation of numbers, as physically measured on the surface of the graph itself, should be directly proportional to the numerical quantities represented."
 David Jacobs - Sep 17, 2007 01:02:23 am
Amanda: What do you mean about density being misleading? Surely any uniform scale preserves the relationships of the numbers behind the graphic. I suppose you could make an argument that the visual density doesn't scale linearly. Remember that people aren't that great at comparing areas, and the negative space between data certainly qualifies as an area. If this is the case, does that mean that presenting the same graphic at different scales leads people to perceive its meaning differently? What then should be the "best" scale at which to present it?
 James Andrews - Sep 17, 2007 04:27:19 am
David: Assuming the important quantity is a ratio between areas, it doesn't matter what scale you choose for looking at the images. Recall that the equation is perceived_magnitude = k * actual_magnitude^n. So if we have two actual magnitudes in a ratio of 3:1, say 3*a and a, the perceived ratio is k * (3*a)^n : k * a^n = k * 3^n * a^n : k * a^n = 3^n : 1, which is independent of the actual magnitude a.
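The scale-independence argument above can be checked numerically. The following is a small Python sketch, not anything from the readings: the function names are made up for illustration, and the exponent n = 0.7 and constant k = 1.0 are assumed values chosen only to make the arithmetic concrete.

```python
def perceived(actual, k=1.0, n=0.7):
    """Power law from the comment above: perceived = k * actual^n."""
    return k * actual ** n

def perceived_ratio(base, factor=3.0, k=1.0, n=0.7):
    """Perceived ratio between two magnitudes whose actual ratio is factor:1."""
    return perceived(base * factor, k, n) / perceived(base, k, n)

# The base magnitude plays the role of the viewing scale. The perceived
# ratio should come out the same (3^n) at every scale.
for base in (1.0, 10.0, 250.0):
    print(base, round(perceived_ratio(base), 6))
```

Changing k leaves the ratio untouched because it cancels, while changing n changes the perceived ratio itself but not its independence from scale.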
Your suggestion of evaluating the negative space around the data points is somewhat troubling -- in most scatter plots, I would think that this is simply a bad idea at any scale, since that negative space is likely to be somewhat arbitrarily determined by the size of the data points even on a static chart. The sizes of the data points would need to be chosen such that the ratio of negative space to non-negative space was an accurate reflection of the data.
 Athulan - Sep 17, 2007 09:53:18 am
I really enjoyed Tufte's examples on Small Multiples. I try to use this type of visual whenever I can, especially to show the evolution of some trend or pattern. But one potential problem with them is that the frames need to show an easily perceived difference (or rather, a preattentive difference). Without this, using small multiples may actually lead to less clarity, as the viewer spends more time searching the frames for the changes instead of looking at the changes themselves. Also, if the changes are relatively minor compared to the scale of the frame, the frames become hard to read. One example is the Railroad visual from pg. 68 in Envisioning Information. I had to scan the images intently to notice the color and positional differences of the little circles in the trains (especially for the gray circles). That all the trains were identical did not help; they actually detracted from the differences between them.
 Daisy Wang - Sep 17, 2007 08:30:39 am
Both Spotfire and Polaris are fantastic work. The two systems target different applications and therefore use different visualization approaches. Both aim to support data query, analysis, and visualization at a higher level of abstraction than SQL provides, so that people outside computer science and engineering can also play with data.
Spotfire is the earlier attempt; it provides pre-defined handlers for visualizing and querying the data. All the knobs it exposes could be realized with prepared statements in a database. Polaris, on the other hand, is the later attempt; it allows a lot more freedom and incorporates automatic choice of visualization and on-the-fly query generation. In addition, Polaris targets large multi-dimensional data.
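To make the prepared-statement point concrete, here is a minimal Python sketch using the standard sqlite3 module. It is only an illustration of the idea, not how Spotfire is actually implemented: the homes table, its rows, and the query_homes helper are all hypothetical. Each slider position becomes a parameter bound to one fixed query.

```python
import sqlite3

# Hypothetical in-memory table standing in for a real estate data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (address TEXT, price INTEGER, beds INTEGER)")
conn.executemany(
    "INSERT INTO homes VALUES (?, ?, ?)",
    [("12 Oak St", 450000, 2), ("7 Elm Ave", 620000, 3), ("3 Pine Ct", 880000, 4)],
)

# One fixed query; the ? placeholders play the role of the slider knobs.
RANGE_QUERY = (
    "SELECT address FROM homes "
    "WHERE price BETWEEN ? AND ? AND beds >= ? "
    "ORDER BY price"
)

def query_homes(connection, min_price, max_price, min_beds):
    """Re-run the fixed query with the current knob settings."""
    rows = connection.execute(RANGE_QUERY, (min_price, max_price, min_beds))
    return [row[0] for row in rows]

# Moving the "bedrooms" knob from 2 to 3 only changes a bound parameter.
print(query_homes(conn, 400000, 700000, 2))
print(query_homes(conn, 400000, 700000, 3))
```

The query text never changes as the knobs move, which is what distinguishes this style from generating new queries on the fly.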
I would say Spotfire is more lightweight and suited to simple applications at small and medium-sized businesses. Polaris is more heavyweight (it has a lot more smarts, including automation) and suited to large datasets with complex or real-time analysis tasks.
I am interested to find out the performance of both systems.
 N8agrin - Sep 17, 2007 12:38:40 pm
I appreciate Tufte's point about small multiples, and found myself naturally scanning and comparing each multiple from his examples, contrasting their differences. The one example I found to be a bit confusing was the detail of the train and the train lights. I think that the train visualization would benefit from having the outline of the train deemphasized in contrast to the elements of each train that differ. This would be in line with Tufte's general principle regarding graphs, that the underlying grid need not be displayed, or at least not displayed prominently.
The papers about Spotfire and Polaris were interesting and provide good touchstone resources for remembering various visualization packages, as well as to trace the paths others have taken before us.
 Ariel Rokem - Sep 17, 2007 09:01:37 pm
A comment about Tableau's line graphs: from the demo in class, it seems that Tableau's default for line graphs is a line with no ticks or markers. That is a problematic way to create line graphs, since it doesn't explicitly show which parts of the line are based on measured data and which parts are interpolated.
 Mark Howison - Sep 18, 2007 06:03:40 pm
I also had trouble initially determining what the train diagram was showing. One thing that occurred to me was how it would be perceived differently if it were shown as an animation rather than a small multiple. That is, by making each multiple a frame in an animation. Would this more readily draw out the differences among the multiples since it would require less scanning across different regions of the graphic?
 James O'Shea - Sep 18, 2007 09:26:36 pm
I thought it was interesting that Tufte brought up the Vietnam War Memorial in Washington, DC as a good example of a micro/macro visualization. Throughout the readings and class so far, I have mainly been thinking about visualizations with respect to scientific data, maps, quantitative information, etc. I forget that something like a memorial or a piece of art can also be categorized as a form of visualization. Often, they may be visual representations of abstract or intangible things, but the Vietnam Memorial clearly shows that they can also be inextricably tied to large data sets (the names and dates, in this case). As mentioned, one could simply have a book of names and dates to explore the data, but the memorial presents this data in a way that makes a statement and provides an overall sense of the scope and impact of the war.
 Karen Hsu - Sep 29, 2007 11:17:55 pm
Not to overkill on the ineffectiveness of the train visualization presented on p. 68 in Envisioning Information, but I too didn't feel it to be a strong example of small multiples. To help draw our attention to the signaling lights, I agree with N8agrin that the outline of the train must be deemphasized. Furthermore, as Athulan pointed out, there is a problem with the tiny size of the signaling lights relative to the entire frame. The colors' shades could also be more eye-catching; the yellow lights nearly blend in with the background and require a double take to notice. Lastly, I think there should be some division between visualizations of the forward end and the rear end of the train, as this distinction is not at all clear.
For my bad visualization in assignment 1, I chose an advertisement for beef that poorly (and by poorly, I mean utterly incorrectly) visualizes the bodily benefits of beef consumption. In a redesign, one approach would be to use small multiples to illustrate the body areas benefited by each of the listed beef-abundant nutrients. The body outline per frame would remain the same, but the highlighted areas would change per nutrient.