Introduction to Visualization Software

From CS294-10 Visualization Fa08

Lecture on Sep 15, 2008

Slides

Readings

  • Polaris (extended paper), Stolte, Tang and Hanrahan (pdf)
  • Dynamic queries, starfield displays, and the path to Spotfire. Shneiderman (html)
  • Chapter 8: Data Density and Small Multiples, In The Visual Display of Quantitative Information. Tufte.
  • Chapter 2: Macro/Micro Readings, In Envisioning Information. Tufte.
  • Chapter 4: Small Multiples, In Envisioning Information. Tufte.

Simon Tan - Sep 08, 2008 12:59:21 am

Macro/Micro Readings in Tufte's Envisioning Information reminded me of zoomable UIs. Like a good high-data-density visualization, a zoomable UI provides an insightful experience at both macro and micro levels. The difference, it seems, is that zoomable UIs require the user to operate some control to zoom manually, while a single data visualization provides both macro and micro readings simultaneously, so as to let the viewer's eyes do the zooming of their own accord.

What would Tufte say about interfaces such as Live Maps, Silverlight's Deep Zoom (powered by Seadragon), and other zoomable UIs? The density of data is certainly there, and they also provide for more personal micro-readings at high zoom levels. The smaller details of the data (especially with Maps) fade away at far zoom levels; hence the system is controlling the layers of contextual reading instead of the viewers' eyes.

Ketrina Yim - Sep 15, 2008 04:30:01 pm

I can't help but notice that Spotfire and Tableau are both limited to producing 2D visualizations (though you can put multiple variables in each axis). Is there any software out there for generating 3D or even 4D (when time is involved) visualizations? These could be useful for visualizing data that is not easily represented by bar graphs and plotted points or lines in flat 2D, such as the position, altitude, and shape of clouds over a state during a 24-hour period.

Also, it seems for certain datasets, animation could address the issue of change blindness in small multiples. Like the blank frame that somehow manages to prevent people from seeing the difference between two very similar images, having to move the eye over to the next image in the set of small multiples creates a subconscious blank. With images flashed immediately one after the other, animation can eliminate the blank and emphasize the changes in the data of each multiple. However, it seems highly unlikely that anyone will be handing out visualization flipbooks during a presentation.

Dmason - Sep 17, 2008 11:24:33 am

If you examine the original article http://www.psych.ubc.ca/~rensink/publications/download/PsychSci97-RR.pdf you'll find some very interesting effects on change blindness. For instance, location changes were by far the least obvious to viewers, which I found surprising. I suspected during the lecture that this blindness had something to do with the change being in the central focus of the images or in the background, but the paper indicates that while it is a factor, it only accounts for about 50% of the blindness. Also, verbal cues seem to have a tremendous effect in shortening the time to identify the change. (There are some excellent graphs -- check it out!).

It is worth noting that the key ingredient is the flash and not how long the image lingers. In fact, the latter has no effect on the blindness, but as we would suspect, increasing the flash does.

Razvan Carbunescu - Sep 17, 2008 12:54:25 pm

With respect to possible data sets for analysis, especially for assignment 2, I think one of the biggest available datasets is from the Census Bureau. I found this data while looking for my own data for assignment 1, and it provides a lot of options down to the county level.

Census Bureau Data

Matt Gedigian - Sep 17, 2008 01:20:34 pm

There was a nice post about using small multiples on signal vs noise here.

Sarah Van Wart - Sep 17, 2008 02:02:46 pm

To remark on Tufte’s discussion of small multiples, I've seen many good examples of small multiples that enhance a viewer’s ability to make sense of geographic data. By grouping layers of geographic data into distinct categories – wetlands, habitat, population, archeological, streets, etc. – the cartographer can give a single area many different profiles, relate spatial data by proximity, and easily reveal relationships that might otherwise remain hidden. I agree with Ketrina that a very helpful substitute for small multiples would be some sort of animation or interaction, where a series of snapshots could be presented either in sequence or on demand. Most interactive mapping software already has this feature and enables users to interact with spatial data "layers," turning them on and off on demand. It seems that if data needs to be overlaid in order to be compared, an animation or interactive component would be a useful analytical tool (though of course, there are many other situations in which animations / interactions are useful too).


Also, I have a question for the class: has anyone had any experience asking individual institutions for data sets, and if so, how did you go about it, and do you have any tips? It seems like people can often be quite protective of their datasets, but I feel like finding a more local or less accessible dataset would be kind of interesting. Please share any anecdotes.

Seth Horrigan - Sep 19, 2008 01:10:13 pm

What Tufte was trying to communicate with his small multiples chapter was not entirely clear to me until I read the first chapter of Envisioning Information. There, he introduces small multiples with a Japanese Color Coordination chart using a child's A-shirt over a child's T-shirt. This visualization clearly allows you to compare the differences that color makes irrespective of the garment, since all of the illustrations are rendered the same. With that introduction, the emphasis that consistency of design and co-location provide becomes readily apparent, and the objectives of small multiples become clearer. I found certain parts of that chapter especially interesting, such as the Mural with Blue Brushstrokes and the illustration of the problem of placing the map of distribution of temples of Matsu. As with any other aspect of visualization, the other rules still apply though. Placing the multiples in a distracting grid reduces their effectiveness substantially, just as placing a short blank screen in between the images we saw in class makes identifying the changes much more difficult.

Nicholas Kong - Sep 19, 2008 11:09:31 pm

I had not encountered stem-and-leaf plots before the "Micro/Macro Readings" chapter in Envisioning Information, but they really struck me in how they effortlessly overloaded variable-to-position encodings. The Japanese train schedules on pgs. 46-47 I found particularly striking, for one thing because they were able to show both running times and frequency simultaneously. If my very limited Japanese is correct, the little characters further encode properties of the trains; from what I can deduce, the characters denote directions or lines while the shapes denote the type of train. I think the design decision to specially mark out the limited express trains in circles was wise, in that a pre-attentive visual encoding allows for quick identification of the fastest trains. Given the year (1985), color may not have been an option, but I think color would encode the line direction much better than the small characters, as is done in most transit systems today.

There's also a connection between Tufte's statement about small multiples that "Comparisons must be enforced within the scope of the eyespan" (emphasis his, Pg. 76) and change blindness. Needing to flip the page to compare charts probably invokes change blindness just as we saw with the gray screen in between the changing images in class. I wonder if this could be extended to referencing images in text: if one has to flip multiple pages to reach an image, does one lose a little comprehension due to the interrupting process of searching for the image? This is a different concept than change blindness, more about interruption of action, but it would still be interesting to explore just how much comprehension, if any, is lost. Part of the reason Tufte's books are so visually appealing, I find, is his placement of all his images in close proximity to the referencing text. That the text is so approachable is partially due to this.

I also found the "Path to Spotfire" article quite interesting as a history from idea to product. However, it wasn't immediately clear to me what exactly was meant by "alphaslider", so I delved a little deeper and found this paper (obtainable through your favored library access method). The concept of many items to a single pixel seemed like an excellent one, and in fact after skimming the paper it turns out that the scrollbars so ubiquitous in today's windowing interfaces are themselves alphasliders: they provide both coarse and fine tuning controls, and were specifically tested as a possible slider alternative.
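
To make the "many items to a single pixel" idea concrete, here is a minimal sketch of how a slider could combine a coarse drag with a fine step. It is not taken from the alphaslider paper; the function names, the 200-pixel track, and the 10,000-item list are all assumptions made up for illustration.

```python
def coarse_index(pixel, track_pixels, item_count):
    """Map a thumb position (0 .. track_pixels - 1) to the first item that pixel covers."""
    return pixel * item_count // track_pixels


def fine_step(index, delta, item_count):
    """Move by a few items at a time, clamped to the valid index range."""
    return max(0, min(item_count - 1, index + delta))


# Hypothetical sizes: far more items than pixels, so each pixel covers 50 items.
ITEM_COUNT = 10_000
TRACK_PIXELS = 200

index = coarse_index(100, TRACK_PIXELS, ITEM_COUNT)  # coarse drag to the middle
index = fine_step(index, 3, ITEM_COUNT)              # fine adjustment by three items
print(index)
```

The coarse drag alone can never land on most items, which is why some finer control is needed once the list outgrows the track.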

Maxwell Pretzlav - Sep 20, 2008 01:10:40 am

What I found most fascinating about this latest set of Tufte readings was not actually what Tufte said but how he executed it. I realized that Tufte's writing and design style changed significantly between The Visual Display of Quantitative Information and Envisioning Information, namely it seemed to me that Envisioning Information was written with the page layout directly in mind, while The Visual Display... was written more like a traditional book, with the layout coming after writing. It seems Tufte has been following his own advice, as his writing in Envisioning Information appears to have been carefully edited and crafted to fit perfectly with the images and subsequent layout (see pgs. 73 and 61 for good examples)—this precision is quite impressive when contrasted with the layouts of many of the scholarly papers we have been reading which reference figures sometimes as much as four pages later in the document.

Ljuba - Sep 20, 2008 05:02:33 pm

I'm a little surprised by the lack of Mac software for data visualization. I would think that the preference of Macs among graphics types would lead to some pretty great software packages for data visualization.

The one piece of software that I do know is called DataTank [1]. The application is extremely robust and can handle all kinds of data (there are about 1700 data modules for it). Unfortunately, I don't think it's being worked on as actively as most commercial software since it is the pet project of a professor at the University of North Carolina at Chapel Hill. He's a really nice guy and there's probably a serious student discount for those interested. I think the software is worth a look.

Michael So - Sep 20, 2008 09:54:08 pm

The Micro/Macro designs in the Tufte reading seem to relate to the details-on-demand feature in visualizations. They give you a broad overview that depicts the whole dataset, which I guess is to give you an idea or estimate of how large the dataset is, and a "details-on-demand" feature where the user can choose a specific group of data or an individual data element to analyze or study. The example of micro/macro design in the Tufte reading that I liked the most would be the Vietnam Veterans Memorial. Looking at it from a distance, you get an overview of the whole dataset; as you get closer, you start getting your "details-on-demand". And I think the memorial is also a good example of how the micro information, which in this case is the individual names on the memorial, can be personalized. Each user has their own individual questions and tasks, so giving a "details-on-demand" feature (or micro-information) I believe will satisfy the user's goal(s).

Another good example of micro/macro design that I really liked was the Japanese train schedules, which used a stem-and-leaf plot structure. I never really considered stem-and-leaf plots as a micro/macro design until this reading. The stem-and-leaf plot gives a visual measure and distribution of the dataset as well as the micro details such as individual departure times for the trains. I find those train schedules to be really effective at the task of finding departure times, and it's really nice how they save and make efficient use of ink.
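
As a rough illustration of the stem-and-leaf structure described above, the sketch below groups departure times by hour (the stem) and lists the minutes (the leaves). The departure times are invented for the example and are not taken from the timetable in the book; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict


def stem_and_leaf(times):
    """Group (hour, minute) departures into one row per hour: 'HH | MM MM ...'."""
    rows = defaultdict(list)
    for hour, minute in times:
        rows[hour].append(minute)
    lines = []
    for hour in sorted(rows):
        leaves = " ".join(f"{minute:02d}" for minute in sorted(rows[hour]))
        lines.append(f"{hour:02d} | {leaves}")
    return lines


# Invented departures, for illustration only.
departures = [(6, 5), (6, 35), (7, 2), (7, 14), (7, 26),
              (7, 38), (7, 50), (8, 10), (8, 40)]

timetable = stem_and_leaf(departures)
print("\n".join(timetable))
```

The length of each row gives the macro reading (how frequent trains are in that hour), while the individual minutes give the micro reading.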

Chris - Sep 21, 2008 12:24:59 am

I found the example of the Dighton Writing Rock (in Micro/Macro) particularly interesting. I think the two images of the rock -- the "whole picture" one on page 72 and the "ghost" one on page 73 -- complement each other very well, in that they are useful to iterate between when trying to extract more understanding about the source (particularly if one did not have access to the source). For example, in trying to decipher a given symbol (say, the "ghost") from a number of representations of it, the "whole image" corresponding to each different "ghost" image can help one distinguish which elements of that image are stylistic (pervasive differences between the representations) and which elements are intrinsic to the "ghost." In the other direction, differences between two different "ghosts" can help to amplify the stylistic differences one sees in the "whole picture" image.

What was with the first paragraph of page 50 in Envisioning Information? Were the million-plus-one synonyms supposed to be an example of verbal chartjunk?

James Hamlin - Sep 20, 2008 04:49:07 am

It's interesting that on page 51 of Envisioning Information, Tufte explains that the equation "simpleness of data and design = clarity of reading" is false. "Simpleness is ... not an information display strategy, not a guide to clarity." The data-ink theory, or at least Tufte's own interpretation of it in The Visual Display of Quantitative Information, certainly seemed to make simplicity (minimality) of the visual mapping its apotheosis. This chapter served as a good clarification of Tufte's position.

Scott Murray - Sep 21, 2008 02:02:08 pm

There's no question that Tableau and Spotfire are light years beyond, say, Excel when it comes to exploring different visualizations of a given data set, but they both come with limitations. As discussed in class, the visual forms output by each application are limited to what the developers (people with no familiarity with your particular data set and end goal) have conceived and included in the program. Bar charts, scatterplots, and basic geographic mapping are helpful, but do not begin to approach the Tuftean ideal of designing and refining a unique visual form optimized for every individual data set. (It may not be possible to encapsulate Tufte's—or any other designer's—skill and experience into a design pattern algorithm.)

The visual forms are limited, but so are the types of data accepted by these applications. They do well with a narrow definition of data: numeric values, categories and colors. But what about a broader definition of data that includes things like geospatial elements [2] or even 3D objects [3]? It seems these tools are limited to creating traditional data graphics with traditional data sets.

Matt Gedigian - Sep 22, 2008 02:16:22 am

We've learned about various types of visualization tools. Some of them use information about your data model and do mappings to suitable visual encodings. Some let you programmatically describe the mappings into any sort of visual representation. Some allow you to interactively adjust the visualization. Since I haven't tried them all, I can't say which features and limitations actually matter most to me in practice.

But the limitations that @Scott mentioned above also concerned me. It would be nice if tools like Polaris and Tableau could generate code that could be altered externally. That way you could do the type of data manipulations that Polaris handles, or the drag-and-drop design that Tableau allows, but instead of exporting a graphic, you'd click a button to generate some Flare source code. Just because you are going to need the full flexibility of Flare eventually doesn't mean you shouldn't be able to start with the interactive GUIs that these other programs offer. To use a bad example from last decade: I want to be able to lay out my website with Microsoft Frontpage 98, even if I later decide to tweak the frameset dimensions in Notepad.

David Poll - Sep 22, 2008 04:26:10 am

To me, the biggest advantage of software such as Spotfire and Tableau is that one can use it to build a visualization, rather than being force-fed one. I loved the "path to Spotfire" paper, because it inadvertently describes the logical leap from "interactive visualization" to "on-the-fly visualization construction". Spotfire simply takes the notions of sliders and buttons to filter and illuminate data to its logical conclusion: generation of visualizations by the user as appropriate.

I'm a political junkie, and I love reading polls, looking at statistics/trends relating the economy to spending/tax cuts, and so on. I'm just getting to use them, but these visualization tools are changing the way I think about how data in general should be presented. It's getting to the point where I'd much rather have a Tableau-like widget on a webpage with a default visualization that I can then go customize than be stuck with whichever static (or even slightly interactive) visualization that article's author chose for me. In the past, I've gone searching for Excel spreadsheets with historical data just so I could make my own graphs and analyze the data myself. I still want my simple, clean, clear visualization, but give me the option of drilling down into it however I want! Without that, it's far too easy to make the statistics lie (which is part of why reading political/economic articles can be so frustrating -- you only ever get a tiny little piece of the picture).

Witton Chou - Sep 22, 2008 12:24:57 pm

Tufte brings up some good examples and reasoning for why to condense visualizations. However, I think the use of small multiples is more situational, sometimes unclear, and can contradict the data density concept. My reaction to the chromosome visualization was one of confusion. Had an explanation not been given as to what data is being represented and compared, I probably would never have understood what it aimed to detail. Even now I don't understand what I should be focusing on when comparing the various chromosome strands. And the second chromosome comparison is very confusing.

I never really appreciated stem and leaf plots before when I learned about them in Jr High/High School math. To see its application in a real time table for a Japan subway system is refreshing. I love how I can immediately see the frequency of trains within an hour and easily reference when they arrive/depart. The design process that goes in to yield different micro/macro readings is phenomenal and very fun to observe in action.

Jeff Bowman - Sep 22, 2008 02:00:03 pm

One of the most interesting things about the concept of dynamic queries is how avidly we use them today. One of my favorite mashup sites, housingmaps.com, uses the ability to dynamically filter and search. From autocomplete to iTunes search-as-you-type, the concepts of instant gratification and filtering are quite actively in play on a regular basis right now.
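
For anyone who has not built one, the core of a dynamic query is just re-running a cheap filter every time a control changes. The sketch below is an assumption-laden toy, not how housingmaps.com or Spotfire actually work: the `Listing` type, the sample listings, and the `dynamic_query` function are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    rent: int
    bedrooms: int


def dynamic_query(listings, max_rent, min_bedrooms, name_prefix=""):
    """Return the listings that pass every current control setting."""
    prefix = name_prefix.lower()
    return [
        item for item in listings
        if item.rent <= max_rent
        and item.bedrooms >= min_bedrooms
        and item.name.lower().startswith(prefix)
    ]


# Invented sample data, for illustration only.
LISTINGS = [
    Listing("Elm Street flat", 1200, 1),
    Listing("Oak Avenue house", 2400, 3),
    Listing("Elmwood studio", 900, 0),
    Listing("Pine Court duplex", 1800, 2),
]

# Simulate dragging a rent slider downward: the result set updates at each step.
for max_rent in (2500, 2000, 1000):
    hits = dynamic_query(LISTINGS, max_rent, min_bedrooms=0)
    print(max_rent, [item.name for item in hits])
```

Search-as-you-type is the same loop with the `name_prefix` argument growing by one character per keystroke.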

Yuta Morimoto

After reading "Macro/Micro Readings" in Tufte's Envisioning Information, I thought of the zoom in/out function in cameras and maps. I think they are instances of using zooming to see something in more detail. In Google Maps, we search for a location we want to go to or learn about, and in some cases we can interactively change the visualization using a slider. So nowadays most interactive maps already have the feature Tufte mentioned.

I had hardly thought about Japanese train schedules. Is this because I am Japanese? Come to think of it, they include preattentive visual features in many ways, allowing us to quickly identify the trains. However, even though train schedules are easy to read in Japan, you may still board the wrong train when transferring, since the Japanese subway map is so complex that even local people cannot always get to the train they want to ride.

Tokyo Subway Route Map http://www.tokyometro.jp/rosen/rosenzu/pdf/rosen_eng.pdf

NickDoty - Sep 24, 2008 01:32:14 am

I'm suspicious as some others seem to be of the use of small multiples.

For example (although we didn't talk about it until the Interaction class meeting), I was unimpressed by the small multiples used for Fisher's wheat growing diagrams. I wasn't able to draw comparisons very quickly (which appeared to be the cause of the problem that had gone unseen for such a long time). And when the two datasets were plotted on the same axes the effect was immediately clear. I fear that the popularity of small multiples may lead designers away from plotting two sets of data (differentiated by color or shape or what-have-you) on the same grid. (In Tufte's defense, I don't think any of his examples have made this particular mistake, but I would have appreciated an explicit warning.)

Calvin Ardi - Oct 07, 2008 04:30:53 am

Chapter 8 in The Visual Display of Quantitative Information presents some reinforcement of some of the earlier chapters and the idea of small multiples. I was drawn to the extremely detailed graphic of France (page 166) and I would have liked to see a larger map of the galaxies (or at least a reference to the original graphic). Like some of the comments above, a zoomable UI is a perfect fit for these sorts of high data density graphics. Although Tufte states that higher data density is always better, at some points there just might be too much data in the graphic. The bar graph on p163 certainly illustrates the other extreme of not enough data density but some of the examples that are "in-between" could probably be found in some of the earlier chapters.

Tufte's other chapters in Envisioning Information present more examples and techniques for graphics or data with a large amount of related data. I thought a few examples stood out; notably the Japanese train timetable (p46) as opposed to the one on p47. The data presented is two-fold: the exact time of when a train is scheduled to arrive, as well as the frequency of trains per hour. (Unfortunately, this sort of thing wouldn't work with AC Transit in Berkeley as the buses are generally late most of the time). As said above, the level of detail and complexity would do well with digital graphics, which are now capable of displaying even more bits of information (it would be interesting to have a similar graphic of space debris around the earth visualized in Google Earth or something similar).


