From CS294-10 Visualization Fa07


Lecture on Sep 19, 2007



Readings

  • Visual information seeking: Tight coupling of dynamic query filters with starfield displays, Ahlberg & Shneiderman. (html)
  • Visual exploration of time-series data, Hochheiser & Shneiderman. (html) (pdf)
  • Postmortem of an example, Bertin (pdf)
  • The visual design and control of the trellis display. Becker, Cleveland and Shyu. (ps)

Optional Readings

  • Table lens, Rao and Card, (acm)
  • Human guided search: Survey and recent results, Klau, Lesch, Marks & Mitzenmacher. (pdf)
  • Design and evaluation of incremental data structures and algorithms for dynamic query interfaces. Tanin, Beigel & Shneiderman (citeseer)


Igor Pesenson - Sep 12, 2007 11:25:33 pm

Ahlberg and Shneiderman VIS paper

It struck me that the key concepts of VIS

  • vis rep of objects and actions
  • rapid, incremental, reversible actions
  • selection by pointing
  • immediate, continuous display of results

are how humans generally interact with tangible stuff - i.e. carving a block of wood. We seem to want the same principles in computer data manipulation; I suppose this is more relevant to my tangible UI class.
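The interaction loop behind these principles can be sketched in code: every control change immediately rebuilds the visible subset, so actions are rapid, incremental, and reversible. This is a minimal sketch, not FilmFinder's actual implementation; the records and attribute names are invented for illustration.

```python
# Minimal dynamic-query sketch: each slider adjustment re-filters the
# records at once, so the displayed result always matches the controls.
# The film records and ranges below are invented examples.

films = [
    {"title": "A", "year": 1975, "length": 120, "rating": 7.1},
    {"title": "B", "year": 1986, "length": 95,  "rating": 6.4},
    {"title": "C", "year": 1994, "length": 142, "rating": 8.8},
]

def dynamic_query(records, ranges):
    """Keep records whose attributes fall inside every slider range."""
    return [r for r in records
            if all(lo <= r[attr] <= hi for attr, (lo, hi) in ranges.items())]

# Dragging the hypothetical 'year' slider to 1980-2000 updates the view:
visible = dynamic_query(films, {"year": (1980, 2000)})
print([f["title"] for f in visible])
```

Reversibility falls out for free: releasing the slider back to the full range re-runs the same query and restores the original display.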

In creating FilmFinder the authors had defined their questions and data. The resulting system is then, in theory, going to be used by other people - the key assumption is that these people have the same questions and a similar data schema in their heads. Nothing revolutionary, but a revelation for me. From Bertin's perspective, defining the questions and data are crucial first steps - so is the system still useful if it is the first thing a user encounters? Perhaps this is where Tufte's point that good visualizations make the viewer ask questions comes in.

Hochheiser and Shneiderman Visual Queries paper

Interesting, but perhaps I missed something key. It's a visual display system where one is presented with a large 'master' plot and can then select boxes around different parts of the data to view zoomed-in plots. It's an interesting prototype, but I can see issues when trying to generalize. For example, just scaling the axes automatically for outliers in plots is a problem. If the data is messy enough, drilling down by boxes wouldn't be that different from a text-input interface (i.e. IDL, R, etc.).

Bertin - postmortem of an example

I like it - a good example to demonstrate the approach principles.

  1. Define the problem
  2. Define the data table
  3. Adopt a processing language
    1. Ensure maximum visual efficacy
    2. Ensure mobility of the image
  4. Processing the Data: Simplifying Without Destroying
    1. Play with patterns - the eye simplifies
  5. Interpreting, deciding, communicating

Of course, a clerk who can do this isn't going to be working for a small hotel for too long. As for the 'simplified' graph the owner planned to present to his employees... it ain't that simple. This context aside (a small-time hotel), I think that is fine - too often it's assumed that any data can be reduced to a readily understood plot. This rather involved plot is very useful and consistent with the principles, but not simple.
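Step 4 above - simplifying without destroying, by letting the eye play with patterns - can be approximated mechanically: reduce each row to above/below its own average (as Bertin's graphic does), then reorder rows so similar seasonal patterns sit together. This is a toy sketch with invented hotel measures, not Bertin's actual data or procedure.

```python
# Toy version of Bertin's simplification: binarize each row of measures
# to above/below its own mean, then greedily reorder rows so that rows
# with similar patterns become adjacent. All values are invented.

data = {
    "conventions": [8, 9, 2, 1, 1, 2],
    "local_trade": [7, 8, 3, 2, 1, 3],
    "tourism":     [1, 2, 8, 9, 8, 7],
    "festivals":   [2, 1, 7, 8, 9, 8],
}

def binarize(row):
    """Above/below-average reduction, as in Bertin's shaded bars."""
    mean = sum(row) / len(row)
    return [1 if v > mean else 0 for v in row]

patterns = {name: binarize(row) for name, row in data.items()}

def similarity(a, b):
    """Number of matching cells between two binarized rows."""
    return sum(x == y for x, y in zip(a, b))

# Greedy reordering: start with one row, repeatedly append the most
# similar remaining row.
order = ["conventions"]
remaining = [n for n in patterns if n != "conventions"]
while remaining:
    nxt = max(remaining, key=lambda n: similarity(patterns[n], patterns[order[-1]]))
    order.append(nxt)
    remaining.remove(nxt)

print(order)
```

The reordered matrix groups the winter-season rows next to each other and the summer-season rows next to each other, which is the block structure the eye is supposed to pick out.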

N8agrin - Sep 18, 2007 09:31:32 pm

Visual Information Seeking:

For a paper developed and published in 1994, the work seems fairly advanced. From a UI perspective, concepts like tight coupling are very interesting for their time. Some of these concepts seem to have finally permeated into web-based applications, though it's surprising how often developers leave these guidelines out. From an info vis perspective, however, I'm not sure how well the chosen visualization methods work for the types of data discussed in the paper. Clearly, for geospatial data, as in the HomeFinder app, using a map interface makes quite a bit of sense for the user. In the FilmFinder app, however, the 'starfield' display doesn't seem to encode any particular information other than year released on the x-axis. This seems to be, essentially, a fun interactive interface to a large dataset.

Graphics and Info-Processing:

The points in this paper are somewhat well taken. Clearly, the regrouping of rows seems to highlight particular trends between segments of data; however, there does appear to be at least one discrepancy with the 'South America' row. This row has somehow snuck into the 'Discovery Factors' section, when it does not appear to belong there. The same holds true for the general reordering of the rows. It seems as though the author suggests that simply looking at overlap and searching for trends is enough to justify a particular visual order. Instead, I think this is an iterative process, one which requires attempting to fit the visual data into patterns and then justifying their placement within the context of the data. Following this iterative approach might have kept South America from being grouped with dissimilar data.

Visual Queries for Finding Patterns in Time Series:

The title of this paper is confusing to me. I couldn't find anywhere in the reading where 'patterns' were being found through the use of their technique. In particular, I thought applying the timebox technique to transcription factors was a compelling idea, but I did not see any evidence in the paper that they positively identified transcription factors based on the microarray data they presented. I'm probably being a bit too picky, but it seems to me that the 'patterns' they are talking about are self-evident in the data plots even without timeboxes. For all my criticism, the timeboxes were successful in the TimeSearcher app they developed. Here, highlighting a particular time and value range, and then selecting a specific group of stocks, allows the user to quickly see what is happening with discrete pieces of data during particular points in time. This reminds me very much of Shneiderman's other paper, 'The Eyes Have It', where he emphasizes the notion of 'overview, zoom and filter, then details on demand'. TimeSearcher does that elegantly and, unsurprisingly, makes the timebox concept utterly useful and provocative.
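The timebox semantics described here are easy to state as code: a series matches a box only if its value stays inside the box's value range for every step of the box's time interval, and drawing several boxes forms a conjunction. This is a minimal sketch with made-up series, not TimeSearcher's implementation.

```python
# Sketch of a timebox query: each box is (t_start, t_end, value_lo,
# value_hi), and a series matches a box only if it stays inside the
# value range for every time step in the interval. Multiple boxes are
# combined as a conjunction. The stock-like series are invented.

series = {
    "AAA": [10, 12, 15, 14, 13, 11],
    "BBB": [30, 28, 25, 27, 29, 31],
    "CCC": [10, 11, 26, 27, 12, 10],
}

def matches(values, box):
    """True if the series stays inside the box over its whole interval."""
    t0, t1, lo, hi = box
    return all(lo <= values[t] <= hi for t in range(t0, t1 + 1))

def timebox_query(all_series, boxes):
    """Names of series satisfying every timebox."""
    return [name for name, vals in all_series.items()
            if all(matches(vals, b) for b in boxes)]

# One box: between steps 1 and 3, the value must lie in [10, 20].
print(timebox_query(series, [(1, 3, 10, 20)]))
```

Note that CCC passes through the value range at some steps but not all of them, so it is culled - which is exactly the "stays inside" semantics that distinguishes a timebox from a simple brush selection.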

Ken-ichi - Sep 18, 2007 11:10:07 pm

Shneiderman's papers seem somewhat dated, even the time-series paper, which is relatively recent. I don't really see how a "starfield display" is that different from a scatterplot, and while constraining data ranges with slider bars and ensuring that inputs and outputs are tightly coupled in an interface might have been important points back in 1994, these days we sort of expect that degree of sophistication from interactive visualizations (well, good ones, anyway). I guess it's good to know about the research underpinning some current designs.

The time-series paper is just kind of perplexing. How do you write an entire paper about drawing rubberband selection boxes on time-series data, especially in this decade? It seems fun and useful, but hardly a dramatic departure from other selection and zooming interfaces.

James O'Shea - Sep 24, 2007 07:05:36 am

I thought the two Shneiderman papers described some interesting features for interactive displays. Perhaps it goes without saying, but I think the most critical aspect of these systems is the real-time responsiveness of the controls. In particular, the tight coupling only works because the dynamic query controls can be instantly updated according to the new constraints selected by the user. Although computers are always becoming faster, making real-time interactive systems easier to achieve on some level, data sets are simultaneously getting larger (sometimes exponentially so), and maintaining real-time control is not always practical with respect to these data. I wonder if researchers have looked into methods for alleviating time-lag issues with large data sets in order to maintain the "feel" of interactivity without completely achieving it. In my opinion, this will always be a concern.

The Bertin paper (chapter?) on transcribing the graphic reminded me a lot of NASA's Challenger disaster assessment. In both cases, data were originally presented in such a way that the intrinsic relationship was hidden, and only after reformulating the presentation of the visualization did the pattern emerge. Given the technical limitations that Bertin faced, he was essentially trying to develop an interactive visualization tool. In this case, the interactivity involved rearranging the rows of data. More broadly speaking, I think he was simply investigating the possibility of being able to explore the data through reformulated visualizations.

McD - Sep 25, 2007 06:04:50 pm

I think the Bertin example and zipdecode reveal something important to consider when creating visualizations: the patterns we see in a visualization emerge from structure in the data. Zipdecode is an example of a visualization revealing patterns and structure in data (namely, that zip codes increase from west to east in 10 contiguous regions). The vis didn't impose this structure; it simply made it plainly apparent.

I struggled with this a bit as I played with datasets and Tableau. I had trouble keeping assumptions at bay and avoiding forcing visualizations that confirmed them. I think that adjusting and being open to changing your mind is an important part of the iteration and refinement that the assignment mentioned.

Also, the Bertin example shows the genius of trellising (much more simply than Becker et al.). By comparing multiple dependent variables to one independent variable, we can shuffle the results in any way that maintains the independent dimension, and if patterns are there to see, they will emerge. I also appreciate that for the hotel's purposes, continuous and precise data were not necessary, and the above/below-average distinction is a great way to simplify.
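The core trellis move - conditioning on one variable to split the data into panels that all share the same independent axis - can be sketched very simply. The grouping function and the hotel occupancy data below are invented for illustration, not from Becker et al.

```python
# Toy sketch of trellis-style conditioning: records are split into
# panels by a conditioning variable, and every panel keeps the same
# independent axis (month), so panels can be compared or reshuffled
# freely. All data values are invented.

records = [
    {"month": m, "occupancy": o, "hotel": h}
    for h, occupancies in {
        "Hotel A": [60, 62, 80, 85, 83, 70],
        "Hotel B": [40, 45, 44, 42, 47, 41],
    }.items()
    for m, o in enumerate(occupancies, start=1)
]

def trellis(rows, panel_key, x_key, y_key):
    """Group rows into panels keyed by the conditioning variable."""
    panels = {}
    for r in rows:
        panels.setdefault(r[panel_key], []).append((r[x_key], r[y_key]))
    return panels

panels = trellis(records, "hotel", "month", "occupancy")
for name, points in panels.items():
    print(name, points)
```

Because every panel shares the month axis, the panels can be reordered in any way without losing comparability - which is exactly the "shuffle while maintaining the independent dimension" point above.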

James Andrews - Sep 25, 2007 06:23:34 pm

James -- it seems the Time Series system already works with larger data sets than it can directly handle -- above a certain threshold it represents the data with discrete bins (so instead of lines, you get squares). Binning approaches should work at the same speed regardless of the size of the data set (speed is instead affected by how many bins are used), although the preprocessing time to create the binned representation still scales with the size of the data set.
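The binning idea can be sketched in a few lines: precompute a per-bin (min, max) envelope once, so that rendering and querying afterwards cost O(bins) no matter how many raw points there were. This is a minimal sketch of the general technique, not the Time Series system's actual code.

```python
# Sketch of a binned representation: a long series is reduced once to
# per-bin (min, max) envelope pairs, so later drawing/querying depends
# only on the number of bins, not on the raw data size.

def bin_series(values, n_bins):
    """Aggregate values into roughly n_bins (min, max) envelope pairs."""
    size = max(1, len(values) // n_bins)
    bins = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        bins.append((min(chunk), max(chunk)))
    return bins

raw = list(range(100))          # stand-in for a large series
envelope = bin_series(raw, 4)   # 4 "squares" instead of 100 points
print(envelope)
```

The one-time preprocessing pass is the O(n) cost mentioned above; every interaction after that only touches the four envelope pairs.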

There are, of course, more tricks you could do. For example, if the data were such that details of individual data points were also important, you could present a growing random sample of the data in its full form on top of the binned aggregate representation. Or, if very small local details were important, you could provide a zooming interface which only considers in detail the section of data that it's zoomed in to look at (like how google maps loads in finer levels of detail as you zoom in with it). Beyond that, there's certainly lots of work done in computer graphics for viewing and querying scenes which are too large to fully process, which is a very specific sort of data visualization, and I imagine much of that work could transfer back to non-graphics-related visualization.

As an aside: thinking about this kind of levels-of-detail data reduction might actually help improve visualizations beyond just responsiveness -- I think it's often the case that if a data set is too big for a computer to handle, it's also too big for a human to handle; the whole, unprocessed body of it could be cluttered and overwhelming. The problem of reducing the data into something manageable for the computer could be quite similar to the problem of aggregating data so that the big picture is more visible to a human.

Robin Held - Sep 26, 2007 12:16:23 am

One of the first things that struck me about the Bertin reading is that the example graphic seemed a bit confusing. However, when it is read in the context of the hotel management decisions to be made by the manager, the graphic seems much more intuitive. I think this shows how one should be careful not to judge a visualization without first becoming familiar with its original setting and intentions. For instance, Tufte could easily criticize the diagram for having a data-ink ratio that is too low. The data points could be represented as variable-size dots rather than bars with variable shading. However, the shading, which differentiates between above-average and below-average values, and the use of a bar graph actually work very well for the data. The vertical lines help one compare values between neighboring plots, making the interrelationships between the variables a bit easier to explore.

Athulan - Sep 26, 2007 07:55:01 am

I liked how Bertin made creating a visual a tangible, physical process by talking about how the mobility of the image is ensured. One thing that was non-trivial when working on Assignment 2 was that I needed to "draw" information from various corners of the data set (which was very large) and put it on one image to show something meaningful. Since the software package was powerful enough, it was easy to do this to match the abstract ideas I had in mind. I can imagine how this would have been harder when people were not using sophisticated tools to create visuals. One of the biggest problems could have been precisely what Bertin was talking about - the images were not visual enough and people could not recognize trends or patterns.

When I first saw the visual, I wondered if a matrix of small charts would present the data more efficiently. While this may be the case, it would still have needed a highly mobile image to identify the axes of interest. It would have been difficult to perceive which axes are best to use without quickly comparing them.

On a side note, given how easy it is to create interactive software, I wonder if it's possible to create small visuals and just drag them across a screen to see if any sets match (basically an electronic version of what the assistant did)? This could be a step prior to defining the axes more formally.

Omar - Sep 26, 2007 09:04:22 am

bertin's example made me rethink the purpose of a visualization, especially a non-interactive visualization. nothing happens with the visualization without someone encountering it and thinking about it -- there is no effective visualization without the consumer. and yet so many visualizations we encounter are finished products: you don't tear the paper, write on it, look at it from some other perspective. sure, cost is probably prohibitive, and so the consumer's mind is relied upon.

but then bertin gives this lovely example taking great advantage of the affordances of paper. i loved how the different parts were put on separate pieces of small paper for easy visual comparison -- in fact, this really got me thinking that the nytimes should print their weather data on easily tearable paper so i could have fun comparing weather in all the different cities that interest me (or they could do this online, which is perhaps cheaper). but other interaction modes with paper-based visualizations should be considered: a sketch area in the corner of a visualization (you can write your thoughts, or doodle some ideas); instructions for how to take apart and compare different pieces of the visualization (beyond the default comparison, if that's all that can fit in the space); 'incomplete' visualizations that ask the consumer to 'complete' them, with structured instructions that get them to think about the final implications.

i was reminded of the importance of active participants after seeing the olafur eliasson exhibit at the sfmoma. highly recommended.

Amanda Alvarez - Sep 28, 2007 06:33:11 pm

It seems like all the key concepts of VIS can be incorporated into Bertin's broader idea of retaining mobility of the representation: "direct manipulation" with a "visual representation of the world of [rapid, incremental, reversible] action", and "selection by pointing" (or some other physical, non-typing interaction). The only thing Bertin's image-files lack is the immediate and continuous display of re-arranged results afforded by fast computers. What really keeps Bertin's system from having dynamic queries is the nature of the data table (and he notes this himself). The assistant decided on a single table because the data varied along a common dimension, and things like rooms etc. were not included; multiple tables would have mixed up unrelated problems. So does this mean that by introducing dynamic queries and filters, we have already conceded that the data we are dealing with are not sufficiently homogeneous to constitute the answer to one single problem? It seems so.

I was glad to see that Bertin pointed out the steps of the decision-making process that can and cannot be automated. The parts that can be automated are those which produce all the possible different features and insights contained in the data (both useful and useless), e.g. all the lines in the timeseries, or the starfield display before filtering. The user needs to define the question and make the interpretation; transcription, reconstruction and mobility are automated, though some intelligent queries from the user might also be necessary. With this dichotomy Bertin has parceled out those steps of the process which just require speed from those which require reflection. (So interaction only benefits from speed, and does not involve (much) reflection...)

Hazel Onsrud - Sep 30, 2007 06:49:59 pm

In the Bertin article, he states on page two that "graphics is not an art." Yet, in my opinion, he undermines his argument by noting that anyone can learn to use the strict system of signs to communicate effectively, which, under some definitions, could also be considered an art. Although there are many ineffective ways of communicating, there cannot be only a single way of communicating effectively, and thus this space left to interpretation is where I would argue the "art" of graphics comes into play.

Kenrick Kin - Oct 02, 2007 01:33:04 am

I agree that the Visual Queries paper by Hochheiser and Shneiderman seems dated, considering it was written in 2002. Range querying is not a new idea, but I guess it's all about coming up with an interactive way for a user to sift through the data. N8, I imagine the 'pattern' part of the title comes into play when the data is much too cluttered to view anything; by culling the data you're not interested in (or rather, focusing only on the data that you are), you can try to find patterns in what remains. Rather than having to view all 1430 stock prices, you can narrow down to just the stocks that had a price in some range during some time span (matching your criteria), and view only those stocks through the entire time period to see if there are any patterns there.

Daisy Wang - Oct 03, 2007 11:32:17 am

We can use dynamic queries for users to select and aggregate the source data, and visualize the result. On the other hand, visualization could also help the processing of the dynamic query, and speed up the analysis, by dynamically visualizing the result processed so far (and maybe the confidence and error range), so that the analyst can stop early or re-focus the dynamic queries they are posing over the data. This work was pioneered by Professor Hellerstein here at UC Berkeley, in the CONTROL project.
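The online-aggregation idea described here can be sketched as a running aggregate with a shrinking error estimate: the analyst watches the estimate and its confidence band, and stops once the band is tight enough. This is a generic sketch of the technique, not the CONTROL project's implementation; the data and stopping threshold are invented.

```python
# Sketch of online aggregation: maintain a running mean with an
# approximate 95% confidence half-width, and stop early once the
# half-width falls below a user-chosen threshold. Data is synthetic.

import math
import random

def online_mean(stream, stop_halfwidth):
    """Running mean with early stop when the CI half-width is small enough."""
    n, total, total_sq, mean = 0, 0.0, 0.0, 0.0
    for x in stream:
        n += 1
        total += x
        total_sq += x * x
        mean = total / n
        if n > 1:
            var = (total_sq - n * mean * mean) / (n - 1)   # sample variance
            half = 1.96 * math.sqrt(max(var, 0.0) / n)     # ~95% CI half-width
            if half < stop_halfwidth:
                return mean, half, n    # estimate is tight enough: stop early
    return mean, float("inf"), n

random.seed(0)
data = [random.gauss(50, 5) for _ in range(100_000)]
est, half, used = online_mean(data, 0.5)
print(f"mean {est:.2f} +/- {half:.2f} after {used} of {len(data)} rows")
```

The point is the interaction: the analyst sees a usable answer after a few hundred rows instead of waiting for all 100,000, and can re-focus the query if the early estimate already rules the hypothesis out.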
