Introduction to Visualization Software
From CS 294-10 Visualization Sp10
Lecture on Feb 8, 2010
- Polaris (extended paper), Stolte, Tang and Hanrahan (pdf)
- Dynamic queries, starfield displays, and the path to Spotfire. Shneiderman (html)
- Chapter 8: Data Density and Small Multiples, In The Visual Display of Quantitative Information. Tufte.
- Chapter 2: Macro/Micro Readings, In Envisioning Information. Tufte.
- Chapter 4: Small Multiples, In Envisioning Information. Tufte.
Jeffrey Patzer - Feb 06, 2010 02:03:17 pm
Tufte (Envisioning Information): The most interesting statement in the whole chapter on Macro/Micro was "to clarify, add detail". Tufte later elaborates that a detailed visualization that is hard to read is a problem of bad design, not of too much data: when people make diagrams that are full of data and hard to read, it's their design that is lacking, not the data. I think what Tufte is really trying to say is that creating a story with data is not easy, but if done correctly it creates an interesting visualization that pulls the viewer in and fully utilizes the human visual system. This makes me think that creating a visualization is not so much organizing data into something readable as organizing it into a story that makes the graphic didactic and interesting.
Danielle Christianson - Feb 06, 2010 07:19:30 pm
Polaris: I was a bit worried about the flexibility of a prepackaged interactive visual data exploration program given the previous readings, but I am really impressed with Polaris. I'm interested in the method for determining the graph type -- at first I didn't like this lack of user choice, but maybe it could be beneficial for a quick first look. It seems like it might be a little annoying when using some of the sorting and filtering options, since the data type can change (resulting in a change in the graphic type). Seems like Polaris is the basis of Tableau?
Shneiderman: Nice to have this history of development. I thought the idea to first return the size of the result set of a dynamic query to be clever given that its time saving importance is easily overlooked.
Tufte: I really like the idea of having multiple layers of detail in a static graphic so that the user can explore the data at multiple levels. As Tufte states this gives a control to the viewer and allows the data to be interactive. I think these types of multilevel data visualizations work well because the exploration of the data creates excitement -- it's a complex story and the viewer is allowed to figure out the story for him/herself.
Jon Barron - Feb 08, 2010 01:39:50 pm
This chapter suffers from the fact that I'm reading a PDF of this book, and so found myself zooming into these small/dense graphics to get a clearer look. Though density seems like a reasonable thing to discuss in print media, digital media (even "static" digital media, like a PDF) can always be dynamically rescaled, rendering notions of density and area fairly meaningless. Perhaps a better goal would be a visualization whose complexity or information content increases as the user zooms in. Or perhaps we need to be more precise when discussing our goal, and talk about, for example, minimizing unnecessary negative space that detracts from the effectiveness of the graphic at all scales.
I *really* like stem-and-leaf plots. I think they capture what I want from a visualization in terms of coarse-to-fine structure, in which squinting gives you one perspective, and zooming in gives you another.
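As a rough illustration of the coarse-to-fine structure described above, here is a minimal sketch of a stem-and-leaf plot in Python. It assumes non-negative integers and a fixed leaf unit of 1 (stems are the tens); the function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot as a list of strings.

    Assumption for this sketch: values are non-negative integers,
    the stem is the value divided by 10 and the leaf is the last digit.
    """
    values = sorted(values)
    if not values:
        return []

    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)

    lines = []
    # Include empty stems so that gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


if __name__ == "__main__":
    for line in stem_and_leaf([12, 15, 21, 34, 38, 39]):
        print(line)
```

Read from a distance, the row lengths form a histogram; read up close, every original value can be recovered from its stem and leaf.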
Chetan Nandakumar - Feb 09, 2010 08:18:32 pm
I really appreciate the micro/macro concept: the idea that data is depicted hierarchically so that the user can easily make sense of it at different levels of granularity. This provides a powerful paradigm for communicating a large corpus of information. A big challenge here is figuring out how to showcase multiple storylines in the data. The difficulty is that the louder, more obvious relationships pop out first and hold the attention of the viewer.
Arpad Kovacs - Feb 10, 2010 12:37:02 am
Polaris: This seems like a very user-friendly system for dynamically building queries to visually explore relational databases. Unfortunately, Polaris seems to have a few limitations, in particular it can only display 2 quantitative variables at once, and I imagine that there is also an implicit limit to the number of ordinal-quantitative variables that can be encoded (since it is probably difficult to visualize more than 3 or 4 SQL JOIN operations). I think that this could be mitigated by a 3D scatterplot rendering, which would allow display of an extra quantitative dimension, and perhaps other creative approaches such as varying the size of data points (although this could reduce perceptual accuracy compared to the existing 2D approach). Overall though, I am impressed that Polaris managed to implement a very useful subset of the SQL language in a quite elegant manner.
Spotfire: What I found most interesting was the evolution from the very compact, but perhaps less accurate alphaslider-based interfaces in HomeFinder and FilmFinder to the more expansive conventional select-list boxes, checkboxes, and drop-down lists in the Baltimore Learning Community and NASA EOSDIS systems. It seems that the range-based alphaslider (e.g. two arrows pointing inwards that define a range) is a very efficient widget for narrowing down the set of available choices in an extremely large pool of quantitative or ordinal values, but would be cumbersome for selecting a discrete value among a few dozen nominal choices (in particular, how would the available choices be sorted... alphabetically?). Perhaps an increase in screen sizes and mouse precision allowed designers to display a greater quantity of the available choices at once, and therefore made the abstract alphaslider (the knobs represent your position in the alphabet) obsolete in favor of checkboxes and lists, which allow the direct selection/deselection of specific values.
Tufte: So far I have found Tufte's data-ink maximization advice to be insightful, but I think that his call to shrink graphics to half size to maximize "data density and the size of data within reason" is going a bit too far. Coupled with his advice to use smaller dots for scatters and thinner lines for time-series, this just seems like a recipe for eyestrain. On the other hand, extremely large graphics have a physical presence that calls attention to themselves and their data in a way that smaller visualizations cannot match.
Stephen Chu - Feb 10, 2010 06:10:17 pm
Tufte shows the importance of analyzing ordering in visuals. The default ordering systems (high to low, low to high, alphabetical) are not always the most appropriate organization strategy. He uses the Vietnam Veterans Memorial as an example. Had the creator chosen to alphabetize the names of veterans, the grouping of similar and duplicate last names would have undermined the significance of each lost life.
Tufte states, "Simpleness is another aesthetic preference, not an information display strategy." I've always been a fan of keeping visuals simple, but I understand Tufte's point. The complexity of the actual world cannot be fully expressed by simple means, and by hiding or filtering the complexity, we lose much information and detail that can be very valuable.
Stem and leaf plots. I haven't worked with those since elementary school.
Zev Winkelman - Feb 11, 2010 12:54:39 pm
As I was reading the Polaris paper I thought that it sounded a lot like Tableau - makes more sense now.
I've built and used similar things and find them useful, but sometimes question the business model.
How much will people pay for Excel on steroids, especially when free products such as OpenOffice are already nipping at Excel's heels?
Also found myself wondering how it would handle various challenges involved with databases. Anything in 3NF might be difficult to work with if it doesn't automatically resolve pk/fk values (if these are even coded in the datasource - sometimes they are not).
Some of this was clarified in class when it was explained that there was a design assumption made that most of the business logic would be on the data source side presenting the software with a clean denormalized interface.
I did find the application of Polaris to source code profiling very interesting (11).
Didn't get a ton from the Spotfire paper other than the history behind the development of the business, but that's ok.
Jonyen - Feb 12, 2010 09:06:42 am
Dynamic queries, starfield displays, and the path to Spotfire:
The old days of command line interfaces and submitting queries to databases are passing quickly. In their place are dynamic queries and starfield displays that update a two-dimensional graphical display in 100 milliseconds. As users adjust sliders, buttons, check boxes, and other control widgets the starfield display containing color- and size-coded points is updated rapidly. Users feel they are in control and there is no more "RUN" button.
It's really amazing how far along we've come in terms of computer processing power. We don't need to wait for the information to process anymore, but rather it will change as soon as we want to find out more information about something. I think the processing of information is a very important aspect to consider in terms of UI design and info visualization. It's frustrating to have to wait for information to come up. That's a consideration to keep in mind when it comes to interactive visualizations as opposed to static visualizations.
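To make the dynamic-query idea concrete, here is a minimal Python sketch. It is not how Spotfire or HomeFinder are implemented; it only assumes that each slider or checkbox maps to one predicate and that the full filter is re-run on every widget change. The `Home` record, the sample data, and the function name `dynamic_query` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    """Hypothetical record for a HomeFinder-style starfield point."""
    price: int          # asking price in dollars
    bedrooms: int
    distance_km: float  # distance from a point of interest


# Made-up sample data, for illustration only.
HOMES = [
    Home(price=180_000, bedrooms=2, distance_km=3.5),
    Home(price=250_000, bedrooms=3, distance_km=8.0),
    Home(price=320_000, bedrooms=4, distance_km=2.0),
    Home(price=410_000, bedrooms=3, distance_km=12.0),
]


def dynamic_query(homes, price_range, min_bedrooms, max_distance_km):
    """Return the homes that satisfy every current widget setting.

    In an interactive tool this would be called on each slider movement,
    and the returned points would be redrawn immediately.
    """
    low, high = price_range
    return [
        h for h in homes
        if low <= h.price <= high
        and h.bedrooms >= min_bedrooms
        and h.distance_km <= max_distance_km
    ]


if __name__ == "__main__":
    matches = dynamic_query(HOMES, (200_000, 350_000), 3, 10.0)
    # The size of the result set can be shown before any points are drawn.
    print(len(matches), "matches")
    for home in matches:
        print(home)
```

A linear scan like this is only responsive for small datasets; staying within a fast update budget on larger data would need precomputed indexes or similar structures, which this sketch leaves out.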
Prahalika Reddy - Feb 13, 2010 08:42:12 pm
In Tufte's chapter on Data Density and Small Multiples, he encourages using graphics with a high data density. While I agree that having sparse graphics, like some of the examples shown in the book, with very low data densities are not very useful, I don't fully agree that having graphs with the highest possible data density is always the best thing. Tufte talks about how maps have incredibly high data densities, but I feel it's impossible to achieve such a high data density in any graphic that requires any sort of labeling. In fact, even the map that is shown in the book doesn't show anything but lines and intersections. The concept of small multiples is pretty interesting; often, I feel like it's like showing a trend in a trend.
Tufte's chapter on Micro/Macro Readings is also very interesting. The images shown in this chapter are good examples of graphics with high data densities, as mentioned in the other chapter. It still seems like the data points lack a sort of label, but the graphics are amazingly effective regardless. It seems counterintuitive to think that adding more detail will help make things clear, as is said in the chapter, but some of the images do a really good job with that.
Ebby Amirebrahimi - Feb 15, 2010 12:07:50 pm
I thought Tufte's data density idea was useful and interesting. I don't think this is a principle that should be followed strictly, because sometimes presenting such dense data can lead to confusion and can make visuals difficult to grasp. However, it does add to the usefulness of the visualization, as you can present more information. I think what really helps with dense data is some sort of predefined spatial understanding. For example, with a map, most people can localize the data quickly to find what they're looking for... with a more abstract visualization this might not be possible.
In his chapter on Macro/Micro design, I think he describes the potential power of this paradigm well. I thought the Vietnam Veterans Memorial was a good example of how this concept can be used to make a very powerful visualization while still presenting a great deal of data. Of course, this concept can also lead to some very difficult to fathom visuals as demonstrated in the chapter.
The readings on Polaris and Spotfire were interesting use-cases of dynamic queries and visualization.
Shimul Sachdeva - Feb 15, 2010 03:24:20 pm
Articles on Spotfire and Polaris were interesting reads. The software is useful, but it seems the key to using it efficiently lies in understanding the underlying principles of algebra and database queries. The idea of dynamic queries in general is visually very powerful. Examples shown in class, including Tableau, were helpful in driving this point home. I look forward to working with these tools. I would also like to look into the development of such tools. One of the scenarios in the Polaris article showed how this tool can be used to monitor computer systems, and it would be interesting to know how we can develop similar tools for analyzing performance and/or other parameters in our personal projects.
Tufte's analysis of data density is similar to his arguments regarding data-ink. It's interesting to note that our eyes are capable of understanding dense visual data easily. The "Shrink Principle" is also an idea worth some thought. At the same time, while maximizing data density is good, as Tufte says, it must be done "within reason". Too much data can lead to confusion. Increasing data-ink seems a more plausible argument than increasing data density.
Priyanka Reddy - Feb 16, 2010 09:09:33 pm
The Bertin paper that we talked about in class was a really interesting case study. I think it really emphasizes the need for the creator of a visualization to know what story he's telling with the visualization. A lot of times, people tend to just throw a bunch of data into any form of visualization and leave it up to the viewers to spot the trends and patterns. The process of rearranging data that is shown in the paper serves as a good reminder that a good visualization creator spends a lot of time with the data. The Trellis example also serves the same purpose.
I enjoyed the Tufte chapter on micro/macro readings. I liked his description of micro/macro designs: "enforce both local and global comparisons and, at the same time, avoid the disruption of context switching". His examples of static micro/macro designs were good, but I think animations can really do lots with micro/macro designs, allowing for the addition of multiple layers of data. Google maps and Zipdecode (from lecture) are great examples of that.
Boaz Avital - Feb 17, 2010 10:47:17 am
Polaris: The setup of Polaris, with its shelf system, is similar to Tableau. Simplifying fields into only ordinal and quantitative is a good idea. If the point is to model complicated multi-relational databases, however, perhaps the program should try to glean possible structures and graphs instead of relying only on the user, who may be trying to understand the data in the first place. This paper also serves as a great outline and reference for data representations and groupings (sections 4 and 5) and for an effective way to turn visual input into database queries (section 6). I can definitely see myself referencing this paper in the future.
Tufte: I used to think simpler was probably better for visualizations, but the high resolution chapter convinced me that more data (with little clutter) is usually better. I also liked the idea that when you have all this data and are representing it in a small space, "approximately right" is better than "exactly wrong". I suppose many times it's enough for the visualization to just get the pertinent information across and not the exact numbers.
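As a rough illustration of turning a shelf-style visual specification into a database query, here is a toy Python sketch. It is not the algorithm from the Polaris paper; it assumes only that ordinal fields dropped on the row and column shelves become grouping fields and that a single quantitative field is aggregated per group. The function names, the `sales` table, and its data are all hypothetical.

```python
import sqlite3


def build_query(table, row_fields, column_fields, measure, aggregate="SUM"):
    """Build a GROUP BY query from a simplified shelf specification.

    Assumption for this sketch: identifiers come from trusted code, so
    plain string formatting is acceptable here. Do not do this with
    user-supplied names.
    """
    group_fields = list(row_fields) + list(column_fields)
    select_list = ", ".join(group_fields + [f"{aggregate}({measure}) AS value"])
    sql = f"SELECT {select_list} FROM {table}"
    if group_fields:
        group_by = ", ".join(group_fields)
        sql += f" GROUP BY {group_by} ORDER BY {group_by}"
    return sql


def run_demo():
    """Run the generated query against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, profit INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("East", "Q1", 10), ("East", "Q1", 5), ("East", "Q2", 7), ("West", "Q1", 3)],
    )
    # One row shelf field, one column shelf field, one aggregated measure.
    sql = build_query("sales", ["region"], ["quarter"], "profit")
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

Each returned tuple corresponds to one cell of the resulting table of graphics: the grouping values pick the pane, and the aggregate is the value drawn in it.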
Akshay Kannan - Feb 21, 2010 09:12:04 pm
Upon reading Tufte’s discussions on data density and the shrink principle, Sparklines immediately came to mind. These minuscule summaries are capable of visualizing a large set of data in a very small, effective format. While it is difficult to determine exact numbers at this size, it is very easy for users to see trends over time. An idea I found particularly compelling in Polaris, which is a feature in Tableau as well, is the ability to explore individual data points to find data not on the graph. For example, when selecting a particular data point, the user can see additional information about that data point. If the point is an outlier, the user could see more information about the point, as well as what factors may have led to it being an outlier. In the case of AS3, I plotted processor performance over time, and I found that Intel had recently released a processor with a very small transistor size but low performance as well. When I clicked on it to get more information, I found that it was the Intel Atom netbook processor, which made much more sense.
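As a small illustration of the sparkline idea mentioned above, here is a minimal text-only sketch in Python. It assumes the eight Unicode block characters are an acceptable stand-in for a drawn line and scales values linearly between the minimum and maximum; the function name `sparkline` is made up for this example.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers.

    Values are scaled linearly so the minimum maps to the lowest bar
    and the maximum to the highest. Exact values are lost; only the
    overall trend is preserved.
    """
    values = list(values)
    if not values:
        return ""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A flat series has no trend to show; draw it at the baseline.
        return BARS[0] * len(values)
    last = len(BARS) - 1
    return "".join(BARS[round((v - low) / span * last)] for v in values)


if __name__ == "__main__":
    print(sparkline([3, 5, 9, 4, 2, 8, 12, 11, 6]))
```

Each data point takes one character cell, which shows the trade-off described above: the trend over time is easy to see, while the exact numbers are not recoverable.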