From CS294-10 Visualization Fa08


Assignment: Final Project

Researchers: Jeff Bowman and David Poll

For our project, we would like to visualize and test different forms of "relevance" and constraint satisfaction as they apply to search results.



Though generalized text-based web search is becoming increasingly powerful, it has not replaced searches tailored to individual datasets, for which filters and constraints remain one of the primary methods for selecting and retrieving information. However, current constraint-based search implementations rarely pay attention to near-misses: records that fail one selection criterion but satisfy all the rest. An ideal system would return the ideal results, but would also make available the near-misses that violate one, two, or three constraints (and so forth). In addition, depending on the size of the data domain and the number of free and constrained data dimensions therein, we see a need to visually summarize the search results, displaying and distinguishing both the ideal search results and the near-misses in a 2D representation of the search space.
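The near-miss idea can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the laptop records, the three constraints, and the helper names `violations` and `group_by_miss_count` are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Constraint = Callable[[Record], bool]


def violations(record: Record, constraints: Dict[str, Constraint]) -> List[str]:
    """Return the names of the constraints that this record fails."""
    return [name for name, test in constraints.items() if not test(record)]


def group_by_miss_count(
    records: List[Record], constraints: Dict[str, Constraint]
) -> Dict[int, List[Record]]:
    """Bucket records by how many constraints they violate (0 = ideal result)."""
    groups: Dict[int, List[Record]] = {}
    for record in records:
        missed = len(violations(record, constraints))
        groups.setdefault(missed, []).append(record)
    return groups


# Hypothetical data and constraints, invented for this example.
laptops = [
    {"name": "A", "price": 900, "weight": 1.4, "ram": 8},
    {"name": "B", "price": 1100, "weight": 1.8, "ram": 16},
    {"name": "C", "price": 1500, "weight": 2.6, "ram": 4},
    {"name": "D", "price": 950, "weight": 2.3, "ram": 4},
]
constraints = {
    "price <= 1000": lambda r: r["price"] <= 1000,
    "weight <= 2.0": lambda r: r["weight"] <= 2.0,
    "ram >= 8": lambda r: r["ram"] >= 8,
}

groups = group_by_miss_count(laptops, constraints)
for missed in sorted(groups):
    print(missed, [r["name"] for r in groups[missed]])
```

A conventional filter would return only the group with zero violations; keeping the other groups is what makes the near-misses available for display.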

We propose a study, sample implementation, and analysis of four related visualization methods:

  • Linear continuous: A linear continuous visualization uses predetermined or user-defined factors to calculate a floating-point "score" that is then plotted in a vertical dimension with the best score at the top. This visualization allows for additional data to be represented as color and shape in the scatter; if there is no possibility of vertical overlap, then each result record can use its entire row as a space for horizontal visualization.
  • Linear stepped: A linear stepped visualization also presents results in a vertical dimension, with the best score on top; however, the score function is constrained to be an integer. (Trivially, it is the number of constraints matched, with a greater value conveying a better match.) With this system, the score function is treated nominally, forming a vertical visualization (best group on top) with a sorting algorithm specified within each group.
  • Radial continuous: A radial continuous visualization would use a "scoring function" similar to that of the linear continuous visualization, except that the information would be presented in a circular manner. The ideal search results would be in the center of the circle, with results that break constraints (lower "scores") plotted further from the center. Angle would be used to create a grouping, allowing a user to see all the results where one particular constraint was violated, or where the same combination of constraints was violated.
  • Radial stepped (sunburst): A "sunburst"-styled visualization is the extension of the linear stepped visualization, also applied in a radial fashion. The best search results are in the center, constraint violation pushes the result closer to the edge of the circle, and angle is used to group similar results.
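The four methods above differ along two axes: how the score is computed (continuous or stepped) and how it is mapped to a position (linear or radial). The sketch below shows one possible reading of each. It rests on several assumptions that are not taken from the project: the continuous score is modeled as a weighted average of per-constraint "closeness" values in [0, 1], the angle is derived from a bitmask of the violated constraints, and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def stepped_score(satisfied: List[bool]) -> int:
    """Stepped score: the number of constraints matched."""
    return sum(satisfied)


def continuous_score(closeness: List[float], weights: List[float]) -> float:
    """Continuous score (assumed form): weighted average of closeness values."""
    return sum(w * c for w, c in zip(weights, closeness)) / sum(weights)


def linear_y(score: float, best: float, height: float) -> float:
    """Linear layout: the best score maps to y = 0, the top of the plot."""
    return (1 - score / best) * height


def radial_xy(
    score: float, best: float, violated: List[bool], max_radius: float
) -> Tuple[float, float]:
    """Radial layout: the best score sits at the center; angle groups results
    by which combination of constraints they violate (assumed encoding)."""
    radius = (1 - score / best) * max_radius
    # Encode the set of violated constraints as a bitmask, one bit each.
    mask = sum(1 << i for i, v in enumerate(violated) if v)
    angle = 2 * math.pi * mask / (2 ** len(violated))
    return radius * math.cos(angle), radius * math.sin(angle)


# A result matching 2 of 3 constraints, violating only the first one.
satisfied = [False, True, True]
score = stepped_score(satisfied)
violated = [not s for s in satisfied]
print(linear_y(score, best=3, height=600))
print(radial_xy(score, best=3, violated=violated, max_radius=100))
```

Under this encoding, results that violate the same combination of constraints share an angle and so line up along one spoke, while a perfect match lands at the center regardless of angle. A real implementation would also need to spread out results that share both score and angle.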

Initial project presentation

Slides: ZIP

Related work

Williamson, C., Shneiderman, B. The dynamic HomeFinder: evaluating dynamic queries in a real-estate information exploration system. In Proceedings of the 15th Annual international ACM SIGIR Conference on Research and Development in information Retrieval (Copenhagen, Denmark, June 21 - 24, 1992). N. Belkin, P. Ingwersen, and A. M. Pejtersen, Eds. SIGIR '92. ACM, New York, NY, 338-346. June 1992.

Ahlberg, C. and Shneiderman, B. Visual information seeking: tight coupling of dynamic query filters with starfield displays. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Celebrating interdependence (Boston, Massachusetts, United States, April 24 - 28, 1994). B. Adelson, S. Dumais, and J. Olson, Eds. CHI '94. ACM, New York, NY, 313-317. 1994.

Sebrechts, M., Vasilakis, J., Miller, M., Cugini, J., Laskowski, S. Visualization of Search Results: A Comparative Evaluation of Text, 2D, and 3D Interfaces. June 1999.

Mann, T., Reiterer, H. Case Study: A Combined Visualization Approach for WWW-Search Results. First publ. as paper in: IEEE Information Visualization Symposium 1999, Late Breaking Hot Topics Proceedings, Supplement to: 1999 IEEE Symposium on Information Visualization (InfoVis 99), San Francisco, CA, USA, October 24-29, 1999. October 1999.

Midpoint project presentation

Midpoint slides: PPT ZIP

Final project presentation

Poster: PDF

Paper: PDF

Source Code: ZIP (hosted remotely due to 5 MB size)

Live Installation: Linear Radial

Division of labor

The major separation in this project was that Jeff was responsible for the laptop data set, linear visualization, and "clustering" grouping function; and David was responsible for the Zillow (real estate) data set, radial visualization, and "weighted average" grouping function.

We accomplished this project cooperatively, for the most part, with nearly equal contribution in:

  • the original idea and a plan for realizing it
  • the development of the project architecture
  • the initial project proposal and midpoint presentation
  • the content of the final poster
  • the content of the final paper, as delineated below

David's contributions

David took the lead on the following elements:

  • a baseline encapsulated interface for constraints and data adapters, and a modular UI system to load those constraints
  • the Zillow (real estate) data set adapter, including the web service calls and parsing
  • the basic idea and implementation of the weighted average grouping function
  • the "initial checkin" template for UI design
  • major components of the final paper
    • Introduction
    • Method, except for the Linear and Clustering sections
    • Results, except for the Linear section

Jeff's contributions

Jeff took the lead on the following elements:

  • the PC World (laptop) data set adapter, including data extraction scripts to produce a static file
  • the basic idea, algorithmic deconstruction, and implementation of the clustering grouping function
  • the basic idea and test implementation of the unsuccessful and unpublished force-based grouping function
  • the graphic design and printing of the final project poster
  • major components of the final paper
    • Abstract
    • the Linear and Clustering sections of the Method
    • the Linear section of the Results
    • Discussion
    • Future Work
