FP-NicholasKong

From CS294-10 Visualization Fa08

Proposal

Group Members

  • Nicholas Kong

Description

Visualizing text, especially large corpora, is useful for several reasons. A visualization of a corpus's keywords or topics could facilitate information retrieval or reveal interesting patterns in the data. I will be working with a large corpus consisting of 14 years of Wired articles, and I plan to develop an interactive visualization that lets the user explore trends in Wired reporting over time. My research concerns collaborative visual data analysis, so I also plan to integrate this visualization into our existing collaborative analysis framework. In collaborative analysis, one user must be able to communicate the significance of a particular visualization to another user effectively, so visualizations that map a variable (such as time) to a user-controlled widget are not feasible. Wise et al. took this widget-based approach, although they designed for a signal analyst.

The text visualization problem is particularly difficult because of the inherent complexity of language and the multidimensionality of the data that most text analyses produce. For example, a document may have many keywords or topics associated with it, and each of these may be considered a dimension. Choosing how many dimensions of a document to represent on the timeline will be challenging. Using only the most relevant keyword or topic may be sufficient.
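As a rough illustration of the single-keyword option, the sketch below reduces each document to its highest-scoring keyword and counts those keywords per year. The input layout (a year paired with a keyword-to-relevance-score mapping), the sample data, and the function names are all hypothetical; the actual StatNews output may look quite different.

```python
from collections import Counter, defaultdict


def top_keyword(scores):
    """Return the highest-scoring keyword; ties are broken alphabetically."""
    return min(scores, key=lambda k: (-scores[k], k))


def keyword_trends(documents):
    """Map each year to a Counter of the top keywords of that year's documents."""
    trends = defaultdict(Counter)
    for year, scores in documents:
        if scores:  # skip documents with no keywords at all
            trends[year][top_keyword(scores)] += 1
    return dict(trends)


# Hypothetical sample data: (publication year, {keyword: relevance score}).
documents = [
    (1995, {"internet": 0.9, "modem": 0.4}),
    (1995, {"internet": 0.7, "browser": 0.6}),
    (2005, {"blog": 0.8, "internet": 0.5}),
]

trends = keyword_trends(documents)
for year in sorted(trends):
    print(year, trends[year].most_common())
```

Collapsing each document to one dimension discards secondary topics (for example, "internet" in the 2005 document above), which is the trade-off that would need to be evaluated against the real data.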

I will rely on Professor El Ghaoui's StatNews project for the underlying statistical data. As of this writing, I do not yet know what sort of data StatNews provides, so this proposal is subject to (possibly drastic) change.

References

  • Albrecht-Buehler, C., Watson, B., and Shamma, D. A. 2004. TextPool: Visualizing Live Text Streams. In Proceedings of the IEEE Symposium on Information Visualization (October 10 - 12, 2004). INFOVIS. IEEE Computer Society, Washington, DC, 215.1.
  • Collins, C. 2005. A Critical Review of Information Visualizations for Natural Language. University of Toronto Ph.D. Qualifying Exam.
  • Hearst, M. Informational visualization and presentation. PowerPoint Slides. http://www.sims.berkeley.edu/courses/is247/s02/lectures/TextAndSearch.ppt.
  • Hetzler, E. G., Whitney, P., Martucci, L., and Thomas, J. 1998. Multi-Faceted Insight Through Interoperable Visual Information Analysis Paradigms. In Proceedings of the 1998 IEEE Symposium on Information Visualization (North Carolina, October 19 - 20, 1998). INFOVIS. IEEE Computer Society, Washington, DC, 137.
  • Miller, N., Hetzler, B., Nakamura, G., and Whitney, P. 1997. The need for metrics in visual information analysis. In Proceedings of the 1997 Workshop on New Paradigms in Information Visualization and Manipulation (Las Vegas, Nevada, United States, November 10 - 14, 1997). NPIV '97. ACM, New York, NY, 24-28.
  • Wise, J. A., Thomas, J. J., Pennock, K., Lantrip, D., Pottier, M., Schur, A., and Crow, V. 1995. Visualizing the non-visual: spatial analysis and interaction with information from text documents. In Proceedings of the 1995 IEEE Symposium on Information Visualization (Atlanta, Georgia, October 30 - 31, 1995). INFOVIS. IEEE Computer Society, Washington, DC, 51.

(Semi-)relevant websites

Initial Problem Presentation

Proposal slides (.pptx)

Proposal slides (.pdf)

Midpoint Design

Midpoint slides (.pdf)

Demo

Midpoint Design


