From CS294-10 Visualization Sp11
Interactive Visual Queries for Complex Patterns
- Siamak Fariadani
In finance, it is important for an analyst to quickly detect different patterns of stock price changes. In this work we propose a visual querying method to facilitate this type of decision making.
This project is inspired by Hochheiser & Shneiderman's work on "Visual Exploration of Time-Series Data". We would like to explore the idea of interactively highlighting important patterns in time series. Visual queries in Hochheiser & Shneiderman's system are restricted to series that pass through a finite number of bounding boxes. We would like to extend their work to query all instances in which a particular pattern (such as the head-and-shoulders pattern) appears in a time series.
For example, the "saucer" pattern is shown below.
In this project we explore methods that let the analyst draw a well-known pattern on the data; the system then highlights every instance where this pattern recurs in the data.
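To make the difference between the two query styles concrete, here is a minimal sketch. It is not the project's implementation: the function names, the toy price series, the z-normalized Euclidean distance, and the 0.5 threshold are all assumptions chosen for illustration. A timebox-style test checks whether a series stays inside one fixed box, while the pattern query slides the drawn shape along the series and reports every window with a similar shape, regardless of its price level or scale.

```python
import math
from typing import List, Sequence, Tuple


def passes_timebox(series: Sequence[float], t_start: int, t_end: int,
                   v_min: float, v_max: float) -> bool:
    """Timebox-style test: every value at indices t_start..t_end (inclusive)
    must lie inside [v_min, v_max]."""
    window = series[t_start:t_end + 1]
    return bool(window) and all(v_min <= v <= v_max for v in window)


def z_normalize(values: Sequence[float]) -> List[float]:
    """Rescale to zero mean and unit variance so that only the shape matters."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n  # a flat window has no shape to compare
    return [(v - mean) / std for v in values]


def find_pattern(series: Sequence[float], pattern: Sequence[float],
                 max_distance: float) -> List[Tuple[int, float]]:
    """Return (start index, distance) for every window whose z-normalized
    shape is within max_distance of the z-normalized pattern."""
    m = len(pattern)
    target = z_normalize(pattern)
    hits = []
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        if dist <= max_distance:
            hits.append((start, dist))
    return hits


# Hypothetical data: a drawn "saucer" and a price series that contains a
# scaled, shifted copy of it starting at index 4.
saucer = [4, 2, 1, 1, 1, 2, 4]
prices = [10, 11, 12, 13, 28, 24, 22, 22, 22, 24, 28, 29, 30, 31]

in_box = passes_timebox(prices, 0, 3, 9, 14)   # fixed box over indices 0..3
hits = find_pattern(prices, saucer, max_distance=0.5)
```

In this toy series the only window that matches the saucer starts at index 4. A real interface would also need to handle patterns drawn at a different length than the matching window (for example by resampling), which this sketch leaves out.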
Realtime demand visualization tool
Let us first look at the following scenarios (later we propose visualization techniques that can handle them):
- Google may want to look at search queries that come from certain parts of the country and are fulfilled by their data center in Seattle
- Facebook may want to look at the types of ads that are clicked by people in ZIP codes 94705 and 94706 and are related to businesses located in 94704
- A Walmart analyst may want to look at the goods that people buy at its Oakland location
- A call center operator in India may like to know what percentage of the calls coming from Hawaii and the Bahamas to their Bangalore center are about Dell laptops and what percentage are about Lenovo laptops
- A language institute wants to know which students enroll at their center in San Francisco (demographics and location)
In all these scenarios, the institute or company would typically like to use the information to better forecast future demand or to adjust the resources devoted to its services in different locations.
The above scenarios are typical Operations Research problems faced by OR analysts. The idea of exploratory data analysis and visualization has not been used in the area of Operations Research, and OR analysts have typically relied on numbers or tables to look at demand or forecast quantities.
The following image is a static visualization of global Facebook friendships that was developed by an intern at Facebook [source]. I use this image as an example: imagine that each major node is a call center, the lighter nodes are people calling those call centers, and the lines connecting lighter nodes to major nodes represent calls.
This project aims to use and extend the ideas presented in the following papers:
- Generalized Selection via Interactive Query Relaxation by Heer, Agrawala, Willett. Example challenge: the analyst wants to visualize all the calls that come from the Berkeley and Oakland area, are from customers who have bought at least $300 worth of goods, and go to either the Oakland or the Fremont center. For this query the analyst selects the two centers by clicking on them, joins them with the orders by dragging a rectangular area around Berkeley and Oakland, and selects the condition of a net purchase of at least $300 from the legend.
- (our own contribution) Real-time data visualization based on historical queries. Example challenge: imagine we receive streams of live data about calls placed through our call centers. Having defined a visual query earlier, we can simply recall that query and see the calls that come from Berkeley to our Oakland center. An example of this is shown in the image below.
The following visualization is a static view of the above query (but keep in mind that the final visualization would be dynamic, so we will see calls as they are placed).
- Brushing and linking: what percentage of calls over the period of March 7 to March 26 come from Berkeley? This is shown in the following image.
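The example challenges above boil down to filters over call records. The sketch below shows the data side of two of them under stated assumptions: the `Call` record, its field names, and the sample calls are hypothetical and are not taken from any real dataset or from the project's code. In the actual tool the filter arguments would come from clicks, a dragged rectangle, and a brushed date range rather than from literals.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Call:
    """Hypothetical call record; all field names are assumptions."""
    day: date
    origin: str          # city the call came from
    center: str          # call center that received it
    net_purchase: float  # dollars spent by the caller


def generalized_query(calls: Iterable[Call], origins: Set[str],
                      centers: Set[str], min_purchase: float) -> List[Call]:
    """Calls from the selected origins, routed to the selected centers,
    with a net purchase of at least min_purchase."""
    return [c for c in calls
            if c.origin in origins
            and c.center in centers
            and c.net_purchase >= min_purchase]


def share_from(calls: Iterable[Call], origin: str,
               start: date, end: date) -> float:
    """Percentage of calls in [start, end] (inclusive) that came from origin."""
    in_range = [c for c in calls if start <= c.day <= end]
    if not in_range:
        return 0.0
    return 100.0 * sum(c.origin == origin for c in in_range) / len(in_range)


# Made-up sample data for illustration only.
calls = [
    Call(date(2011, 3, 5), "Berkeley", "Oakland", 450.0),
    Call(date(2011, 3, 8), "Berkeley", "Oakland", 320.0),
    Call(date(2011, 3, 10), "Oakland", "Fremont", 150.0),
    Call(date(2011, 3, 15), "San Jose", "Fremont", 600.0),
    Call(date(2011, 3, 20), "Berkeley", "Fremont", 90.0),
    Call(date(2011, 3, 27), "Oakland", "Oakland", 800.0),
]

# Clicked centers, dragged region, and legend threshold as plain arguments.
selected = generalized_query(calls, {"Berkeley", "Oakland"},
                             {"Oakland", "Fremont"}, 300.0)

# Brushed date range: March 7 to March 26.
berkeley_share = share_from(calls, "Berkeley",
                            date(2011, 3, 7), date(2011, 3, 26))
```

With this sample data, three calls satisfy the generalized query, and two of the four calls in the brushed range come from Berkeley.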
In this project we combine GIS, real-time analytics, and brushing and linking with more math-heavy analytics (for example, after selecting the visual query, the analyst can fit a generalized linear model to the data and see the rate of call arrivals on the graph).
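As one possible reading of "fit a generalized linear model", the sketch below fits a Poisson regression with a log link, log(rate) = b0 + b1 * day, to daily call counts using Newton's method. The model form, the single day-index covariate, the function name, and the counts are all assumptions made for illustration; a real analysis would more likely use a statistics library than hand-written iterations.

```python
import math
from typing import Sequence, Tuple


def fit_poisson_glm(x: Sequence[float], y: Sequence[float],
                    iterations: int = 25) -> Tuple[float, float]:
    """Fit log(mu) = b0 + b1 * x by maximizing the Poisson log-likelihood
    with Newton's method (equivalent to Fisher scoring for the log link)."""
    b0 = math.log(sum(y) / len(y))  # start at the overall mean rate
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * d = g for the 2x2 system.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


# Made-up daily call counts for the calls returned by a visual query.
days = [0, 1, 2, 3, 4, 5]
counts = [2, 3, 4, 6, 9, 13]

b0, b1 = fit_poisson_glm(days, counts)
# Fitted arrival rate (calls per day) that could be drawn over the graph.
fitted_rate = [math.exp(b0 + b1 * d) for d in days]
```

The `fitted_rate` values are what the visualization would overlay on the observed counts; a positive `b1` indicates that the arrival rate is growing over the selected period.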
- Installed OpenCV (not VS2010 Distribution) from this page
Presentation file File:Siamak-presentation.pdf
Jvoytek - Apr 04, 2011 03:25:40 pm
I wonder if people will be able to accurately interpret these time series. How difficult will it be to tweak the interface to show a correlation? Is there a way to show the autocorrelation data as well as relying on the person to identify the patterns in the data?
Brandon Liu - Apr 04, 2011 03:42:05 pm
It would be nice to also have an interface to explore how strong an autocorrelation is once one is identified using this method.
Julian Limon - Apr 04, 2011 07:47:05 pm
Feature selection was mentioned as one possible usage of this visualization. It isn't clear to me how these visualizations could help determine the most useful features in a dataset, but it sounds like a great idea.
Michael Cohen - Apr 04, 2011 11:54:14 pm
Of the possible data domains you discussed, I think the bus example would be the most compelling for the kind of interface you're imagining. The Kaplan (or Kaplan-like) data probably doesn't have a strong enough spatial component for the 2D navigation to be very meaningful, unless there are a very large number of test centers clustered close together. An autocorrelation matrix would probably serve just as well, since the precise spatial relationships between the centers aren't that important. The video example was interesting, but it would be pretty challenging to develop a meaningful autocorrelation measure for that within the next month or so. The bus data I think strikes a good balance in that it's very spatial but also simple to compute (and view) correlations.
Sally Ahn - Apr 05, 2011 12:03:57 am
I liked the idea of a Time Window. I think it might be nice to provide redundant time sliders for all the windows below the main window slider so that they are all scaled the same. It's possible that users might prefer to compare the windows along the same time scale.
Karl He - Apr 05, 2011 03:30:29 am
I agree the time window idea is very cool. I have a slight issue with your bus example, however: it may be better to have a separate progress-line at the bottom with the 2 buses side-by-side rather than imposing one of the buses where it shouldn't be on the map. I like the idea as a whole, however.
Matthew Can - Apr 05, 2011 02:54:05 am
It seems like there are several domains and different data types to which you would like to apply your approach. I would focus on a subset of these just to make the problem statement more concrete and easier to tackle. I really liked the Time Window interactive technique for visualizing correlations. It reminded me of the Zoetrope project. http://www.cond.org/zoetrope.html
Michael Hsueh - Apr 05, 2011 07:40:47 pm
I like the time windows idea, especially applied to videos. Such a tool would definitely help in analyzing traffic situations or even the amount of foot traffic at a busy intersection. In some ways, it's a sort of tightly integrated small multiples display. Following that thought, one might think about small multiples as a way to do autocorrelations, except with streaming / video data.
Saung Li - Apr 05, 2011 08:49:54 pm
The idea of the Time Window is super cool; you give a lot of neat examples. How can this be used with the Kaplan data, though? It seems like that data is mostly numerical, so are you thinking of using something else besides the Time Window? Maybe data with streaming or video might fit better with this.
Manas Mittal - Apr 05, 2011 09:12:33 pm
I thought the idea of visually enabling auto-correlation would be useful. Will the users always be able to choose the window size (the window size used for autocorrelation) in a time series dataset (i.e., the window size to autocorrelate) and will the results be shown in real-time? I think that would permit a novel form of exploration, and would be extremely useful.
David Wong - Apr 06, 2011 01:14:13 am
To help better picture how the system will work, can you elaborate a bit more on the features and benefits of the system, especially in terms of how you will apply it to the Kaplan problem? As autocorrelation is used to find cross-correlations of a signal with itself, you can also think of other ways of visualizing the signal, e.g., the frequency domain.
Dan - Apr 06, 2011 12:20:46 pm
Those visualizations of the maps with brushing and linking look great. If you can produce visualizations like that I think you are on to something big. Just wondering which domain is the best for you to apply it to. The time window is a great application here, and you should try to focus on a good use case for it.
Michael Porath - Apr 06, 2011 03:05:17 pm
Visualizations of time series analysis usually need to be explored (consulting the ACF, frequency functions, etc.). It seems, however, that your visualization of time series would allow for exploration of the data, which would be novel. I'm curious to see how that will work out!