From CS294-10 Visualization Sp11
Domain and Storyboards
We are interested in the data on U.S. National Universities provided by USNews. Information such as rank, school location, tuition, enrollment, acceptance rate, retention rate, graduation rate, gender ratios, and student-faculty ratio is provided.
We scraped the site for this data, which we display here in Excel format: TopCollegesData.zip
We want students to be able to filter colleges by criteria they specify and to compare the colleges that satisfy those constraints. We will display the campuses as circles on a map overlay so that users have a good sense of where the colleges they have in mind are located.
The college data has 9 interesting columns, excluding campus location and combining the male/female percentages into one column. We want users to be able to select a column and have that data for each campus superimposed on the map. We want to use a monochromatic color scale to represent differences in the quantitative data so that the user can make comparisons. We could also use circle size to double-encode this data, since telling two similar shades apart may be difficult.
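The double encoding described above can be sketched as follows. This is a minimal illustration in Python rather than the Protovis code used in the project; the function names, the light and dark blue endpoints, and the radius range are all assumptions chosen for the example.

```python
def lerp(a, b, t):
    """Linearly interpolate between a and b for t in [0, 1]."""
    return a + (b - a) * t


def encode(value, lo, hi, light=(222, 235, 247), dark=(8, 48, 107),
           r_min=3.0, r_max=10.0):
    """Map a value in [lo, hi] to a (hex color, radius) pair.

    The colors and radii are illustrative defaults, not the project's values.
    """
    # Normalize to [0, 1]; guard against a column where every value is equal.
    t = 0.0 if hi == lo else (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))
    # One hue, varying only in lightness: interpolate each RGB channel.
    rgb = tuple(round(lerp(l, d, t)) for l, d in zip(light, dark))
    color = "#{:02x}{:02x}{:02x}".format(*rgb)
    # The same normalized value also drives the circle radius.
    radius = lerp(r_min, r_max, t)
    return color, radius
```

Because both channels are driven by the same normalized value, a campus that is hard to distinguish by shade alone can still be compared by size.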
Filter Within Criteria
A student using the visualization should be able to narrow down the selection based on what is acceptable to him/her. We intend to provide sliders that let a user hide campuses that do not fit their criteria. For example, the Tuition slider should let the student specify that he/she only wants to see campuses that cost between $5000 and $8000. The user will be able to see all the sliders and selected buttons at once so that he/she is aware of which criteria are currently selected.
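The slider behavior can be pictured as a set of inclusive ranges that every visible campus must satisfy. The sketch below is a hedged Python illustration, not the project's Protovis code; the field names and the three sample rows are made up for the example.

```python
def passes(college, ranges):
    """Return True if every filtered field lies within its [low, high] range."""
    return all(low <= college[field] <= high
               for field, (low, high) in ranges.items())


def visible_colleges(colleges, ranges):
    """Keep only the colleges that satisfy all active slider ranges."""
    return [c for c in colleges if passes(c, ranges)]


# Made-up rows for illustration only; these are not values from the dataset.
colleges = [
    {"name": "College A", "tuition": 6500, "acceptance_rate": 40},
    {"name": "College B", "tuition": 30000, "acceptance_rate": 10},
    {"name": "College C", "tuition": 7900, "acceptance_rate": 75},
]

# One entry per slider: field name -> (low, high).
sliders = {"tuition": (5000, 8000), "acceptance_rate": (0, 50)}
shown = visible_colleges(colleges, sliders)
```

With these sample ranges only College A remains visible: College B fails the tuition range and College C fails the acceptance-rate range.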
Hovering over a point on the map will show that school's various features. This lets users learn more about the schools that caught their attention in the comparisons above. Hovering over a point will also highlight it so that users can see more clearly what they have selected.
To help users narrow down colleges, the map can be zoomed with the mousewheel and moved by clicking and dragging. This is especially useful for tightly clustered areas, and for users who already have a specific location in mind.
We scraped the USNews site and formatted the data using Python, and then converted it to JSON. The program is coded using Protovis.
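As a rough picture of the last step, the cleaned rows can be serialized with Python's standard json module. The record layout and the sample values below are assumptions for illustration; the project's actual field names are not given here.

```python
import json


def to_json(rows):
    """Serialize a list of per-campus dictionaries to a JSON string."""
    return json.dumps(rows, indent=2)


# Hypothetical record; the keys and values are placeholders, not real data.
rows = [
    {"name": "Example University", "rank": 1, "lat": 40.0, "lon": -75.0,
     "tuition": 38000.0, "public": False},
]
payload = to_json(rows)
```

The resulting string can be written to a file and loaded by the page that renders the visualization.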
This program works best on Google Chrome. It is slower on Firefox and doesn't work on Internet Explorer.
In the final program, the user sees a map view of the colleges and can select from a list of criteria on the right. The dots are colored on a gradient scale to show their value for the selected criterion relative to other colleges. The user can filter out colleges by ticking the private/public checkboxes and by dragging the sliders to select ranges of values. Hovering over a point will highlight that college and show its information. The user can zoom in by scrolling the mousewheel and can click and drag to move the map around.
Changes between Storyboard and Final Implementation
We were able to implement most of the fundamental ideas we had at the start. Originally we wanted to show the full range of values for each category on the right, but the rest of the visualization was already taking up a lot of space, so adding these values would have made it too cluttered. We would still like to let users enter values directly instead of having to use the sliders, and to show the units in the gradient scale.
Saung scraped the data and formatted it. He added the zooming and panning features and the information shown when hovering over points. Karl did most of the filtering and colorization controls.
We spent 15-20 hours for the whole project.
Surprisingly, one of the most time-consuming parts of the project was preprocessing the data. Many of the fields were formatted in a way that was hard to code around. Gathering the geolocation coordinates for each campus took some time as well. The coordinates were necessary to plot the points on our visualization but were not present in the original dataset. Working with Google's API caused some headaches, especially due to issues we ran into with XSS protection and rate-limiting.
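Two of these preprocessing chores can be sketched in Python. The first function pulls a number out of a formatted string such as a dollar amount or a percentage; the second looks up each campus once and pauses between requests to stay under a rate limit. Both are hypothetical illustrations: the exact field formats are assumed, and `lookup` stands in for whatever geocoding call was actually used, which is not shown in this write-up.

```python
import re
import time

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_number(text):
    """Extract the first number from strings like '$38,000' or '7.2%'.

    Returns None when the field is missing or contains no digits.
    """
    if text is None:
        return None
    match = _NUMBER.search(text.replace(",", ""))
    return float(match.group()) if match else None


def geocode_all(names, lookup, delay_seconds=1.0):
    """Geocode each distinct campus name once, sleeping between requests.

    `lookup` is a caller-supplied function (an assumption here) that maps
    a campus name to a (latitude, longitude) pair.
    """
    coords = {}
    for name in names:
        if name in coords:
            continue  # already resolved; do not spend another request
        if coords:
            time.sleep(delay_seconds)  # simple throttle between requests
        coords[name] = lookup(name)
    return coords
```

Caching results by name means a rerun of the script, or a duplicated campus name, does not cost extra requests against the rate limit.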
With the geolocation data in place, we plotted the points and added the tooltips for each campus, then proceeded with further features.
Some plugins, such as the tooltip, misbehave at times, especially when combined with other features like zooming. We did not see a good way to solve this short of editing the plugin itself.
We had some difficulty developing the filtering mechanism. The dataset given to Protovis had to be refiltered each time the filtering parameters changed. As a result, our visualization uses a lot of processing power while the filtering sliders are being dragged. There does not appear to be a way to filter the dataset Protovis uses without redeclaring the data it should use. With a large number of points, this approach would likely become too slow to be usable.
A demo of our visualization is available at this address: Project Demo
A copy of the project files can be found here: File:Proj3-SaungLiKarlHe.zip
To run it, simply open visual.html in a modern web browser. From our development experience, Chrome should work best but Firefox should be fine as well.