A3-NateAgrinKen-ichiUeda

From CS294-10 Visualization Fa07

Dynamic Query Interface to a Flickr Field Guide

Live Demo. It's pretty slow, so please bear with it. The exact URL may change, so please use this link.

Description

Open content-sharing forums like Flickr have fostered large and diverse communities. Naturalists, in particular, have taken to Flickr, forming groups with explicit posting rules that require members to accompany their photos with predetermined kinds of metadata in the form of tags. These groups have generated a large corpus of data that might easily be mined for research purposes. However, in its current form, the data is only accessible via Flickr's built-in search tools. We propose building a visualization that uses dynamic query filters, following Shneiderman's work on the HomeFinder visualization [1]. Like HomeFinder, we will take advantage of any available geolocation data associated with each photo to plot observations on an interactive map, and provide sliders or other selectors for the remaining metadata, allowing users to dynamically query and visualize specific information. We will focus on the Wildflower Field Guide, North America group (http://flickr.com/groups/wildflowers/), which requires tags for common name, scientific name, color, shape, and location. Most photos also include information derived from their EXIF tags, so we will be able to include temporal query functionality as well.
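
To make the idea of a dynamic query concrete, here is a minimal sketch in Ruby of filtering observations by map bounds, color, and month. It is an illustration only, not the project's code: the `Observation` struct, its field names, the `dynamic_query` function, and the sample data are all assumptions made up for this example.

```ruby
require "date"

# Hypothetical record for one photo; the field names are assumptions.
Observation = Struct.new(:name, :lat, :lng, :color, :taken_on)

# True when the observation lies inside the map bounds (or no bounds are set).
def in_bounds?(obs, bounds)
  return true if bounds.nil?
  obs.lat.between?(bounds[:south], bounds[:north]) &&
    obs.lng.between?(bounds[:west], bounds[:east])
end

# Returns the observations that satisfy every active filter.
# A filter left as nil is treated as "no constraint".
def dynamic_query(observations, bounds: nil, color: nil, months: nil)
  observations.select do |obs|
    in_bounds?(obs, bounds) &&
      (color.nil? || obs.color == color) &&
      (months.nil? || months.cover?(obs.taken_on.month))
  end
end

# Made-up sample data for illustration.
observations = [
  Observation.new("California poppy", 37.8, -122.4, "orange", Date.new(2007, 4, 12)),
  Observation.new("Douglas iris", 37.9, -122.6, "purple", Date.new(2007, 3, 30)),
  Observation.new("Fireweed", 47.6, -122.3, "pink", Date.new(2007, 7, 4)),
]

bay_area = { south: 37.0, north: 38.5, west: -123.0, east: -121.5 }
spring_bay_area = dynamic_query(observations, bounds: bay_area, months: 3..5)
orange_only = dynamic_query(observations, color: "orange")
```

Each slider or selector in the interface would correspond to one keyword argument here, and every change to a widget would simply rerun the query with the new filter state.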

Wireframe & Storyboards

Wireframe

Image:a3-flickr_dynamic_query-interactions.jpg

Search

Image:a3-flickr_dynamic_query-search.jpg

Taxon Sorting

Image:a3-flickr_dynamic_query-taxon_sort.jpg

Timeline

This is probably the trickiest part to code, so we may not include it.

Image:a3-flickr_dynamic_query-timeline.jpg

Implementation

Technology

We chose to implement the interface using conventional web development technologies, namely the Ruby on Rails framework for backend data management, HTML for markup and layout, and JavaScript/AJAX for interface programming and interaction between the client browser application and the server storing the data.

Data

We ultimately chose to create an archive of all the photos and metadata from the Wildflowers, North America Flickr group. We were able to implement a version of the interface that operated on the live group photo pool, but it was prohibitively slow. Likewise, for the taxonomic data we opted for a local store of taxonomic names rather than constantly querying a web service (in our case, the Universal Biological Indexer and Organizer, or uBio) to check the taxonomic validity of certain photo tags. We ended up using a local copy of the USDA Integrated Taxonomic Information System (ITIS) database for taxonomic name queries, and opted not to use uBio's service at all for performance reasons.

Problems

Implementing what we thought would be a modest interface turned out to be very troublesome. We had assumed at least one of the many Flickr API wrappers for Ruby would meet our needs, but all of them were missing key features, like support for groups or tag extraction. In the end we chose one (Net::Flickr) and added the missing functionality ourselves.

We also spent a great deal of time writing our own wrapper to the uBio API, in an attempt to perform dynamic taxonomic querying, only to abandon it for performance reasons. uBio provides some great functionality at reasonable speed, but not all methods return full objects, so getting all the data about a given taxon often takes multiple calls. For instance, searching on a common name would return multiple common name objects, but getting the full scientific name object would require a second call to a different method. These kinds of limitations led us to the ITIS database, which had its own performance problems, being rather massive and relatively slow to search. Our strategy of extracting taxonomic data from the photo tags also proved slow, because the only indication that a given tag was worth investigating was whether it had two words, with the first capitalized (e.g. "Homo sapiens"). Since many tags matched these criteria, extracting taxonomic entities from a given photo's tags required several searches over a very large database.
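
The two-word heuristic described above can be sketched in a few lines of Ruby. This is an illustration rather than the project's actual code: the constant and function names are hypothetical, and the stricter pattern (requiring a lowercase second word) is an assumption added here to show one way the candidate set might be narrowed before hitting the database.

```ruby
# Heuristic as described in the text: exactly two words, the first capitalized.
LOOSE_BINOMIAL = /\A[A-Z]\S*\s+\S+\z/

# Stricter variant (an assumption, not the project's test): the second word
# must be all lowercase, as in conventionally written binomial names.
STRICT_BINOMIAL = /\A[A-Z][a-z]+\s+[a-z]+\z/

# Returns the tags worth looking up in the taxonomic database.
def candidate_names(tags, pattern = LOOSE_BINOMIAL)
  tags.map(&:strip).select { |tag| pattern.match?(tag) }.uniq
end

# Made-up sample tags for illustration.
tags = ["Homo sapiens", "yellow", "Golden Gate", "Eschscholzia californica", "san francisco"]

loose = candidate_names(tags)
strict = candidate_names(tags, STRICT_BINOMIAL)
```

With these sample tags, the loose pattern also lets through "Golden Gate", which is the kind of false positive that forces an unnecessary database search; the stricter pattern rejects it, at the risk of missing names that taggers capitalized unconventionally.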

Query building using Rails' implementation of ActiveRecord proved to be fairly difficult. There was no easy way to iteratively build SQL queries based on logical conditions. We worked around this limitation by using some built-in methods for querying a tag space together with a single conditional WHERE clause. This method, however, does not scale well and would easily break down if more states of the visualization needed to be relayed to the logic that ultimately queries the database. Building an iterative query-building interface for ActiveRecord would be a major improvement to Rails and a help to the Rails community.
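
One way such iterative query building might look is sketched below in plain Ruby, with no Rails dependency. It is not what we implemented: the `ConditionBuilder` class, the filter keys, and the column names are all hypothetical. The sketch accumulates one SQL fragment per active filter and produces a conditions array of the form `["sql with ? placeholders", value, value, ...]`.

```ruby
# Hypothetical helper: accumulates SQL fragments and their bind values.
class ConditionBuilder
  def initialize
    @fragments = []
    @binds = []
  end

  # Adds a fragment only when every value is present, so that filters
  # the user has not set are skipped. Returns self to allow chaining.
  def add(fragment, *values)
    return self if values.empty? || values.any?(&:nil?)
    @fragments << "(#{fragment})"
    @binds.concat(values)
    self
  end

  # Returns ["frag AND frag", bind, bind, ...], or nil when no filter is active.
  def to_conditions
    return nil if @fragments.empty?
    [@fragments.join(" AND "), *@binds]
  end
end

# Filter state as it might arrive from the interface; the keys and
# column names are assumptions for this example.
filters = { color: "yellow", shape: nil, min_lat: 37.0, max_lat: 38.5 }

conditions = ConditionBuilder.new
  .add("color = ?", filters[:color])
  .add("shape = ?", filters[:shape])
  .add("latitude BETWEEN ? AND ?", filters[:min_lat], filters[:max_lat])
  .to_conditions
```

Here the unset shape filter is dropped, leaving a two-fragment clause with three bind values. An array in this shape is, as far as we understand it, what ActiveRecord's `:conditions` option accepted at the time, so each new widget would only need one more `add` call rather than another branch in a hand-written WHERE clause.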

In the end, we were only able to implement a limited subset of our original design specification. Geographic, color, shape, and taxonomic filtering work, but we were unable to implement taxonomic search, a dynamic taxa list that self-updates in response to the other filter widgets, or timeline scrubbing. Clearly the number of dimensions we chose was too large to produce something of significant quality, and we would have been better served limiting the scope of our interface.

Successes

While we did not manage to create our ideal visualization (nor one that is optimized for widespread use), our current work shows some promising directions. Using a simple tag-based infrastructure, we were able to successfully mine a large corpus for metadata and quickly pivot on various elements of that metadata during querying. Patterns of usage quickly emerged from the data, showing pockets of activity in certain areas (Seattle, the Bay Area, national parks, etc.), and the display of the flowers' images provided a means for further inquiry. It seemed that with only a few inputs that the average naturalist could know, such as the approximate location where they observed a flower and the flower's color, an amateur would be able to successfully identify the taxonomic name of a given species.

Surprises

I found that when querying the interface I was most drawn to the map, and the spatial layout of the markers for each photo, as opposed to the photos themselves. Although proper usability testing would be required to definitively determine the most appropriate interface, this heuristic insight might suggest that we make the map more prominent in future revisions. The appeal of the map may be that it relays information even the casual naturalist might understand (place) and shows patterns of activity, such as groups of images in an odd location, possibly indicating the contributions of a single user.

Future Work and Implications

We would like to continue building this interface, adding the selectors missing from our current implementation and improving the overall performance of the data querying. Because the initial interaction with this visualization seems compelling, it would be useful to port this querying interface to other biodiversity data sets. Furthermore, we hypothesize that these sorts of interactive information exploration tools encourage greater contribution to the underlying data set, and we would eventually like to explore methods that allow users to annotate the visualized data, making it more accessible.

Requirements for Achieving Original Designs

1. Visual Information Seeking: Tight Coupling of Dynamic Query Filters with Starfield Displays. Christopher Ahlberg and Ben Shneiderman.


