From CS294-10 Visualization Sp11
- Sally Ahn
Visualizing Aggregate Image Edits
Visualization techniques often focus on transforming numerical or text data into a single image that conveys the story of millions of numbers or words. Sometimes, however, the data are images to begin with. This is the case in my current research in computational aesthetics of photographs. One way to gain insight into general aesthetic preferences for photo composition is to collect images that represent the individual preferences of many users. The data domain then consists of many images (50-100), and aggregating them in a meaningful way can be challenging, especially if some of the images are unreliable and should be discarded. A good visualization tool can aid in analyzing such data by allowing the user to interactively hide or view specific subsets of images. The relevant subsets can be specified on the image itself, through image-based filtering. For example, brushing and linking techniques could enable the user to select a section of the original image that should (or should not) be present in the cropped images from the dataset. The filtered set of images could then be presented, along with its statistical data, in a way that highlights visual patterns or similarities and so helps in detecting aesthetic preferences. Although I am focusing on photo croppings for this project, I think this visualization tool can be extended to analyze other aspects of image editing, such as color and contrast.
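The image-based filtering described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: each crop is assumed to be an axis-aligned rectangle stored as (left, top, right, bottom) in pixel coordinates of the original image, and the names `contains`, `overlaps`, and `filter_crops` are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: (left, top, right, bottom) in original-image
# pixels, with right and bottom exclusive.
Rect = Tuple[int, int, int, int]


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely inside `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def filter_crops(crops: List[Rect], region: Rect,
                 must_include: bool = True) -> List[Rect]:
    """Keep crops that fully contain the brushed region, or, when
    must_include is False, crops that do not touch it at all."""
    if must_include:
        return [c for c in crops if contains(c, region)]
    return [c for c in crops if not overlaps(c, region)]


# Hypothetical data: three user crops and one brushed region.
crops = [(0, 0, 50, 50), (20, 20, 100, 100), (60, 60, 90, 90)]
brushed = (30, 30, 40, 40)

included = filter_crops(crops, brushed, must_include=True)
excluded = filter_crops(crops, brushed, must_include=False)
print(included)  # [(0, 0, 50, 50), (20, 20, 100, 100)]
print(excluded)  # [(60, 60, 90, 90)]
```

Treating "should be present" as full containment and "should not be present" as no overlap is one design choice; a partial-overlap threshold would be a reasonable alternative.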
Initial Problem Presentation
- Source and Executable: File:Source-SallyAhn.zip Live demo
- Paper: File:FinalPaper-SallyAhn.pdf
- Poster: File:FinalPoster-SallyAhn.pdf
Jvoytek - Apr 04, 2011 03:40:36 pm
This is a very interesting and difficult problem. The challenge seems to be to make an intuitive link between the crop data and the statistical measures.
Brandon Liu - Apr 04, 2011 03:42:58 pm
Q: I really like the idea of showing a density map for many users' crops. How will you handle validating input on Mechanical Turk? It seems like you run the risk of getting a lot of bad data from Mechanical Turk; one strategy is to give Turk users a "training task". What training task would you use that has a known optimal crop? Also, what about giving users a picture with more than one area of interest and then asking them to pick out individual items?
A: Use a verify step on MTurk to make sure crops are valid.
Julian Limon - Apr 04, 2011 05:48:57 pm
This looks like a very interesting challenge. Since you already plan to use Mechanical Turk, I believe you could take advantage of the task and ask people more detailed questions. For example, you could have a question that says "Which picture is more pleasant?", another question that says "Which picture do you prefer?" or "Which picture is more interesting?". Taste is very subjective, so if you're able to determine a few different characteristics of people's preferences, the results could be even more interesting.
I also like the idea of using pixel voting to choose the best image. However, I wonder if this could lead to non-conventional crops such as diagonal lines.
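As a rough illustration of the pixel voting mentioned in this thread: each crop casts one vote for every pixel it covers, and the resulting grid of counts is the density map. The sketch below assumes rectangular crops stored as (left, top, right, bottom) with exclusive right and bottom edges; `vote_map` and the sample data are hypothetical, not taken from the project.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom)


def vote_map(width: int, height: int, crops: List[Rect]) -> List[List[int]]:
    """Count, for every pixel of the original image, how many crops cover it."""
    votes = [[0] * width for _ in range(height)]
    for left, top, right, bottom in crops:
        # Clamp each crop to the image bounds before voting.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                votes[y][x] += 1
    return votes


# Hypothetical 4x3 image with three overlapping crops.
crops = [(0, 0, 2, 2), (1, 1, 4, 3), (1, 0, 3, 2)]
votes = vote_map(4, 3, crops)
for row in votes:
    print(row)
# [1, 2, 1, 0]
# [1, 3, 2, 1]
# [0, 1, 1, 1]
```

Dividing each count by the number of crops gives a value in [0, 1] that can drive a color or opacity encoding.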
Michael Cohen - Apr 04, 2011 11:59:39 pm
In addition to the options you mentioned, it might be interesting to look at color as a way of showing the aggregated croppings. You could vary along the saturation dimension so that rarely selected pixels are grayscale and frequently selected ones are in their usual colors. Of course, there would be a lot of crosstalk with the color information present in the image -- for instance, an image that has dull colors to begin with won't give you very much room to show intermediate degrees of selection. It might be more effective to grayscale the whole image, then add saturation of a single color to represent the level of selection; I believe Photoshop does something like this for masks that support partial selection of pixels.
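Both encodings suggested in this comment can be sketched per pixel. The code below is an illustration under assumptions, not a description of how Photoshop or the project does it: colors are RGB triples of floats in [0, 1], `weight` is the fraction of crops that selected the pixel, and the function names are hypothetical. The Rec. 601 luma weights are one common choice for the grayscale conversion.

```python
import colorsys
from typing import Tuple

RGB = Tuple[float, float, float]  # assumed: components in [0, 1]


def luminance(rgb: RGB) -> float:
    """Grayscale value using Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def desaturate_by_votes(rgb: RGB, weight: float) -> RGB:
    """First suggestion: weight 0 gives pure gray, weight 1 the original color."""
    gray = luminance(rgb)
    r, g, b = rgb
    return (gray + weight * (r - gray),
            gray + weight * (g - gray),
            gray + weight * (b - gray))


def tint_by_votes(rgb: RGB, weight: float, hue: float = 0.0) -> RGB:
    """Second suggestion: grayscale the pixel, then use the vote weight as
    the saturation of a single hue (0.0 is red in colorsys)."""
    gray = luminance(rgb)
    return colorsys.hsv_to_rgb(hue, weight, gray)


pixel = (0.2, 0.6, 0.4)
print(desaturate_by_votes(pixel, 0.0))  # gray triple
print(desaturate_by_votes(pixel, 1.0))  # original color, up to rounding
print(tint_by_votes(pixel, 0.5))        # half-saturated red tint
```

The tint variant keeps the hue channel free of the photo's own colors, which avoids the crosstalk noted above, at the cost of hiding the original colors entirely.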
Siamak Faridani - Apr 05, 2011 01:18:55 am
This looks like an interesting contribution both from the visualization perspective and from the human computation perspective. I have one comment about some related work. My adviser, Ken Goldberg, and his former PhD student Dez Song have worked on a number of interesting spatial voting models, and these might be helpful for your work. For example, see the following:
Unsupervised Scoring for Internet-based Collaborative Tele-operation, Ken Goldberg, Dezhen Song, In Yong Song, Jane McGonigal, Wei Zheng, and Dana Plautz, IEEE International Conference on Robotics and Automation (ICRA), April 2004
Matthew Can - Apr 05, 2011 01:55:19 pm
You defined the problem well and seem to have a solid idea of the tools and visualizations that you plan to build. In addition to the density map, you might want to create some dynamic query tools to search through all of the cropped images. For example, if I select a portion of the original image, the system returns all cropped images that include that portion.
Michael Hsueh - Apr 05, 2011 08:23:25 pm
This tool is not only practically useful, but potentially revealing of aesthetic principles as well. It definitely touches on the subject of identifying regions / objects of interest from an image, as well as the aesthetic challenge of effectively cropping photographs. I'm wondering how an "optimal" result crop can be determined once pixel votes are tallied, given that the candidate pixels that exceed the culling threshold form a non-rectangular region, or are disjoint, etc.
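One simple answer to the question of turning a non-rectangular or disjoint set of winning pixels into a crop is to take the bounding box of every pixel whose vote count meets the threshold. This is only one possible rule, offered as a sketch and not as the project's method; `bounding_crop` and the sample vote grid are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom), exclusive


def bounding_crop(votes: List[List[int]], threshold: int) -> Optional[Rect]:
    """Smallest rectangle enclosing all pixels with at least `threshold`
    votes, or None if no pixel qualifies."""
    xs, ys = [], []
    for y, row in enumerate(votes):
        for x, count in enumerate(row):
            if count >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Hypothetical vote counts for a 4x3 image. At threshold 2 the surviving
# pixels form an L shape, which is not itself a valid crop.
votes = [
    [1, 2, 1, 0],
    [1, 3, 2, 1],
    [0, 1, 1, 1],
]
print(bounding_crop(votes, 2))  # (1, 0, 3, 2)
print(bounding_crop(votes, 3))  # (1, 1, 2, 2)
print(bounding_crop(votes, 4))  # None
```

The bounding box always yields a rectangle, but it can pull in pixels below the threshold and it merges disjoint regions into one crop, so it does not settle the question for multi-subject photos.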
Manas Mittal - Apr 05, 2011 09:16:58 pm
This made me think mainly of how you'd encode these multiple results (i.e., the sum of pixel counts). Would you use alpha shading? I'm wondering if you might want to show the different individual selections (say, the most varied ones) in a small-multiples view. It would be important to consider the outliers; they might be more artistic than the 'average'.
Also, I'm curious to know what, if any, evaluation metric can be described. I think you could perhaps run a set of perception experiments here (i.e., in your images, use perception-related material) and track how much time users took to answer your questions (average time?).
Saung Li - Apr 05, 2011 08:58:58 pm
This is a very interesting topic; I really like the idea of pixel voting, and the density map displaying the most popular areas. How will you categorize aesthetic preferences, though? Are you looking for things like what objects and colors people prefer to have in their cropped images? I see you mention preferences like corner locations, which are interesting ones to explore. Perhaps there are more that can be looked at.
David Wong - Apr 06, 2011 01:11:13 am
I was wondering what exact features you are looking to pull out from each image cropping (e.g., corners, actual pixel locations with pixel voting, or maybe features of the objects). I like how you mentioned that the categories of certain pictures and the features within those pictures can psychologically inform the user's cropping. Perhaps you can use some psychology theory to inform your selection of features when you aggregate different photo crops. Also, what domain are you planning to look at first?
Karl He - Apr 06, 2011 01:17:20 am
Pixel voting seems like a good way to consolidate user input. I'm not entirely sure about relying on what people think is a good crop, however; as a case in point, the majority of family photos have really bad image composition.
Dan - Apr 06, 2011 12:20:48 pm
The domain of image cropping is very specific, and I think you definitely have well-defined challenges and user interface concepts. There could be some interesting findings when you let other people choose croppings through crowdsourcing.
Michael Porath - Apr 06, 2011 03:08:24 pm
This is a very compelling proposition. Cropping pictures requires some knowledge of what a picture should look like. I'm curious about the results you'll get with the Mechanical Turk process. A comparison against the rule of thirds would be a nice way to show whether this actually works better than the rule of thirds. Other than that, what I could see working is a histogram type of visualization, horizontally and vertically, that shows the cropping areas. An inspiration could be how Microsoft's Photosynth shows overlays of images of the same subjects (e.g., http://photosynth.net/view.aspx?cid=9955c093-6276-44b6-b94a-121c53ae27b6)
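The horizontal and vertical histograms suggested here could be computed as marginal coverage counts: for each image column and each image row, the number of crops that span it. The sketch below assumes rectangular crops stored as (left, top, right, bottom) with exclusive right and bottom edges; `crop_histograms` and the sample data are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom)


def crop_histograms(width: int, height: int,
                    crops: List[Rect]) -> Tuple[List[int], List[int]]:
    """Return (per-column counts, per-row counts) of how many crops
    cover each column and each row of the original image."""
    cols = [0] * width
    rows = [0] * height
    for left, top, right, bottom in crops:
        for x in range(max(left, 0), min(right, width)):
            cols[x] += 1
        for y in range(max(top, 0), min(bottom, height)):
            rows[y] += 1
    return cols, rows


# Hypothetical 4x3 image with three overlapping crops.
crops = [(0, 0, 2, 2), (1, 1, 4, 3), (1, 0, 3, 2)]
cols, rows = crop_histograms(4, 3, crops)
print(cols)  # [1, 3, 2, 1]
print(rows)  # [2, 3, 1]
```

Drawn along the top and left edges of the photo, the two lists show where crop boundaries cluster without covering the image itself.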