Group Members

  • Dhawal Mujumdar


Web analytics is the practice of collecting and reporting data on Internet usage. It is used for multiple purposes, such as measuring traffic, user research, marketing success, and advertising spend. Web analytics has matured over the last decade and remains a cornerstone of the continuously evolving World Wide Web. Given the amount of data collected, every web analytics solution (e.g., Google Analytics) provides a number of standard data visualizations, such as geo-location visualizations and traffic summary charts, and these have remained the same for over a decade. A lot of research has been done on which metrics should be collected during web data mining, but no attention has been given to rich media elements like images and videos. In this project, I would like to create web analytics visualizations for rich media such as images.

File:Dhawal Mujumdar Project Proposal.pdf File:PivotAppCode.zip Report:[1]

Dan - Apr 06, 2011 03:13:12 pm

Visualizing web analytics is a very saturated area; however, you seem to have found a sector within that field that has really been lacking. You are definitely tackling a good problem here. Choosing images instead of captcha is interesting; how does this data get used in a visualization, though? Won't most humans select the correct images? What is the question for that particular set of data?

Jvoytek - Apr 06, 2011 03:07:58 pm

What questions would you want to answer with this visualization? In the context of your thesis project, visual mobile CAPTCHA, would statistical analysis of the number of images clicked give useful feedback?

Siamak Faridani - Apr 06, 2011 03:18:45 pm

Having played with Google Analytics, I completely understand why this problem is important. I just didn't understand how you would like to solve it. I was hoping to see some basic prototype or a drawing of the idea.

Also, it might be helpful to clearly identify your audience and their needs. For example, web marketers have different needs than UI designers.

Great idea though, I think I will use your tool sometime soon :)

- Apr 06, 2011 03:23:33 pm

I think if you have a few high-level questions that you want to answer with the visualization about web usage on multimedia-heavy websites, this can help inform what type of data you want to visualize, i.e., quantitative or qualitative, and how exactly you'd like to construct the visualization. I think it's an interesting idea and I look forward to seeing a more concrete proposal!

Also, I found this paper, but I'm not sure if it'll be very helpful :) http://cs.sfsu.edu/techreports/06/summer/Characterization%20and%20Analysis%20of%20Usage%20Patterns.pdf

Matthew Can - Apr 06, 2011 04:52:51 pm

You explained the problem you're trying to tackle and the motivation behind it pretty well. But what I didn't get was a clear sense of what you intend to create. It would be useful to think about how someone would use your visualization. What do you want to help people do that was hard or that they couldn't do before?

Julian Limon - Apr 06, 2011 05:21:36 pm

I believe there are many interesting problems you can tackle that are specific to multimedia web analytics. For example, how does the position of the image affect its click-through? Or, what metadata characteristics are correlated with certain actions? I wonder whether size, color, brightness, or even pixel composition could have an effect on users' likelihood to click. The problem is that marketers won't know what exactly they're looking for, so your visualization will need to be very granular but also informative (i.e., provide certain correlation metrics).
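As a rough illustration of the position question above, here is a minimal sketch of one possible correlation metric: the Pearson correlation between an image's position on the page and its click-through rate. The records, field layout, and function names are hypothetical and made up purely for illustration; they do not come from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def click_through_rate(clicks, impressions):
    """Fraction of impressions that led to a click (0.0 if never shown)."""
    return clicks / impressions if impressions else 0.0


# Hypothetical records: (position on page, impressions, clicks).
records = [(1, 200, 40), (2, 200, 30), (3, 200, 20), (4, 200, 10)]
positions = [position for position, _, _ in records]
rates = [click_through_rate(clicks, shown) for _, shown, clicks in records]
r = pearson(positions, rates)
```

In this toy data the click-through rate falls steadily with position, so `r` comes out strongly negative. The same function could be applied to other numeric image attributes such as size or brightness.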

Krishna - Apr 06, 2011 08:04:53 pm

I was wondering if an eyeball metric would be useful for you: it may be a useful statistic to find out how much time users spend on individual images in your captchas (?). Naive ways to compute this could be to track mouse movements over your captchas and/or to measure how far each image of the captcha is from the center of the screen. You might have to come up with something that is analogous to click-through rates for images.
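The mouse-tracking idea above could be sketched roughly as follows. This assumes mouse positions are sampled as (timestamp in ms, x, y) tuples and that each image has a known bounding box; the sample format, box layout, and function names are all hypothetical. Each interval between two samples is credited to the image under the pointer at the start of that interval.

```python
def hit_image(x, y, boxes):
    """Return the name of the image whose box contains (x, y), or None."""
    for name, (left, top, width, height) in boxes.items():
        if left <= x < left + width and top <= y < top + height:
            return name
    return None


def dwell_times(samples, boxes):
    """Total hover time per image, in the same unit as the timestamps."""
    totals = {name: 0 for name in boxes}
    # Pair each sample with the next one to get the interval length.
    for (t0, x0, y0), (t1, _, _) in zip(samples, samples[1:]):
        name = hit_image(x0, y0, boxes)
        if name is not None:
            totals[name] += t1 - t0
    return totals


# Hypothetical layout: two 100x100 images side by side.
boxes = {"img_a": (0, 0, 100, 100), "img_b": (100, 0, 100, 100)}
# Hypothetical mouse samples: (timestamp_ms, x, y).
samples = [(0, 10, 10), (300, 50, 50), (500, 150, 20),
           (900, 150, 60), (1000, 250, 50)]
totals = dwell_times(samples, boxes)
```

Hover time is only a proxy for attention, so a metric like this would need to be validated before being treated as an eyeball measure.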

Saung Li - Apr 06, 2011 09:15:03 pm

I like your alternative look at captchas. I find it a bit tedious to type in random letters and numbers, and using images could be a nice way to test whether an input is from a human or a machine. How are you going to choose the images? I'm not too sure where you are going from here, so it would be good to define what specific problem you are trying to tackle and how you think you will do that.

Michael Porath - Apr 07, 2011 04:11:31 pm

I could see the following things to be valuable targets for analytical visualizations:

  • Again, to reiterate my statement in class, I would focus on qualitative metrics rather than numbers
  • Which classes of images do people usually pass (= often), and which ones do they struggle with (= pass rarely)? That could give some insight into which images are best to show as captchas. Organizing them by target object would make sense ("which Coca-Cola cans don't people recognize properly?")
  • When listening to your preso, I was reminded of web site heat maps that show where people clicked. Although I don't think that metric is relevant here, it could be an inspiration.
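
The pass-rate idea in the second bullet could be computed along these lines. This is a minimal sketch assuming each captcha attempt is logged as an (image class, passed) pair; the log format, class names, and function names are hypothetical.

```python
from collections import defaultdict


def pass_rates(attempts):
    """Map each image class to the fraction of attempts that passed."""
    counts = defaultdict(lambda: [0, 0])  # class -> [passes, total]
    for image_class, passed in attempts:
        counts[image_class][1] += 1
        if passed:
            counts[image_class][0] += 1
    return {cls: passes / total for cls, (passes, total) in counts.items()}


def hardest_first(rates):
    """Sort classes so those people struggle with most come first."""
    return sorted(rates.items(), key=lambda item: item[1])


# Hypothetical attempt log: (image class, whether the user passed).
attempts = [
    ("soda can", True), ("soda can", False),
    ("soda can", False), ("soda can", False),
    ("dog", True), ("dog", True), ("dog", True), ("dog", False),
]
rates = pass_rates(attempts)
ranking = hardest_first(rates)
```

A ranking like this could drive a simple sorted bar chart or an image grid ordered by difficulty.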

Michael Hsueh - Apr 07, 2011 11:49:05 pm

Interesting take on visualizing captcha performance. In addition to the graphical captchas, I've been seeing some more logical captcha-style mechanisms. For example, some forums ask the user to solve simple arithmetic word problems that are presumably generated randomly. Visualizing the challenge mechanisms might help us get a good idea of how to design these.

Sally Ahn - Apr 08, 2011 04:22:24 am

Firstly, I want to apologize for missing your presentation; I was late to class that day. However, I gather from others' comments that you are trying to visualize users' performance with image-based CAPTCHAs to see the efficacy of various images for this purpose. I think that is an interesting question, but you would have to decide how to gather your data so that what you reveal in your visualization allows for meaningful comparison between an image that succeeded and an image that failed. I'm not sure how you would do this, but perhaps you already presented an approach and I just missed it. Again, sorry about that. :(

Brandon Liu - Apr 09, 2011 10:15:33 am

Image-based CAPTCHAs have some security concerns you may want to investigate: eprint.iacr.org/2006/258

