From CS294-10 Visualization Sp11
An Exploration of Mint.com data.
This is a project about exploring the mint.com dataset.
Mint.com is a popular personal finance website. Specific information about a user's financial data (i.e., transaction-level credit card information) is downloaded and presented to the user in a clean user interface. The company downloads 10-12 million new credit card transactions a day. No individual-level or private information will be exposed.
Exploring this dataset is likely to lead to new ideas and techniques specific to this domain. Some of the ideas are presented in the PowerPoint.
About Mint.com: Mint.com is America’s #1 online personal finance service, providing over 1.5 million users a fresh, easy and intelligent way to manage their money. And it’s free. Launched in September 2007, Mint.com has quickly grown to track nearly $200 billion in transactions and $50 billion in assets and has identified more than $300 million in potential savings for its users.
Mint.com’s innovation is in applying advanced technology to deliver breakthrough ease-of-use. Using patent-pending technology and proprietary algorithms, Mint.com allows users to see all their financial accounts in one place, makes it easy to set and keep to budgets, and helps identify money saving ideas. Mint.com is so effective that more than 90 percent of users say they have changed their financial habits as a result of using the service.
Siamak Faridani - Apr 11, 2011 03:12:53 pm
I think you have a wonderful dataset. Here are some of my ideas. (1) Look at spending increases and come up with a suggested portfolio of stocks that reflects which companies have had sales increases. (2) Focus on users and allow them to see the effect of their behavior, for example a flower that grows when I have zero debt and perishes when my debt grows.
Jvoytek - Apr 11, 2011 03:12:22 pm
For the clock-like visualization, why not show the data on a horizontal plane? It seems like it would be difficult to compare the relative size of the bars when they're pointing many different ways in space.
Dan - Apr 11, 2011 03:20:02 pm
Good background on Mint and the granularity of information that you have available. It sounds like there are a lot of queries that can be constructed, and various trends that you can explore. I would try to find out exactly what you would like to focus on. I like the clock visualization with the budgeted spending idea. Maybe you could allow users to construct Hadoop and Aster queries?
David Wong - Apr 11, 2011 03:25:32 pm
Very cool data set to use! As suggested in class, it will be helpful to identify who the audience is and begin exploring the data to see what's interesting. I like your ideas for the visualization, especially how to handle the large amounts of data, i.e., offline versus online visualization. As you're looking into correlations, it might be cool to build an interactive tool that allows users to run data mining algorithms (clustering/regression) and visualize the results, taking a crowdsourcing-esque approach to the exploratory data analysis.
Matthew Can - Apr 11, 2011 06:02:24 pm
I think the idea of visualizing correlated spending is interesting, and it's something you should explore in your project. As was mentioned in class, it sounds like you're building an exploratory data analysis tool. Based on statistical analysis, it would be nice if your tool could suggest interesting views, then let the user interact with and make modifications to them.
Brandon Liu - Apr 11, 2011 06:18:50 pm
I think exploring the entire Mint dataset may be too broad. Have you considered focusing on one segment, such as online retail spending?
Sally Ahn - Apr 12, 2011 12:11:07 am
Your dataset is very rich, and this makes it both an interesting domain to explore and a challenge to reduce it down to a clearer representation as a visual image. Many have suggested narrowing your audience and scope of data first, and I think that's a good idea. Visualizing one person's spending for introspective analysis would be quite a different task from visualizing the patterns among a large group of people.
Michael Cohen - Apr 12, 2011 12:43:39 am
Your project reminds me a bit of the OkCupid blog, where they mine their data set for interesting (and sometimes discouraging) trends in race/gender interactions, etc. The visualizations are serviceable, not spectacular, but I think there's some value in looking at the kinds of questions they ask. Along those lines, I'll put in a plug for coming up with something that uses Mint data to tell us all something that we don't know about the world. That's the real value-add between what I can do with my own financial data, and what you can do with access to the whole database.
Krishna - Apr 12, 2011 12:14:02 pm
I think figuring out the class of events you want to analyze is going to be critical for your visualizations. Static visualizations tend to be biased, so you might want to add interaction. I understand your problem of analyzing huge datasets, though; you may want to try some sampling techniques?
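One sampling technique that fits a stream of daily transactions is reservoir sampling, which keeps a fixed-size uniform random sample without knowing the stream length in advance. The following is only a minimal sketch: the function name `reservoir_sample` and the idea of treating transactions as a plain iterable are assumptions for illustration, not anything described by the project.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from stream (Algorithm R).

    Hypothetical helper: the stream stands in for any iterable of
    transaction records that is too large to hold in memory.
    """
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing
            # one only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


# Usage with made-up data: sample 10 "transactions" out of 1,000.
subset = reservoir_sample(range(1000), k=10, seed=42)
```

The sample could then drive an interactive view, with the full dataset reserved for offline computation.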
Saung Li - Apr 12, 2011 03:30:59 pm
This is a really awesome domain. It looks like there are a ton of possible correlations you could look at, so be sure to pick a few you like and focus on those. As mentioned in class, it's important to add interactivity for users so that they can further explore the data sets. Creating a tool that allows users to predict things like holiday spending and new competitors to existing companies from the data would be extremely beneficial and interesting.
Julian Limon - Apr 12, 2011 10:34:02 pm
Like others have mentioned, I believe you have a terrific dataset. A lot of interesting visualizations can be generated to make sense of this domain. I believe that once you have defined your audience you will be able to find some interesting questions to ask.
If you end up building visualizations for the end users of mint.com, I believe that a cool problem to solve would be vacation spending. At least for me and for other people I've talked to, it's always hard to categorize the expenses that one makes during a vacation. Of course, if you only have data at the merchant level, some transactions will appear as normal restaurants or services. However, I believe you can use some visualizations to show people how much their expenses vary right before a hotel checkout. This can later be used to further categorize a portion of the expense as budgeted (like food) while the rest may go into a "vacation" category. Moreover, you can use comparisons to other people who have stayed in the same place to allow people to create a budget for their next vacation.
If you just want to analyze a large number of features for internal mint.com use, you can create a system that generates parallel coordinate visualizations based on the strongest correlations. Then, techniques like brushing and linking could be used to analyze patterns more closely.
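As a rough sketch of the "strongest correlations" step, one could score every pair of features with the Pearson correlation coefficient and keep the top few pairs as candidate axes for a parallel coordinates view. The function names and the monthly spending figures below are invented for illustration; this is one possible ranking approach, not a method the project specifies.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def strongest_pairs(
    features: Dict[str, List[float]], top_n: int = 3
) -> List[Tuple[str, str, float]]:
    """Rank feature pairs by absolute correlation, strongest first."""
    scored = [
        (a, b, pearson(features[a], features[b]))
        for a, b in combinations(features, 2)
    ]
    scored.sort(key=lambda pair: abs(pair[2]), reverse=True)
    return scored[:top_n]


# Hypothetical monthly spending per category (made-up numbers).
spending = {
    "groceries": [1.0, 2.0, 3.0, 4.0],
    "dining": [2.0, 4.0, 6.0, 8.0],
    "travel": [4.0, 1.0, 3.0, 2.0],
}
top = strongest_pairs(spending, top_n=2)
```

Sorting by absolute value keeps strong negative correlations in the ranking, which are often as interesting to brush and link as positive ones.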
Michael Hsueh - Apr 12, 2011 10:54:47 pm
I think the data on mint.com is a great resource to explore. You've enumerated a good number of interesting questions to explore with the given dataset. I agree that interactivity and supporting exploratory analysis is key to being able to wrangle with such an extensive corpus of data. That, and designing the few key visualization styles/types that are flexible enough to accommodate them.