From CS294-10 Visualization Sp11



I first wanted to play with data about India, but soon found that the government there tends to publish data in a non-computable format (specifically, as PDFs that it is required by law to upload). The data that I did find was only 30-40 entries long, and I was concerned about how much exploration such a small data set would permit. In retrospect, I think that the size of the data set does not matter as much.

While exploring Tableau, I found that it offered something called the Microsoft Azure Data Market, which let me import data directly from the server (and I thought that Data Market was a cool concept). I downloaded a few data sets, and finally settled on a data set of the different crimes committed across the United States over the last few years.

My question was an obvious one: how has crime varied over the years, and are there any trends, regional or otherwise?

Assessment of Data

The data was high-density (city and state level) crime data, segmented into different crime types. There was also a column for population. The data covered three years, 2006 to 2008, and had about 25,000 rows.

I made a strategic decision at this point to look at the data per state (the data given is per city). I noticed that big cities tended to be in big states, and Tableau identified the states automatically.
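Outside Tableau, the city-to-state roll-up described above could be sketched as follows. This is only an illustration in plain Python: the row layout, the column order, and every number are assumptions made up for the example, not values from the actual data set.

```python
from collections import defaultdict

# Hypothetical city-level rows: (state, city, year, population, violent_crimes).
# All values are made up for illustration only.
city_rows = [
    ("CA", "City A", 2006, 100_000, 500),
    ("CA", "City B", 2006, 50_000, 100),
    ("TX", "City C", 2006, 200_000, 800),
    ("CA", "City A", 2007, 110_000, 450),
]

def aggregate_by_state(rows):
    """Sum population and crime counts over all cities in each (state, year)."""
    totals = defaultdict(lambda: {"population": 0, "violent_crimes": 0})
    for state, _city, year, population, crimes in rows:
        totals[(state, year)]["population"] += population
        totals[(state, year)]["violent_crimes"] += crimes
    return dict(totals)

state_totals = aggregate_by_state(city_rows)
```

Grouping on the (state, year) pair keeps the three years separate, so the totals can still be compared over time.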

I figured that comparing the data would only make sense if it were normalized for population. To do so, I introduced extra columns using Tableau's add column feature.
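The normalization itself is simple arithmetic: crimes divided by population, scaled to a rate per 1000 people. A minimal sketch, assuming state-level totals are already available (the dictionary layout and the numbers below are hypothetical):

```python
def rate_per_1000(crimes, population):
    """Return crimes per 1000 people, or None when population is missing or zero."""
    if not population:
        return None
    return 1000 * crimes / population

# Hypothetical state totals keyed by (state, year); values are made up.
state_totals = {
    ("CA", 2006): {"population": 150_000, "violent_crimes": 600},
    ("TX", 2006): {"population": 200_000, "violent_crimes": 800},
}

rates = {
    key: rate_per_1000(total["violent_crimes"], total["population"])
    for key, total in state_totals.items()
}
```

Dividing the summed crimes by the summed population gives a population-weighted state rate. Averaging per-city rates instead would weight a small town and a large city equally.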

Exploratory Analysis

I started by looking at the variations of crime rates across a few 'big' states (based on population), beginning with violent crime. I plotted this first for all 50 states (which quickly proved intractable for comparison), and then for the 10 largest states by population. I didn't see a clear, obvious trend at first. The visualizations looked like this:


So it seems that crime is going down over the years. Great! I wanted to see the variations for all the states, and perhaps detect any spatial trends (for example, does one rotten apple (state) spoil others (neighboring states)?). Here again, there was some data curation: in order to get relatively zoomed-in images, I did not plot data for Hawaii and Alaska. Note that the Y axes show crimes per 1000 people.

For 2006-2007: Mm-crime-trends2006-2007.png

For 2007-2008: Mm-crime-trends-2007-2008.png

For Alabama and Georgia, the crime rate seemed to be increasing. I wanted to investigate further, so I plotted the data in Tableau for 2006-2008.

For 2006-2008: Mm-crime-trends-2006-2008.png

So it does indeed seem that while crime is going down in general, the neighboring states of Alabama and Georgia have an increasing crime rate.
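The same comparison can be checked numerically rather than by eye, by subtracting each state's start-year rate from its end-year rate and listing the states where the difference is positive. The sketch below uses placeholder state labels and made-up rates, not the figures behind the plots:

```python
# Hypothetical violent-crime rates per 1000 people, keyed by state and year.
# Labels and values are placeholders, not figures from the data set.
rates = {
    "State A": {2006: 4.0, 2008: 4.5},
    "State B": {2006: 4.6, 2008: 4.9},
    "State C": {2006: 4.3, 2008: 4.0},
}

def rate_change(rates_by_state, start_year, end_year):
    """Return end-year rate minus start-year rate for states that have both years."""
    changes = {}
    for state, by_year in rates_by_state.items():
        if start_year in by_year and end_year in by_year:
            changes[state] = by_year[end_year] - by_year[start_year]
    return changes

changes = rate_change(rates, 2006, 2008)
# States whose rate went up over the period, in alphabetical order.
increasing = sorted(state for state, delta in changes.items() if delta > 0)
```

With only three years of data, a positive difference is a hint worth investigating rather than evidence of a lasting trend.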

Next, I wanted to make pie charts of the breakdown of different crime types, and look for any abnormalities there. I had trouble doing this in Tableau, since it involves unpivoting the data. Based on internet searches, it seems that Tableau does not allow you to unpivot the data! I would need access to the raw data (which I don't have in this case). I'll try to redo this by Wednesday with a new set of data.
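For reference, unpivoting means turning one column per crime type into one row per (state, year, crime type), which is the shape a pie chart of crime types needs. Given access to the raw rows, it could be sketched in plain Python as below. The column names and numbers are assumptions for illustration; a library such as pandas offers the same operation as `melt`.

```python
# Hypothetical wide rows: one column per crime type (names and values are made up).
wide_rows = [
    {"state": "A", "year": 2006, "violent": 600, "property": 3000},
    {"state": "B", "year": 2006, "violent": 800, "property": 2500},
]

def unpivot(rows, id_columns, value_columns):
    """Turn each wide row into one long row per value column."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_row = {name: row[name] for name in id_columns}
            long_row["crime_type"] = column
            long_row["count"] = row[column]
            long_rows.append(long_row)
    return long_rows

long_rows = unpivot(wide_rows, ["state", "year"], ["violent", "property"])
```

Each input row yields as many output rows as there are crime-type columns, so the two wide rows above become four long rows.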
