From CS294-10 Visualization Sp11
For this assignment, I thought it would be interesting to examine crime rates in the United States. In particular, I wanted to know how the incidence of crime in the US has changed over time and whether it is related to macroeconomic conditions.
I found the FBI Uniform Crime Reports website. It has detailed crime statistics, but they are broken up by year and only go back to 1995. This was a problem for me because I wanted to analyze historical data. Luckily, the same organization has the UCR Data Tool, which lets you build and export your own data table using crime statistics from as far back as 1960. The downside is that this tool only provides the basic set of crime data variables (absolute numbers and rates for common categories of crime). Also, the data are crime estimates rather than actual reported crimes. Nonetheless, I chose to use the historical data, knowing that I might need to refer to more detailed data later on. The UCR Data Tool exported the data to a CSV file, but the file wasn't properly formatted (at least not for Tableau). After some tedious manual cleanup, the data were ready to explore.
As for macroeconomic data, I could have augmented my data set with unemployment rates or some other economic indicator. Instead, I decided to look for deviations in crime trends around the years of US economic recessions. The major recessions since 1960 are (wiki):
Hypothesis: Crime peaks in the US are correlated with economic recessions.
A broad overview
The first thing I wanted to see was the overall crime trend in the US over the time period in my data set (1960-2009). The crime variables split broadly into two groups: violent crimes and property crimes. I had originally wanted to plot both of these on the same time series display, but that required converting my "Years" dimension into a date data type. I tried doing the conversion in Tableau, but it didn't work correctly. In the interest of continuing my exploratory data analysis, I displayed violent crime rates and property crime rates separately, using bar charts. I think this may have worked out better, since property crime rates are an order of magnitude larger than violent crime rates. As a result, violent crime scales appropriately in its own display, making it easier to see year-to-year changes. The rates are per 100,000 inhabitants.
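The per-100,000 rate and the year-to-date conversion mentioned above can be sketched in Python. This is only an illustration, not the calculation performed by the UCR Data Tool or by Tableau. The function names are hypothetical, and the inputs are placeholders rather than values from the data set.

```python
from datetime import date


def rate_per_100k(crime_count, population):
    """Return the number of crimes per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so whole-number inputs divide cleanly.
    return crime_count * 100_000 / population


def year_to_date(year):
    """Map a year (int or numeric string) to January 1 of that year.

    Assumption: any fixed day would do, since only the year matters
    for a yearly time series.
    """
    return date(int(year), 1, 1)


# Placeholder inputs, not figures from the UCR data.
print(rate_per_100k(500, 1_000_000))
print(year_to_date("1960"))
```

With the years stored as real dates, both rate series could in principle share one time axis, although the difference in magnitude noted above would still make the smaller series hard to read.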
Looking at the charts above, it does seem that, to some degree, spikes in the crime rate occur when the US economy is in a recession (perhaps with a slight lag). This is most evident with property crimes. Note the spikes in the property crime rate in the early 1970s, mid 1970s, and early 1980s. However, these spikes are hard to notice because the crime rate was increasing for most of this period (1960 to 1991). On that note, the most salient information in these charts is that both crime rates rose steeply during this period, by almost a factor of 5 for the violent crime rate. It would be interesting to know why this happened. As I noted before, these statistics are estimated crime rates. So, one theory is that the crime rates didn't increase (or at least not by much), but rather that crime reporting improved and became more prevalent.
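Because the long upward trend hides individual spikes, one option would be to look at the year-over-year percent change instead of the raw rate. The following is a minimal sketch of that idea, not something done in the analysis above. The function name is hypothetical and the sample rates are made-up placeholders, not UCR values.

```python
def year_over_year_change(rate_by_year):
    """Return {year: percent change from the previous year}.

    Years with no data for the immediately preceding year are skipped,
    as are years whose preceding rate is zero.
    """
    changes = {}
    for year in sorted(rate_by_year):
        previous = rate_by_year.get(year - 1)
        if previous is None or previous == 0:
            continue
        changes[year] = (rate_by_year[year] - previous) * 100 / previous
    return changes


# Placeholder rates (hypothetical, not taken from the data set).
sample = {2000: 100.0, 2001: 110.0, 2002: 99.0}
for year, change in year_over_year_change(sample).items():
    print(year, round(change, 1))
```

Years with unusually large positive changes could then be compared against the recession years by eye or in a table.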
I'd also like to note the obvious similarity in the crime rate over time between the two categories of crime. This suggests that whatever factors affect the crime rate tend to have the same effects on both violent crime and property crime, and that this is consistent over time.
These charts made me interested in understanding why the crime rate rose sharply in the US between 1960 and 1991, and then dropped off thereafter.
A closer look at the types of crime
I used stacked bar charts to display the rates of violent and property crime over time, broken up by their respective components.
Although it is harder to compare lengths on a stacked bar chart, it is still apparent that the general trend (increasing crime rates, a peak in the early 1990s, then a general decrease) holds for all of the categories of crime. Because of the scale for violent crimes, it is difficult to see how the data vary for forcible rape and for murder and nonnegligent manslaughter. Nonetheless, these displays provide some useful information on the makeup of violent and property crimes. Violent crimes consist mostly of robbery and aggravated assault. It also seems that, over time, the composition of violent crimes has changed, becoming more dominated by aggravated assault and less by murder and nonnegligent manslaughter. As for property crimes, the composition has changed over time so that a greater proportion is made up of larceny-theft. To some extent, it looks like the composition of both violent and property crimes has shifted away from the most severe crimes in each group.
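Changes in composition are easier to judge as shares of the total than as stacked absolute rates. Here is a minimal sketch of that calculation. The function name and category labels are hypothetical, and the numbers are placeholders rather than UCR figures.

```python
def composition_shares(rate_by_category):
    """Return each category's fraction of the summed rate."""
    total = sum(rate_by_category.values())
    if total <= 0:
        raise ValueError("total rate must be positive")
    return {
        category: rate / total
        for category, rate in rate_by_category.items()
    }


# Placeholder rates for two made-up components of one crime group.
sample_year = {"component_a": 1, "component_b": 3}
for category, share in composition_shares(sample_year).items():
    print(f"{category}: {share:.0%}")
```

Computing these shares for each year and plotting them as a 100% stacked chart would show a shift in composition directly, at the cost of hiding the overall trend.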
At this point I wasn't sure how to proceed with answering my question of why crime rates increased to a peak and then decreased. But, I wanted to delve deeper into state-by-state data to see how crime rates are distributed across the US.
Data map visualizations
I wanted to experiment with Tableau's data map visualization, so I used that to display a state-by-state comparison of crime statistics. I also hypothesized that states in the same geographical region might exhibit similar crime data. The data map would help me visualize the clusters.
Below, I visualized the violent crime rate for each state in 2009. The crime rate is encoded in the size of the mark over each state. I found out that Tableau comes packaged with various economic data by state, so I added the per-capita income of each state to the visualization in an attempt to revisit my original question.
A few things are clear from the data map above. Ignoring the per-capita income, we can spot a cluster of states with relatively low violent crime rates in the northwest of the US. We can also spot some of the states with high violent crime rates (DC, Nevada, Alaska) and those with low rates (in the northeast). Looking at the per-capita income of each state, there does not appear to be a relationship with the crime rate (at least not in 2009).
After further exploring the data map in Tableau, I discovered that it has data on each state's male/female ratio. I wondered whether this is related to the rate of forcible rape in each state. So, I generated a display similar to the one above, that shows the 2009 rate of forcible rape in each state as a mark, with the size encoding the rate. Each state is colored according to its male/female ratio.
Examining the map above, there does seem to be a correlation between the rate of forcible rape in each state and the state's male/female ratio. This isn't surprising, but it does confirm intuition. Notice also how the states with the highest rates are concentrated in the west. Especially noteworthy is Alaska, which has the highest rate.
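A visual impression like this could be checked numerically with a Pearson correlation coefficient across states. The sketch below is a generic implementation, not part of the Tableau analysis. The function name is hypothetical and the inputs are placeholders, not state data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values: one entry per (hypothetical) state.
ratios = [1.0, 2.0, 3.0]
rates = [2.0, 4.0, 6.0]
print(round(pearson_r(ratios, rates), 3))
```

A value near +1 or -1 would support the visual reading, while a value near 0 would suggest the apparent pattern comes from a few prominent states. Either way, a correlation across states says nothing about causation.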
The data maps above only include data for a particular year. What I would really like to know is how violent crime has evolved in the US over the past several decades. To show this, I generated my final visualization, below.
This data map shows the distribution of violent crimes for each state by decade, from 1970 to 2009. I chose to start in 1970 instead of 1960 because one of the states had missing data. I also chose 10-year increments to make it easy to see changes over long periods of time. One downside of this is that some of the sampled years may have been outliers for particular states. Nonetheless, it's still possible to spot clusters of states with similar distributions. Also, Hawaii and Alaska were left out so that the contiguous US could be shown more spread out. Otherwise, the Northeast would have been one huge blob. A detailed description of what this visualization communicates can be found in the caption.