From CS294-10 Visualization Sp11
I'm currently involved in a class/project at the journalism school focusing on doing stories about food issues, and so I wanted to find a dataset and question around that topic. I had a lot of ideas for specific topics and questions, but quickly found that not all of my topics would be appropriate for this assignment, and was of course limited by the datasets available to me. The USDA has a pretty extensive collection of datasets about food, nutrition and agriculture, so while I looked around a bit for data elsewhere, I mostly looked on the USDA site.
I initially wanted to find data and ask a question about food spending, and percentage of income spent on food, comparing internationally, by state, or by locality. I did find some good data on this topic, but realized the data wasn't complex enough to ask a question that was really exploratory - it would be more of a simple visualization exercise.
I then started looking at data about spending on food at home and away from home, over the years, as well as vegetable and meat consumption over the years. While I could formulate questions about these topics, the data wasn't all that interesting and didn't provide very definitive answers, so I decided to go for a question and topic that would tell more of a story through the data.
The question I settled on was "Is there a relationship between patterns in meat consumption in the US? For example, does beef consumption go up when pork goes down, or vice versa?" I used a dataset of per capita consumption of four types of meat (beef, pork, veal and lamb) from 1909 to 2008, in pounds. I particularly liked that this dataset covered such a long time period and would (potentially) show real changes in consumption patterns.
The dataset is titled "Red meat, poultry and fish" and is available for download in Excel format on this page: http://www.ers.usda.gov/Data/FoodConsumption/FoodAvailSpreadsheets.htm#mtpcc
All the data I used was available for download in Excel, so I didn't have to do a whole lot of work to reformat it for Tableau. I just took out some titles, retitled some columns and simplified the spreadsheet before importing it, and it seemed to work fine.
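The cleanup described above (dropping title rows and renaming columns) could also be scripted. Here is a minimal sketch, assuming the sheet has been exported to CSV with two title rows above the header. The layout, the column names, and the numbers are all hypothetical placeholders, not the actual USDA file.

```python
import csv

# Hypothetical CSV export: two title rows, a header row, then one row per year.
# The values are placeholders for illustration, NOT real USDA figures.
RAW = """Red meat, poultry and fish
Per capita availability (pounds)
Year,Beef (lbs),Pork (lbs),Veal (lbs),Lamb (lbs)
1909,1.0,2.0,3.0,4.0
1910,1.5,2.5,3.5,4.5
"""

# Assumed mapping from the original headers to shorter column names.
RENAME = {
    "Year": "year",
    "Beef (lbs)": "beef",
    "Pork (lbs)": "pork",
    "Veal (lbs)": "veal",
    "Lamb (lbs)": "lamb",
}


def clean(raw_text, title_rows=2):
    """Drop the leading title rows, rename the columns, and parse the numbers."""
    lines = raw_text.splitlines()[title_rows:]
    rows = []
    for record in csv.DictReader(lines):
        row = {RENAME[key]: value for key, value in record.items()}
        row["year"] = int(row["year"])
        for meat in ("beef", "pork", "veal", "lamb"):
            row[meat] = float(row[meat])
        rows.append(row)
    return rows


rows = clean(RAW)
```

Each entry in `rows` is then a plain dictionary with one key per column, which is easy to write back out as a tidy CSV for import.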
The first visualization I made was a side-by-side bar graph.
I could see some patterns in the bar graph. For example, it was clear that beef consumption overtook pork during the 1950s and has maintained its position as the most popular meat since then. But because there were so many years, and four types of meat for each year, the graph was far too long to look at easily, and it was also hard to see any relationships between the types of meat.
I then removed the measure names from the columns, and made a stacked bar graph visualization.
This visualization was better than the previous one, but still a little too crowded to pick out relationships easily. Stacking also makes it hard to compare heights, since the segments don't share a common baseline.
So I switched the data to a line graph.
In this visualization, I could see much more clearly when beef overtook pork in consumption, as well as patterns in how beef and pork consumption related to each other, and settled on this as my final visualization to use to answer my question.
- Beef consumption surpassed pork for the first time in 1953, and never dipped below pork again.
- Consumption patterns seemed slightly different before and after 1953. Before, the two meats were closer in consumption numbers. There were points where one seemed to clearly react to the other (i.e., when beef went down, pork went up), but this didn't always happen in the same year. There was also one year, 1935, when both fell to similar levels, probably in response to the Great Depression. Both also rebounded the next year in similar proportions.
- After 1953, rises and falls in the two seemed to track each other a little more closely, in that when one went up the other went down, although there were still a couple of years (e.g., 1973, 1997) where both fell.
- It's clear from the visualization that people ate much more comparable amounts of beef and pork before 1953, and that after that the gap stayed consistently wider. (Though it has narrowed in the last few years.)
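I read the observations above off the line graph by eye, but they could also be checked numerically. Here is a rough sketch of two such checks: the first year from which one series stays above the other, and the share of year-to-year steps in which the two move in opposite directions. The function names are my own, and the sample numbers are made up for illustration. They are not the USDA figures.

```python
def first_lasting_crossover(years, a, b):
    """Return the first year from which a stays above b to the end, else None."""
    result = None
    for year, x, y in zip(years, a, b):
        if x > y:
            if result is None:
                result = year  # start of a run where a is above b
        else:
            result = None  # run broken, so reset
    return result


def opposite_move_share(a, b):
    """Share of year-over-year steps where a and b move in opposite directions."""
    opposite = 0
    steps = 0
    for a0, a1, b0, b1 in zip(a, a[1:], b, b[1:]):
        steps += 1
        # A negative product means one series rose while the other fell.
        if (a1 - a0) * (b1 - b0) < 0:
            opposite += 1
    return opposite / steps


# Made-up placeholder series (NOT real consumption data).
years = [2001, 2002, 2003, 2004, 2005, 2006]
beef = [50, 48, 55, 60, 58, 62]
pork = [60, 62, 54, 56, 55, 53]

crossover_year = first_lasting_crossover(years, beef, pork)
share = opposite_move_share(beef, pork)
```

Running `opposite_move_share` separately on the years before and after the crossover would give a rough number to compare against the impression that the two meats moved against each other more consistently in the later period.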
I would have liked to find a dataset that included numbers for chicken consumption as well, because I knew from looking at other data that Americans are eating a lot more chicken now than they were 100 years ago. It was not included in this particular set, however, and while I could have perhaps found data to combine and worked it in somehow, I couldn't initially find one that would work.
I spent a lot of time - too much time - looking for the right dataset and deciding on a question. Next time, I think I would force myself to make a decision on the topic earlier, or commit more to a question or topic rather than sifting through so much data to try and find the right set.
I also ran into some roadblocks with the technology, and just not being familiar enough with it to know either how to do things I wanted to do, or whether it was possible to do those things. I could remedy that by simply working with it more and learning more about it for the next time.
Some of the datasets that I found seemed great at first, but upon closer inspection I realized that some of the terms were ambiguous, or that there wasn't quite enough explanation along with the data to make me feel comfortable about using it. From now on, I will look for those issues earlier in the process, rather than getting excited about a dataset and then realizing it isn't going to work.