From CS 294-10 Visualization Sp10
People (or at least I) use Google on a regular basis: to search for websites, specific information, flights, products, reviews and much more. Personally, I use it several times a day without even thinking about it, and so do millions of other users. Meanwhile, Google stores information about what people search for and makes it available to us through Google Trends.
Initially I wanted to look into whether the rate of checked baggage and/or the number of customers has declined since airlines introduced baggage fees on domestic flights. I have always wondered about that when booking flights, and it seems that a lot more people bring carry-ons than a couple of years ago.
Unfortunately, I could not find suitable datasets. Since I didn't have another specific idea in mind, I started looking through the other datasets provided and stumbled upon Google Trends.
I began to play around with it, wondering whether certain search terms follow a regular yearly pattern or are more or less random.
So I thought about asking the question:
Does people's search behaviour exhibit yearly or monthly trends, are there regional differences, and is search behaviour correlated with other factors?
About Google Trends
Google Trends is a tool that lets you analyse weekly search behaviour over the past five years. The user can specify a word or phrase and see how often people have searched for it. The result is normalized, so you cannot get absolute values. It has the following features:
- Search globally, or within one country or region (e.g. a certain state of the U.S.)
- Compare the search traffic between two words
- See a graph and export the detailed results to a .csv file
This lets you produce a dataset tailored to your own question.
Evolving the question
So I wondered if there is a correlation between the sales of an item, say the Apple iPhone, and the number of people who search for the word "iphone" on Google. I retrieved data for global iPhone sales (http://techcrunchies.com/quarterly-breakup-of-iphone-sales/) and compared them to the global search numbers for "iphone" on Google.
Unfortunately I could only find quarterly figures for iPhone sales, which did not give me enough data points to draw a conclusion. The graphic only shows that both iPhone sales and searches have increased over time, with no hint of a direct relationship. One possible reason is that people search for "iphone" in other contexts, say when looking for apps or accessories.
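To compare weekly search values with quarterly sales at all, the weekly series first has to be reduced to one value per quarter. The following is a minimal sketch of that step, followed by a Pearson correlation, and not the procedure actually used for the graphic. The function names and all numbers are made up for illustration; they are not real Google Trends or sales figures.

```python
from collections import defaultdict
from datetime import date
from math import sqrt


def quarter_of(day):
    """Return (year, quarter) for a date, with quarter in 1..4."""
    return (day.year, (day.month - 1) // 3 + 1)


def quarterly_mean(weekly):
    """Average weekly (date, value) pairs into one value per calendar quarter."""
    buckets = defaultdict(list)
    for day, value in weekly:
        buckets[quarter_of(day)].append(value)
    return {q: sum(vals) / len(vals) for q, vals in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative numbers, NOT real Google Trends or sales figures.
weekly_searches = [
    (date(2009, 1, 4), 1.0), (date(2009, 2, 1), 1.2),
    (date(2009, 4, 5), 1.5), (date(2009, 5, 3), 1.7),
    (date(2009, 7, 5), 2.0), (date(2009, 8, 2), 2.4),
]
quarterly_sales = {(2009, 1): 3.0, (2009, 2): 4.0, (2009, 3): 6.0}

searches_by_quarter = quarterly_mean(weekly_searches)
# Only compare quarters that are present in both series.
quarters = sorted(set(searches_by_quarter) & set(quarterly_sales))
r = pearson([searches_by_quarter[q] for q in quarters],
            [quarterly_sales[q] for q in quarters])
print(f"quarters compared: {len(quarters)}, r = {r:.3f}")
```

With only a handful of quarters, two series that both simply grow over time will produce a high coefficient, so a value like this says little about a direct relationship.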
Since I needed more finely grained data, I thought about investigating car sales. I was wondering:
Is the number of people who search for "used cars" on Google related to the number who actually buy one?
Since I was not able to acquire sales data for specific companies, I used data from the Maryland Department of Transportation on sales of new and used cars.
This graph doesn't allow any conclusion as to whether there is a relation between people who search for "used cars" and those who buy a used car.
I needed to find another area in which to pose a question. I was wondering if people search for "rain" more often when it is actually raining, or for "sun" when the sun is shining. This would require overlaying the search data with weather data to find trends. The problem that emerged was that I could only restrict searches to states, not cities, and I couldn't access climate information for a whole state.
When I looked up "flowers", the normalized search value remained relatively constant over the past five years. However, I noticed two peaks with unusually high numbers of requests each year, one around the middle of February (Valentine's Day) and the other around May (presumably Mother's Day). Altering the search request to "flower, valentine, mothers day" reveals that the February "flower" peak coincides with a February "valentine" peak, and the second "flower" peak coincides with the "mothers day" peak.
(This is the image from the Google Trends homepage http://www.google.com/trends?q=flower%2C+valentine%2C+%22mothers+day%22&ctab=0&geo=us&geor=all&date=all&sort=0)
These graphs make the relation between searches for "valentine" and "flowers", as well as between "mothers day" and "flowers", immediately obvious. The pattern remains remarkably constant over the years between 2004 and 2009. People only search for "valentine" and "mothers day" around the actual dates of those events, and we might also conclude that flowers are given as gifts on these occasions.
These were pretty interesting but not very surprising results. To add more depth, I wondered whether I could find out if some flowers are especially popular for either Valentine's Day or Mother's Day.
Interest in roses, orchids and tulips is mainly restricted to the time around Valentine's Day and Mother's Day, whereas lilies attract a high number of queries over a longer period. While roses seem to be especially popular for Valentine's Day, tulips seem to be more popular for Mother's Day.
Google Trends seems to capture the mood of its users. People's search behaviour certainly follows yearly trends, like a high in queries for "flowers" around Valentine's Day or Mother's Day. It is doubtful, though, that it can give a reliable answer as to whether searches for a certain product indicate that people actually buy that product.
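The claim that a peak recurs at the same time every year can also be checked on an exported series instead of by eye. Below is a minimal sketch that finds the highest week in each year and reports its month. The helper name `yearly_peaks` and the sample values are made up for illustration and do not come from Google Trends.

```python
from datetime import date


def yearly_peaks(weekly):
    """Map each year to the (date, value) pair with the highest value."""
    peaks = {}
    for day, value in weekly:
        best = peaks.get(day.year)
        if best is None or value > best[1]:
            peaks[day.year] = (day, value)
    return peaks


# Made-up weekly values for two years, NOT real Google Trends data.
series = [
    (date(2007, 1, 7), 1.0), (date(2007, 2, 11), 2.6),
    (date(2007, 5, 6), 1.9), (date(2007, 9, 2), 0.9),
    (date(2008, 1, 6), 1.1), (date(2008, 2, 10), 2.8),
    (date(2008, 5, 11), 2.0), (date(2008, 9, 7), 1.0),
]

peaks = yearly_peaks(series)
peak_months = {year: day.month for year, (day, _) in peaks.items()}
print(peak_months)
```

If the peak month is the same in every year, as it is for February in this made-up sample, the pattern is at least consistent with a fixed yearly event.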
About working with Tableau
Although I had some trouble at first, I did enjoy working with Tableau. Some of the problems originated from the fact that I have a German/European computer running the German version of Windows. In Germany, we use "," instead of "." as the decimal separator, and "." instead of "," as the thousands separator (1.000 rather than 1,000). This led to a lot of trouble when exporting and importing data, and my only solution was to change the Windows number format to English (US). Afterwards, data import into Tableau worked well.
However, I found it rather inconvenient that I was not able to display the Google timeline in Tableau without grey bars dividing months and years.
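As an alternative to changing the system-wide number format, the decimal separator problem described above could also be handled by converting the values before import. This is a minimal sketch, not what I actually did. It assumes every value is written in the German convention, and the function name is made up for illustration.

```python
def parse_german_number(text):
    """Convert a German-formatted number such as '1.234,56' to a float.

    Assumption: the input always uses '.' for thousands and ',' for decimals.
    A US-formatted value like '1.5' would be misread as 15.0.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


# Made-up sample cells, as they might appear in a German-locale export.
samples = ["1.234,56", "0,75", "1.000"]
converted = [parse_german_number(s) for s in samples]
print(converted)  # [1234.56, 0.75, 1000.0]
```

The converted values can then be written back out with "." as the decimal separator, which an English (US) import expects.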