From CS294-10 Visualization Sp11
I picked a dataset that lists every song that entered the Billboard charts from 1890 to 2008. My hypothesis was that every period is dominated by a genre. Thus, my goal was to demonstrate how music preferences changed over time.
Improving Data Quality
The spreadsheet was assembled collaboratively in a crowdsourced effort. The data quality is mediocre, so I had to clean the data first.
It turned out that some fields contained a plus character or an asterisk to mark a comment within the spreadsheet. As a result, Tableau didn't recognize those values and interpreted them as Null. Unfortunately, cleaning up the data had only a limited effect: it helped to the extent that each song now had a date assigned. However, a closer look at the data revealed that the genre was missing from a lot of songs.
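The cleanup described above could also be scripted instead of done by hand. The following is only a minimal sketch in Python: the function names, the `genre` column name, and the assumption that the markers sit at the start or end of a field are all hypothetical and not taken from the actual spreadsheet.

```python
from typing import Dict, List, Optional

# Characters that (by assumption) flag a comment in a spreadsheet field.
COMMENT_MARKERS = "+*"


def clean_field(raw: Optional[str]) -> Optional[str]:
    """Strip comment markers and whitespace; return None for empty fields."""
    if raw is None:
        return None
    cleaned = raw.strip().strip(COMMENT_MARKERS).strip()
    return cleaned or None


def parse_year(raw: Optional[str]) -> Optional[int]:
    """Parse a numeric field that may carry a marker, e.g. '1967*'."""
    cleaned = clean_field(raw)
    if cleaned is None or not cleaned.isdigit():
        return None
    return int(cleaned)


def missing_share(rows: List[Dict[str, str]], column: str) -> float:
    """Fraction of rows whose `column` is still empty after cleaning."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if clean_field(row.get(column)) is None)
    return missing / len(rows)
```

Calling `missing_share(rows, "genre")` after cleaning gives a quick measure of how many songs still lack a genre, which is the problem that remained after the markers were removed.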
Introducing The Genre
To find out how musical taste has changed over time, I encoded the genre by color:
This visualization came closer to what I intended to show. The colored areas of the stacked graphs clearly showed changes in music preferences. For example, the growing share of Rap music starting in the mid-1980s becomes apparent.
There are, however, a couple of issues with this visualization. First, having so many songs with no genre indication (light gray areas) is problematic when the main message I want to convey depends on changing genre preferences over time. Second, it is not obvious why the number of songs per year shrinks after 1967. This could tell an interesting story, but since data quality has been a problem with this data set, it could very well be a matter of missing data. Third, there are far too many genres, which makes the coloring confusing and the visualization less compelling.
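The first two issues can be checked numerically before looking at the chart. The sketch below counts songs per year and genre, which is roughly the data behind a stacked graph, and reports the share of songs without a genre. It assumes each song is available as a hypothetical `(year, genre)` pair; the names are illustrative only.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

UNKNOWN = "Unknown"  # label for songs without a genre (the light gray areas)


def songs_per_year_by_genre(
    songs: Iterable[Tuple[int, Optional[str]]]
) -> Dict[int, Counter]:
    """Count songs per year and genre from (year, genre) pairs."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for year, genre in songs:
        counts[year][genre or UNKNOWN] += 1
    return dict(counts)


def unknown_share(counts: Dict[int, Counter], year: int) -> float:
    """Share of songs in `year` that have no genre assigned."""
    year_counts = counts.get(year, Counter())
    total = sum(year_counts.values())
    return year_counts[UNKNOWN] / total if total else 0.0
```

Summing `counts[year].values()` for each year gives the yearly totals, so a sudden drop after a given year shows up directly in the numbers.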
Top 30 Artists of All Time
I decided to include artists in some way. With thousands of artists in the data, however, I included only the top 30 artists of all time.
This removed the coloring problem but still left me with too much missing genre data. Breaking up the data into small multiples for the top 30 artists improved the situation. I sorted the artists by the date their first song entered the charts.
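The selection and ordering used for the small multiples can be sketched as follows. This assumes that "top" means the artists with the most songs in the charts and that each song is a hypothetical `(artist, year)` pair; it is an illustration, not the Tableau configuration that was actually used.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_artists_by_debut(
    songs: Iterable[Tuple[str, int]], n: int = 30
) -> List[str]:
    """Return the n artists with the most songs, ordered by first chart year."""
    songs = list(songs)
    song_counts = Counter(artist for artist, _ in songs)

    # Earliest chart year per artist.
    first_year: Dict[str, int] = {}
    for artist, year in songs:
        first_year[artist] = min(year, first_year.get(artist, year))

    top = [artist for artist, _ in song_counts.most_common(n)]
    return sorted(top, key=lambda artist: first_year[artist])
```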
Even if the genre is missing, the visualization tells a compelling story. One can see when the bands had their songs in the charts, and in what period of time they released their songs.
When I broke down the visualization by artist, I expected to see names like Michael Jackson, Madonna, or Hip Hop acts like LL Cool J, who were all active over a long period of time. The problem is that, although they were very successful, they didn't put out enough records to appear in the Top 30. Madonna, for example, had over 50 singles in the charts, which is still too few.
To see those, I wanted to restrict the time range to songs released after 1980, while still showing the top 30. I didn't find out how to use a nested filter in Tableau ("show the Top 30 artists of all songs after 1980") instead of an additive filter ("show only songs after 1980 of all Top 30 artists"). I worked around this restriction by filtering for artists with more than 30 songs, restricting the time range to songs released after 1980, and manually removing the few artists that showed up in that list but released most of their songs prior to 1980.
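The difference between the two filter orders is easier to see outside of Tableau. The sketch below is a plain Python illustration, not a Tableau feature: the function names are made up for this example, and songs are assumed to be `(artist, year)` pairs ranked by number of songs.

```python
from collections import Counter
from typing import List, Set, Tuple

Song = Tuple[str, int]  # (artist, release year)


def top_artists(songs: List[Song], n: int) -> Set[str]:
    """The n artists with the most songs in the given list."""
    counts = Counter(artist for artist, _ in songs)
    return {artist for artist, _ in counts.most_common(n)}


def additive_filter(songs: List[Song], n: int, min_year: int) -> List[Song]:
    """Rank over ALL songs first, then keep only the recent songs of those artists."""
    top = top_artists(songs, n)
    return [(a, y) for a, y in songs if a in top and y > min_year]


def nested_filter(songs: List[Song], n: int, min_year: int) -> List[Song]:
    """Restrict to recent songs first, then rank within that subset."""
    recent = [(a, y) for a, y in songs if y > min_year]
    top = top_artists(recent, n)
    return [(a, y) for a, y in recent if a in top]
```

With `min_year=1980`, the additive version still ranks artists by their whole career, so artists from earlier decades can crowd out later ones. The nested version ranks only on songs released after 1980, which is the behavior the manual workaround approximates.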