A2-AndrewMcDiarmid

From CS294-10 Visualization Fa07



Question

Initially, I thought it would be interesting to see whether there had been a recent increase in the overlap between place names and baby names in the US (e.g. Austin, Madison, Dakota). However, in both Spotfire and Tableau, I was unable to filter the names data against a second table (cities and towns in the U.S. with lat/long).

As I played with the software and dataset, I wondered whether the percentage of children given the most common names had stayed consistent over the past hundred years or so. This is the question I ultimately explored in my visualizations.

Data Set

A .csv file of the top 1000 baby names, by decade, since the 1880s (cut and pasted by hand from 13 requests at http://www.ssa.gov/OACT/babynames/).

Alterations

The data came with the male and female names recorded in the same row, so I split each row in two and added a column for sex, which doubled the number of rows. The final list of columns is: Decade, Rank, Sex, Name, Number, Percent of Total Births. All numerical data are per decade.
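The split described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the assignment: the input column names (MaleName, MaleNumber, MalePercent and their Female counterparts), the short key Percent, and the sample values are all assumptions made up for the example.

```python
def split_by_sex(wide_rows):
    """Turn each combined male/female row into two rows with a Sex column."""
    long_rows = []
    for row in wide_rows:
        # The "Male"/"Female" column prefixes are hypothetical.
        for sex, prefix in (("M", "Male"), ("F", "Female")):
            long_rows.append({
                "Decade": row["Decade"],
                "Rank": row["Rank"],
                "Sex": sex,
                "Name": row[prefix + "Name"],
                "Number": row[prefix + "Number"],
                "Percent": row[prefix + "Percent"],
            })
    return long_rows


# Placeholder values for illustration only, not SSA figures.
sample = [{
    "Decade": "1900s", "Rank": 1,
    "MaleName": "NameA", "MaleNumber": 100, "MalePercent": 5.0,
    "FemaleName": "NameB", "FemaleNumber": 120, "FemalePercent": 6.0,
}]
long_rows = split_by_sex(sample)
```

Each input row yields one male and one female row, so the output has twice as many rows as the input.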

Notes

I did explore Chris Volz's scraping of the top 100 names by state since 1960, but pulling out percentages in Spotfire proved difficult. I could get the percentage of an individual row compared with the total births for the top 100 names, but not the percentage per name, aggregated across states.
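Outside Spotfire, the per-name aggregation I was after could be sketched roughly as follows. The row layout (State, Name, Number) and the sample counts are assumptions for illustration; I have not checked them against the actual format of Chris's data. Note that the denominator here is the total of the listed births (the top 100 names), not all births.

```python
from collections import defaultdict


def share_by_name(rows):
    """Percent of the listed births going to each name, summed across states."""
    totals = defaultdict(int)
    grand_total = 0
    for row in rows:
        totals[row["Name"]] += row["Number"]
        grand_total += row["Number"]
    return {name: 100.0 * count / grand_total for name, count in totals.items()}


# Placeholder counts for illustration only.
sample = [
    {"State": "StateA", "Name": "NameA", "Number": 30},
    {"State": "StateA", "Name": "NameB", "Number": 10},
    {"State": "StateB", "Name": "NameA", "Number": 30},
    {"State": "StateB", "Name": "NameB", "Number": 30},
]
shares = share_by_name(sample)
```

To get a trend over time, the same sums would be kept per year rather than over the whole table.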

The 2000s decade is complete only through February 2007.

Visualizations

Image:Intermediate.png

This plot of Chris's data showing the top 15 male names since 1960 suggested to me that the top names might be decreasing in their share of names.



Image:Mary.png

This line chart of the top 100 female names shows the rapid drop in Mary's share through the 20th century.



Image:Men.png

Males lacked the consistent front-runner that Mary was for females, but the trend is equally apparent.



Image:TopRanked.png

The percentage of total births receiving the top-ranked name. Shape encodes sex; changes in color indicate changes in the top-ranked name. The decreasing share over the 20th century is unmistakable.



Image:BottomsUp.png

At the same time, the proportion allotted to the 1000th-ranked names has risen slightly. Note the variety in color (encoding names as above).



Image:TopThirty.png

The aggregated percentage of the top 50 names. Again, the decreasing share of these names is clear.
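The aggregation behind a chart like this one can be sketched as a sum of the Percent column over the top n ranks, grouped by decade and sex. This is a minimal sketch rather than what Spotfire does internally; the key names follow the column list under Alterations (with Percent shortened as an assumption), the cutoff n is a parameter, and the sample values are invented.

```python
from collections import defaultdict


def top_n_share(rows, n):
    """Sum the percent of births for ranks 1..n, per (decade, sex)."""
    shares = defaultdict(float)
    for row in rows:
        if row["Rank"] <= n:
            shares[(row["Decade"], row["Sex"])] += row["Percent"]
    return dict(shares)


# Placeholder percentages for illustration only, not SSA figures.
sample = [
    {"Decade": "1900s", "Sex": "F", "Rank": 1, "Percent": 4.0},
    {"Decade": "1900s", "Sex": "F", "Rank": 2, "Percent": 2.5},
    {"Decade": "1900s", "Sex": "F", "Rank": 3, "Percent": 1.0},
    {"Decade": "1900s", "Sex": "M", "Rank": 1, "Percent": 2.25},
    {"Decade": "1900s", "Sex": "M", "Rank": 2, "Percent": 1.5},
    {"Decade": "1900s", "Sex": "M", "Rank": 3, "Percent": 0.5},
]
top_two = top_n_share(sample, 2)
top_one = top_n_share(sample, 1)
```

With n set to 1 the same function gives the share of the single top-ranked name, and with n set to 1000 it gives the total share of all listed names.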



Image:Top1000.png

While the top 1000 names still account for nearly 70% of female births and 82% of male births, there has been a marked drop-off since the 1950s.


Conclusions

It is evident from these visualizations of the top 1000 baby names by decade that variety is on the rise. The dramatic decrease in the percentage of babies receiving the most popular names, and even any of the top thousand, shows that the distribution of names has not been consistent over the last century.

Though frustration led me to a more straightforward question, I am still curious about the overlap between place names and baby names, and I will continue to play with the dataset and Spotfire in search of such a pattern.


