From CS294-10 Visualization Sp11
This page contains links to data sets available online. Feel free to add new links.
Public Data Repositories
In recent years, a number of web sites hosting public data repositories have been created. The available data sets include both user-generated content and official data from various organizations.
- United Nations Data
- OECD Statistics Center
- NationMaster and StateMaster statistics repositories
- The Sunlight Foundation maintains a list of resources for political transparency.
- Many-Eyes, a site for public data and visualization
- Data360, yet another data sharing site
- CMU Statistical Data Repository
- NIST (National Institute of Standards and Technology) Scientific and Technical Databases
- Statistical Science Data Sets - Large index of data sets from fully processed to raw.
- LexisNexis Statistical Universe - Just about everything. Be sure to check the box that says "Limit to Documents with Excel Spreadsheets".
- The Journalists Database of Databases - A good collection of interesting data, mostly government, social, and economic.
- Fathom Data Sets - Various nice data sets meant for use with the visualization program Fathom.
Agriculture, Food and Nutrition
- World wine statistics - Information on worldwide wine production and consumption.
- USDA food nutrient data - Information about the nutrients contained in a number of different foods and food groups.
- USDA PLANTS Database - The PLANTS Database provides standardized information about the vascular plants, mosses, liverworts, hornworts, and lichens of the U.S. and its territories. It includes names, plant symbols, checklists, distributional data, species abstracts, characteristics, images, plant links, references, crop information, and automated tools.
- Frequently occurring first and last names - U.S. Census Bureau genealogical data on names.
- Popular baby names - Social Security Administration data on distributions of given names.
- DHS Yearbook of Immigration Statistics - "The Yearbook of Immigration Statistics is a compendium of tables that provides data on foreign nationals who, during a fiscal year, were granted lawful permanent residence (i.e., admitted as immigrants or became legal permanent residents), were admitted into the United States on a temporary basis (e.g., tourists, students, or workers), applied for asylum or refugee status, or were naturalized. The Yearbook also presents data on immigration law enforcement actions, including alien apprehensions, removals, and prosecutions."
- Human Mortality Database - The Human Mortality Database (HMD) was created to provide detailed mortality and population data to researchers, students, journalists, policy analysts, and others interested in the history of human longevity.
National Surveys of 8th Graders
A nationally representative sample of eighth-graders was first surveyed in the spring of 1988. A sample of these respondents was then resurveyed through four follow-ups in 1990, 1992, 1994, and 2000. On the questionnaire, students reported on a range of topics including: school, work, and home experiences; educational resources and support; the role in education of their parents and peers; neighborhood characteristics; educational and occupational aspirations; and other student perceptions.
The .xls file contains 2000 records of students' responses to a variety of questions at different points in time. The codebook explains the question and answer codes.
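As a rough illustration of how the codebook might be applied, the sketch below maps coded answers to readable labels. It assumes the .xls has been exported to CSV first; the column names (STU_ID, SEX, HW_HOURS) and the codes are hypothetical placeholders, not the survey's actual variables, so substitute the real ones from the codebook.

```python
import csv
import io

# Hypothetical codebook: column name -> {answer code -> label}.
# The real column names and codes must come from the survey's codebook.
CODEBOOK = {
    "SEX": {"1": "Male", "2": "Female"},
    "HW_HOURS": {"0": "None", "3": "3 to 5 hours per week"},
}

# Made-up sample standing in for a CSV export of the .xls file.
SAMPLE = """STU_ID,SEX,HW_HOURS
1001,1,3
1002,2,0
"""


def decode_row(row, codebook):
    """Replace coded answers with labels; leave unknown codes unchanged."""
    return {col: codebook.get(col, {}).get(val, val) for col, val in row.items()}


rows = [decode_row(r, CODEBOOK) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r)
```

Columns without a codebook entry (such as the ID column here) pass through untouched, so the same function can be run over the whole file before loading it into a visualization tool.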
- Bureau of Labor Statistics - From the Department of Labor.
- King County department of assessments - Data on housing and properties in King County, Washington state.
- Baseball Statistics - The Lahman baseball database, 1871-present.
- Google Trends - Track the average worldwide traffic of any search term. Once you get the results, scroll to the bottom of the page and look for "Export this page as a CSV file". You must be logged into Google for the feature to work.
Politics and Government
Florida 2000 Ballot Data
This data set is Florida election data from the CMU Statistical Data Repository. (Note: when downloading these files, be sure to use the correct "save-file" operation for your browser; IE tends to add extra characters that confuse the programs.)
U.S. House of Representatives Roll Call Data
This contains roll call data from the 108th House of Representatives: data about 1218 bills introduced in the House and how each of its 439 members voted on each bill. The data covers the years 2003 and 2004. The individual columns are a mix of information about the bills and about the legislators, so there's quite a bit of redundancy in the file for the sake of easier processing in Tableau.
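If the redundancy gets in the way outside Tableau, one option is to split the flat file into separate bill, legislator, and vote tables. The sketch below shows the idea on a made-up sample; the column names (bill_id, bill_title, member_id, member_name, party, vote) are assumptions for illustration and will need to be matched to the actual file's headers.

```python
import csv
import io

# Made-up rows in the same flat shape: bill and legislator details
# are repeated on every vote row. Column names are hypothetical.
SAMPLE = """bill_id,bill_title,member_id,member_name,party,vote
H1,Example Bill A,M1,Rep. One,D,Yea
H1,Example Bill A,M2,Rep. Two,R,Nay
H2,Example Bill B,M1,Rep. One,D,Nay
H2,Example Bill B,M2,Rep. Two,R,Yea
"""


def split_roll_calls(rows):
    """Split flat vote rows into bills, members, and (bill, member, vote) triples."""
    bills, members, votes = {}, {}, []
    for row in rows:
        # Repeated bill and member details collapse to one entry per key.
        bills[row["bill_id"]] = row["bill_title"]
        members[row["member_id"]] = (row["member_name"], row["party"])
        votes.append((row["bill_id"], row["member_id"], row["vote"]))
    return bills, members, votes


bills, members, votes = split_roll_calls(csv.DictReader(io.StringIO(SAMPLE)))
print(len(bills), "bills,", len(members), "members,", len(votes), "votes")
```

Keeping the vote triples separate makes it straightforward to count votes per party or per bill without re-reading the repeated descriptive columns.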
Government Spending Data
Have you ever wanted to find more information on government spending? Have you ever wondered where federal contracting dollars and grant awards go? Or perhaps you would just like to know, as a citizen, what the government is really doing with your money.