A2-SarahVanWart

From CS294-10 Visualization Fa08

Creating Visualizations with Existing Visualization Software

Data Source

I am interested in many different types of data, as finding meaning in statistics is fascinating in almost any arena. For this particular visualization, I selected an environmental dataset which describes the population density, GDP, land area, and various health and environmental indicators worldwide. The data is a product of a joint research effort between Yale and Columbia Universities (http://epi.yale.edu/CountryScores), and attempts to summarize the environmental profiles of various nations according to scores, as described below:


"The 2008 Environmental Performance Index (EPI) ranks 149 countries on 25 indicators tracked across six established policy categories: Environmental Health, Air Pollution, Water Resources, Biodiversity and Habitat, Productive Natural Resources, and Climate Change. The EPI identifies broadly-accepted targets for environmental performance and measures how close each country comes to these goals. As a quantitative gauge of pollution control and natural resource management results, the Index provides a powerful tool for improving policymaking and shifting environmental decisionmaking onto firmer analytic foundations."

The Question

The question that I'm interested in answering is: "How do population and economic factors affect a country's environmental health and ecosystem vitality?" To answer this question, I will use a number of the visual data manipulation techniques that the Tableau software package affords.

Question 1: What do the EPI scores look like for each region of the world, and can geographic location predict EPI Score?

Discussion

Tableau provided a variety of methods for obtaining a snapshot of EPI data by country, both tabular and map-based. To get an overview of the EPI scores for each country, I first generated a map in which darker red circles represent lower EPI scores (more environmental problems) and lighter shades represent higher EPI scores:

Image: Spatial Representation of EPI Scores


Since the EPI is based half on an "Environmental Health" (EH) score and half on an "Ecosystem Vitality" (EV) score, I next decided to output the EPI scores by country, along with the EH and EV scores. Note that I was also able to encode a third variable, "Region of the World," using color (Figure 1). I was also able to easily filter the data by region using Tableau's filtering feature. In Figure 2, I have output only Europe and North America (high EPI scores), and in Figure 3, I have output Sub-Saharan Africa (medium to low EPI scores).
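The half-and-half composition described above can be written out as a small function. This is only a sketch: the function name and the sample scores are made up for illustration, and the equal weighting is taken from the description in this section rather than from the EPI methodology documents.

```python
def epi_score(environmental_health: float, ecosystem_vitality: float) -> float:
    """Combine the EH and EV scores with equal weight (assumed 50/50)."""
    return 0.5 * environmental_health + 0.5 * ecosystem_vitality


# Made-up scores on a 0-100 scale, for illustration only (not real EPI data).
print(epi_score(90.0, 70.0))
```

Because the two inputs are averaged, two countries with the same EPI score can have very different EH and EV profiles, which is why the two components are worth charting separately.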

Images: Indicators By Country

Figure 1: All Countries
Figure 2: Europe and North America
Figure 3: Sub-Saharan Africa

I could see from the charts above that although EPI scores are made up of EV and EH scores, these two inputs factored into the overall score very differently, and hence needed to be analyzed with finer granularity.

Question 2: Is environmental health dependent on per capita GDP?

By looking at the overview, it was clear to me that certain regions of the world have much higher EPI scores than others. To delve further into the underlying variables that may be linked to the variations in "Environmental Performance Index," "Ecosystem Vitality," and "Environmental Health" scores, I created a graph that compared per capita GDP to Environmental Health (EH) for (1) all regions, and (2) only Europe, North America, and Sub-Saharan Africa. As seen from the graph, as per capita income rises, so does environmental health. The two, for the most part, seem strongly correlated. Note that according to the metadata available from the website, Environmental Health is made up of (a) the environmental burden of disease, (b) the effect of clean water on humans, and (c) the effect of air pollution on humans.

Images: Per Capita GDP Versus Environmental Health Indicators

Figure 4: All Countries
Figure 5: Europe, North America, and Sub-Saharan Africa

Conclusion: Yes, environmental health seems to be dependent on per capita income. The richer the nation, the higher its Environmental Health score tends to be.

Question 3: Is ecosystem health dependent on per capita GDP?

When I took a look at the Ecosystem Vitality Index data versus per capita GDP, no strong correlation seemed to exist. Certain wealthy countries had very poor environmental scores (United Arab Emirates, Kuwait, Belgium, and Australia), while certain poor countries had good environmental scores (Laos, Congo, and Malawi).

Images: Per Capita GDP Versus Ecosystem Health Indicators

Figure 6: All Countries
Figure 7: Europe, North America, and Sub-Saharan Africa

Conclusion: No, ecosystem health does not seem to be dependent on per capita income. Some rich countries have low environmental scores, and some poor countries have high environmental scores.

Question 4: Is ecosystem health dependent on land area or population density?

Since GDP and ecosystem health did not appear to be strongly linked, I plotted ecosystem health against population density and gave a third variable, land area, a size encoding. Neither land area nor population density seems to be correlated with ecosystem health: the points show no coherent pattern in either position or size.

Images: Population Density and Land Area Versus Ecosystem Health Indicators


Conclusion: No, ecosystem health does not seem to be dependent on land area or population density.

Final Visualization

After exploring the data, I found two trends: (1) a weak correlation between location and a higher EPI score [as seen in Figures 2 and 3], and (2) a stronger correlation between Per Capita GDP and Environmental Health [as seen in Figures 4 and 5]. That is:

  • countries in Sub-Saharan Africa and some countries in Asia tend to have the lowest EPI scores, while countries in Europe and North America tend to have the highest EPI scores, and
  • the more wealthy a country, the more likely it is to have a stronger environmental health score.

I find that the visualization below best describes this phenomenon:

Using the "Small Multiples" technique, the data is split into one panel for each of the nine geographic regions of the world. Within each panel, Per Capita GDP is plotted on the Y-Axis and each country's Environmental Health Index score is plotted on the X-Axis. A trend line is then superimposed to "best fit" each cluster of points, showing an upward trend (everywhere but North America, that is). By dividing the world into small multiples by region, we can make sense of some global environmental health trends from a large spreadsheet of data, and see that within almost every geographic region, the wealthier a country is, the healthier it is.
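The per-region "best fit" lines can be approximated outside Tableau with an ordinary least-squares fit computed separately for each region. The sketch below is an assumption about how such a trend line could be computed, not a description of Tableau's internals; the function names, the row layout, and the sample rows are all hypothetical placeholders rather than EPI data.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Happens for a single point or identical x values.
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def trend_by_region(rows):
    """rows: iterable of (region, eh_score, gdp_per_capita) tuples.

    Fits one line per region, with EH on the X-Axis and GDP on the Y-Axis.
    """
    groups = defaultdict(list)
    for region, eh, gdp in rows:
        groups[region].append((eh, gdp))
    trends = {}
    for region, points in groups.items():
        xs, ys = zip(*points)
        trends[region] = fit_line(xs, ys)
    return trends


# Placeholder rows for illustration only (not real EPI data).
rows = [
    ("Region A", 40.0, 1000.0),
    ("Region A", 60.0, 3000.0),
    ("Region A", 80.0, 5000.0),
    ("Region B", 90.0, 30000.0),
    ("Region B", 95.0, 25000.0),
]
for region, (slope, intercept) in trend_by_region(rows).items():
    direction = "upward" if slope > 0 else "downward or flat"
    print(f"{region}: slope={slope:.1f} ({direction})")
```

A positive slope corresponds to the upward trend seen in most panels. A region with very few countries gives an unstable fit, so a line drawn through a handful of points should be read with caution.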

Evaluation of Tableau

I found Tableau to be extremely helpful in answering the questions in which I was interested. Although I'm only a novice user, I would have liked to have been able to figure out correlation statistics between two variables. It's highly likely that such a feature is available, but I couldn't find it.
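A correlation statistic of the kind mentioned above can also be computed outside Tableau once the two columns are exported. The following is a minimal sketch of the Pearson correlation coefficient in plain Python; the function name and the sample values are hypothetical, and the numbers are placeholders rather than EPI data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Placeholder values for illustration only (not real EPI data).
gdp = [1000.0, 5000.0, 20000.0, 40000.0]
eh = [35.0, 60.0, 85.0, 95.0]
print(round(pearson_r(gdp, eh), 3))
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one. On Python 3.10 or later, the standard library's statistics.correlation function computes the same statistic.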


