A2-DouglasMason

From CS294-10 Visualization Fa08

Revision as of 18:22, 24 September 2008 by Dmason

Fun With Plants

Motivation

In sixth grade I was required to collect a sample of fifty wildflowers in the Ohio region. I remember a number of questions that came to mind: what caused the colors to be different? Why were some plants edible and others not? Why did some grow in some areas and not others?

When I came across the PLANTS database, my imagination was set afire! I could finally answer all these questions and more! A decade-old debate could be settled with the help of Tableau! So in a frenzy of activity, I embarked upon a voyage of discovery, and the rest (as you will read) is history...

Are there environmental influences on flower color?

Data Source

This dataset comes from the USDA PLANTS database [1]. This is an excellent website: you can filter the output by region and by all varieties of plant properties, and it gives you a huge text file with well-formatted CSV entries. The major problem is that the regional information does not read easily into Tableau. Getting it to "fit" would require a sophisticated filter on my end, so I have ignored the geographic information for the time being. I have restricted this dataset to the lower 48 states.

pH Balance, the Red Herring (or purplish-blue, depending on the litmus paper)

Image:ph_color.jpg

There are some flowers (hydrangeas, for example) whose color can be dramatically affected by the environment's pH level. In fact, gardeners use this effect for diabolical plots such as embedding graphical numbers in the garden to test a visitor's color vision. At least, that's what I do with MY garden!

Unfortunately, as it turns out, I could not find any correlation between pH and flower color in general.

Precipitation, a Better Measure

Image:precip_color.jpg

I made a number of changes to the organization of this graph, although it is structurally similar. The colors are now ordered by the average maximum tolerated precipitation of each color population. The black population has been removed, since it has only a few data points (the others have hundreds) and is not representative. I was surprised to find that Brown had the highest precipitation demands; Green surprised me less.
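For readers without Tableau, the aggregation behind this chart can be sketched in plain Python. This is only an illustration under assumptions: the field names `color` and `precip_max`, the sample rows, and the `min_count` cutoff are all made up here and are not taken from the PLANTS export or from my Tableau workbook.

```python
from collections import defaultdict
from statistics import mean

def mean_precip_by_color(rows, min_count=2):
    """Average maximum precipitation per flower color, largest first.

    Colors with fewer than min_count plants are dropped as unrepresentative.
    The field names "color" and "precip_max" are hypothetical.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["color"]].append(row["precip_max"])
    averages = {
        color: mean(values)
        for color, values in groups.items()
        if len(values) >= min_count
    }
    # Sort by the average, highest precipitation demand first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Made-up sample rows, not real PLANTS data.
sample = [
    {"color": "Brown", "precip_max": 60},
    {"color": "Brown", "precip_max": 80},
    {"color": "Blue", "precip_max": 30},
    {"color": "Blue", "precip_max": 40},
    {"color": "Black", "precip_max": 55},
]
ranking = mean_precip_by_color(sample)
print(ranking)
```

With the real data the cutoff would be much larger than 2, since the well-populated colors have hundreds of entries each.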

Fire-Resistance and Precipitation

Image:fire_color.jpg

The next question came up -- could there be a connection between precipitation demands and fire resistance? Obviously at this point I am experimenting with combinations of data, with some startling results! It is not too surprising, for instance, that fire-resistant plants show a general, although slight, increase in precipitation demands. What is interesting, however, is that this trend reverses for green-flowering plants. No extreme points caused this on their own, so the origin of this trend remains a mystery. (Note: I have omitted orange-flowering plants from this graph, since there are no fire-resistant plants in this category.)
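The comparison in this graph amounts to a per-color difference of group means, which can be sketched as follows. The field names (`color`, `fire_resistant`, `precip_max`) and the sample values are hypothetical, chosen only to show the shape of the calculation; a color with no fire-resistant plants is skipped, just as Orange was omitted above.

```python
from collections import defaultdict
from statistics import mean

def fire_resistance_gap(rows):
    """Per color: mean precip_max of fire-resistant plants minus the rest.

    Colors lacking either group are skipped, since no comparison is possible.
    All field names are hypothetical.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for row in rows:
        groups[row["color"]][row["fire_resistant"]].append(row["precip_max"])
    gaps = {}
    for color, by_flag in groups.items():
        if by_flag[True] and by_flag[False]:
            gaps[color] = mean(by_flag[True]) - mean(by_flag[False])
    return gaps

# Made-up sample rows, not real PLANTS data.
sample = [
    {"color": "Red", "fire_resistant": True, "precip_max": 50},
    {"color": "Red", "fire_resistant": False, "precip_max": 40},
    {"color": "Green", "fire_resistant": True, "precip_max": 35},
    {"color": "Green", "fire_resistant": False, "precip_max": 45},
    {"color": "Orange", "fire_resistant": False, "precip_max": 30},
]
gaps = fire_resistance_gap(sample)
# A negative gap marks a color where the general trend reverses.
reversed_colors = [color for color, gap in gaps.items() if gap < 0]
print(gaps, reversed_colors)
```

A difference of means says nothing about significance, so a reversal found this way is a prompt for a closer look rather than a conclusion.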

Time-Series Correlating Frost-Free to Precipitation

Image:bloom_scat.jpg

Continuing with the experimentation, I then wondered how the tolerance to frosty days (listed as a minimum number of days the plant can tolerate below-freezing temps) correlated with the maximum tolerated precipitation, which proved a good metric earlier. Overall, the correlation wasn't impressive, but when I split the data into a time series (small multiples), I found that the correlation had a marked connection to the bloom period of the plant, increasing with the lateness of the season. I have encoded the size of the plant in both the size and the color of each point, which seems intuitive to me since it also shows how plant size changes with the season in which the plants bloom.
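I judged the correlation in each panel by eye, but a number can be put on it by computing a Pearson correlation coefficient per bloom period. The sketch below is one way to do that under assumptions: the field names (`bloom_period`, `frost_days`, `precip_max`) and the sample rows are invented for illustration, and Pearson's r is my choice of measure here, not something Tableau reported.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)

def correlation_by_bloom_period(rows):
    """One correlation per bloom period (one value per small multiple)."""
    groups = defaultdict(lambda: ([], []))
    for row in rows:
        xs, ys = groups[row["bloom_period"]]
        xs.append(row["frost_days"])
        ys.append(row["precip_max"])
    return {
        period: pearson(xs, ys)
        for period, (xs, ys) in groups.items()
        if len(xs) >= 2
    }

# Made-up sample rows, not real PLANTS data.
sample = [
    {"bloom_period": "Spring", "frost_days": 100, "precip_max": 30},
    {"bloom_period": "Spring", "frost_days": 150, "precip_max": 50},
    {"bloom_period": "Spring", "frost_days": 200, "precip_max": 40},
    {"bloom_period": "Summer", "frost_days": 100, "precip_max": 20},
    {"bloom_period": "Summer", "frost_days": 150, "precip_max": 40},
    {"bloom_period": "Summer", "frost_days": 200, "precip_max": 60},
]
result = correlation_by_bloom_period(sample)
print(result)
```

Comparing the per-period values side by side would show whether the correlation really grows with the lateness of the season.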

Bringing it all together with color

Image:color_all.jpg

The next step was to perform the same small multiples and bring the color in. I love this plot since it combines a fair amount of information in one viewing -- it groups each color of flower and plots it x-y, with precipitation maximum along the x-axis and minimum temperature tolerance along the y-axis, with each point's size representing the plant's height at maturity. In addition, the density of dots indicates the relative frequency of these plants. I think this does a great job of conveying a fair number of qualitative, rather than quantitative, relationships. For instance, you can see that blue flowers are all very small, with lower precipitation requirements but a variety of freezing temperature tolerances. The tendency for brown flowers to require a lot of precipitation is now apparent: there are few of them (and they are also very large), but they are spread out across the high range of precipitation. In contrast, purple flowers seem to group into a very small range on the scatter plot, with similar plant sizes. As a side note, I found that I needed to "delete" certain entries that exhibited an extremely large precipitation requirement, since they caused the graphs to scale poorly for viewing the vast majority of the data.
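The "delete" step in the side note can be made explicit and repeatable by filtering on a cutoff and keeping track of what was removed. This is a minimal sketch, assuming a hypothetical `precip_max` field and an arbitrary cutoff; I picked the entries to remove by eye, so no particular threshold is implied.

```python
def drop_precip_outliers(rows, cutoff):
    """Split rows into (kept, dropped) by a maximum-precipitation cutoff.

    The field name "precip_max" and the cutoff value are hypothetical.
    """
    kept = [row for row in rows if row["precip_max"] <= cutoff]
    dropped = [row for row in rows if row["precip_max"] > cutoff]
    return kept, dropped

# Made-up sample rows, not real PLANTS data.
sample = [
    {"name": "plant A", "precip_max": 40},
    {"name": "plant B", "precip_max": 55},
    {"name": "plant C", "precip_max": 400},
]
kept, dropped = drop_precip_outliers(sample, cutoff=100)
print(len(kept), "kept;", [row["name"] for row in dropped], "dropped")
```

Returning the dropped rows alongside the kept ones makes it easy to report exactly which entries were excluded from the plot.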


