A2-MattGedigian

From CS294-10 Visualization Fa08

Assignment 2

Question

I decided to look at transportation statistics. My question was whether light rail had more accidents, but fewer fatalities, than heavy rail.

Data Source

I was interested in either California state-level or United States national-level data. I found interesting data from the Bureau of Transportation Statistics (BTS) and the Census, available as PDF and CSV, but it was almost entirely broken down into very small pieces: each dataset included only the portion of data supporting a single report, basically answering a single question. I settled on using data from BTS's National Transportation Statistics.

I realized that there were actually three types of rail service. From the BTS glossary:

COMMUTER RAIL
Urban passenger train service for short-distance travel between a central city and adjacent suburb. Does not include rapid rail transit or light rail service.
LIGHT RAIL
A streetcar-type vehicle operated on city streets, semi-exclusive rights-of-way, or exclusive rights-of-way. Service may be provided by step-entry vehicles or by level boarding.
HEAVY RAIL
An electric railway with the capacity to transport a heavy volume of passenger traffic and characterized by exclusive rights-of-way, multicar trains, high speed, rapid acceleration, sophisticated signaling, and high-platform loading. Also known as “subway,” “elevated (railway),” or “metropolitan railway (metro).”

By "heavy rail", I had meant Caltrain-style rail service, which BTS actually classifies as "Commuter Rail". I altered my question to examine commuter rail instead.

The CSV files I found were not really tabular, in the sense that each row did not contain the full information for a record. They had section headers and labels which applied to subsets of the rows. It wasn't immediately obvious to me whether Tableau had facilities to reformat this, so I did it beforehand, partly by hand and partly with a Python script. Rather than having several measures on each row, I split everything apart so that each row contained only one dependent value.

Metric | Transit Type | Year | Value
Accidents per 100M VM | Light Rail | 1997 | 301

This is very space-inefficient, but it was the only way I was able to get Tableau to interpret the data correctly.
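A minimal sketch of this kind of reshaping, in Python. The input layout is an assumption, not the actual BTS file: a header row of years, then section-header rows that name a metric, each followed by one row per transit type. The function name `to_long` is hypothetical, and every number except the Light Rail 1997 value is made up for illustration.

```python
import csv
import io

# Hypothetical input: a header row of years, then section-header rows
# (metric name only) followed by one row per transit type.
# All numbers except the Light Rail 1997 value are illustrative.
RAW = """,1996,1997
Accidents per 100M VM,,
Light Rail,310,301
Commuter Rail,20,18
Fatalities per 100M VM,,
Light Rail,2,1
"""

def to_long(text):
    """Return one (metric, transit_type, year, value) tuple per data cell."""
    rows = list(csv.reader(io.StringIO(text)))
    years = [int(y) for y in rows[0][1:]]
    metric = None
    records = []
    for row in rows[1:]:
        if not row or not row[0].strip():
            continue  # skip blank lines
        label, cells = row[0].strip(), row[1:]
        if not any(c.strip() for c in cells):
            metric = label  # a section header applies to the rows below it
            continue
        for year, cell in zip(years, cells):
            if cell.strip():
                records.append((metric, label, year, float(cell)))
    return records

for record in to_long(RAW):
    print(*record, sep="\t")
```

Each output row carries its own metric, transit type, and year, so no row depends on a header above it.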

Exploratory Visualizations

When I opened my data in Tableau, the first thing I did was a basic Text Table (Cross Tab) visualization. This basically produces a formatted table which looks like the formatted table I had started with.


Although the Cross Tab looked fine, I could not get Tableau to properly recognize the year values as dates. I manually changed the data type in the Dimensions field (to Date and Date-Time, alternately) but it was interpreting values from 1980 to 2000 as being dates in 1905. I specified a custom date format string "YYYY" but that didn't work. Eventually I exited Tableau and manually rewrote the dates as 1980-01-01, 1981-01-01, etc.
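One plausible explanation for the 1905 dates, offered as an assumption rather than documented Tableau behavior: if a bare integer is read as a count of days from an epoch around 1900 (the spreadsheet-style convention), then 1980 through 2000 days all land in 1905. The sketch below checks that arithmetic and shows the workaround of rewriting each year as an ISO date. The epoch value and the helper names are assumptions for illustration.

```python
from datetime import date, timedelta

# Assumed epoch (spreadsheet-style serial dates); not confirmed for Tableau.
ASSUMED_EPOCH = date(1899, 12, 30)

def as_serial_day(n):
    """Interpret an integer as a number of days after the assumed epoch."""
    return ASSUMED_EPOCH + timedelta(days=n)

def year_to_iso(year):
    """Rewrite a bare year as the first day of that year, e.g. 1980-01-01."""
    return date(int(year), 1, 1).isoformat()

print(as_serial_day(1980).year, as_serial_day(2000).year)
print([year_to_iso(y) for y in (1980, 1981)])
```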

Figures: Rate per 100M Vehicle Miles (panels: Injured Persons, Fatalities, Accidents)


To clarify the change over time, I followed some of the steps here. Initially this yielded these:

Figures: Change since 1990 in Rate per 100M Vehicle Miles (panels: Injured Persons, Fatalities, Accidents)



Those seemed okay (apart from the vertical scaling being less than ideal). However, I realized that normalizing the starting point really only works for comparing percentage changes, not absolute changes. So I changed it to percent.
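The distinction can be sketched in Python: subtracting the 1990 value gives an absolute change in the metric's own units, while dividing by the 1990 value as well gives a percent change that is comparable across series of different magnitudes. The rates below are made-up numbers, not BTS data, and the function names are hypothetical.

```python
def change_since(series, base_year):
    """Absolute change from the base year, in the series' own units."""
    base = series[base_year]
    return {year: value - base for year, value in series.items()}

def percent_change_since(series, base_year):
    """Change from the base year as a percentage of the base-year value."""
    base = series[base_year]
    return {year: 100.0 * (value - base) / base for year, value in series.items()}

# Illustrative rates per 100M vehicle miles (not real data).
injuries = {1990: 400.0, 1995: 300.0, 2000: 200.0}
fatalities = {1990: 4.0, 1995: 3.0, 2000: 2.0}

print(change_since(injuries, 1990))
print(percent_change_since(injuries, 1990))
print(percent_change_since(fatalities, 1990))
```

In this example the two series differ by a factor of 100 in absolute change but have identical percent changes, which is why the percent view puts them on one scale.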

Figures: Percent Change since 1990 in Rate per 100M Vehicle Miles (panels: Injured Persons, Fatalities, Accidents)


I then turned to answering my question. I filtered out other modes of transit, leaving Heavy Rail, Light Rail, and Commuter Rail. Here are the views of the injury and fatality rates.

Rate of Injured Persons per 100M Vehicle Miles


We can see that, in general, Light Rail has far more injuries per vehicle mile than the other two modes. Over the selected time period, this rate decreases sharply. There are two interesting spikes on the graph: one for Commuter Rail injuries in 1992, the other for Light Rail injuries in 1996.

Rate of Fatalities per 100M Vehicle Miles


We see that, unlike the injury rate, the fatality rate is comparable for Light Rail and Commuter Rail. During the 1990s it was a mixed bag, with each outperforming the other about half the time (Light Rail did better in 1990, 1994, 1996, and 1997). Since 2000, Commuter Rail has had consistently lower fatality rates.

To directly address the original question, I computed the difference between Light Rail and Commuter Rail. I did this as a percentage of Commuter Rail, since that was lower. The result was the following visualizations:
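As a rough illustration of that calculation (done in Tableau in the original; this Python version is only a sketch), the per-year difference is divided by the Commuter Rail value. The function name and the sample rates are hypothetical.

```python
def relative_difference(light, commuter):
    """Per-year (light - commuter) as a percentage of the commuter value."""
    result = {}
    for year in sorted(light.keys() & commuter.keys()):
        result[year] = 100.0 * (light[year] - commuter[year]) / commuter[year]
    return result

# Illustrative fatality rates per 100M vehicle miles (not real data).
light_rail = {2001: 6.0, 2002: 9.0}
commuter_rail = {2001: 3.0, 2002: 3.0}

print(relative_difference(light_rail, commuter_rail))
```

A value of 100 means Light Rail had twice the Commuter Rail rate that year; a negative value means Light Rail did better.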

Figures: Rate per 100M Vehicle Miles (panels: Fatalities, Injured Persons)



Presentation

Figure 1: Light Rail and Commuter Rail Comparison

Light Rail transit systems operate on city streets, while Commuter Rail systems connect urban areas with surrounding suburbs. Figure 1 compares injury and fatality statistics for these two modes of transit in the United States over a 15-year period (1990-2005). The rate of injuries per vehicle mile is substantially higher for Light Rail. This might be expected, since Light Rail operates on city streets and sometimes lacks exclusive right-of-way. As the figure shows, the gap varies widely from year to year, from over 40 times the Commuter Rail rate to less than 5 times. The rate of fatalities per vehicle mile is also generally higher for Light Rail, which is somewhat surprising given its low speed of operation. For most of the period surveyed, the difference in fatality rates was moderate, with Light Rail occasionally outperforming Commuter Rail. Since 2001, however, Light Rail has been consistently more lethal, with two to five times the fatality rate.

Evaluation

I enjoyed using Tableau, even though I don't think I mastered it. My main difficulties were in controlling aspects of the visual presentation (e.g. axis labels, line styling, value markers). I found the Table Calculations quite useful, but they seemed unnecessarily tied to the visual presentation: when I tried to remove the Commuter Rail line, it broke the calculation of the percentage relative to that value. Similarly, when I changed the line colors (to differentiate Injuries from Fatalities, rather than Commuter Rail from Light Rail), it broke the calculation.


