# A2-MichaelCohen

### From CS294-10 Visualization Sp11

## Contents |

## Michael Cohen - Feb 05, 2011 02:13:23 pm

For my A2, I will be performing some exploratory data analysis on a data set from another class I'm taking this semester, "Energy and Environmental Markets" at the business school. A big part of this class is participating in an Electricity Strategy Game where each team is in charge of selling power from a portfolio of generating units into a simplified deregulated wholesale electricity market. The timeline for A2 conveniently overlaps with the two initial "practice days" for the energy strategy game and the beginning of the actual game. For the practice days, each team is assigned a portfolio, so I'll get to warm up by asking a fairly simple question: what price should my team bid for each of our generating units, given the uncertain market demand and actions of other players? Answering this will involve constructing a traditional supply/demand curve (except for the uncertainty in the demand) and will give me a chance to get comfortable with the data and Tableau.

Once the practice days are done, all generation portfolios are returned to the pool and an auction will be held where the teams bid on the portfolios for use in the real game. This will give me an opportunity to explore a trickier question: how much is each portfolio worth to my team, given uncertainties about demand? I am less sure of what kind of visualization will be best for this, but I expect that playing with the data will give me some ideas.

The data available are:

- Capacity, variable costs and fixed costs for each plant in each portfolio
- Demand forecasts in the form of a linear equation: the y-intercept (essentially the demand if supply were free) and slope (the degree to which demand diminishes with price) predicted for each hour of each game day. (Each game day has four hours.) The demand forecast line is considered to have a standard error of 3%.

The plant data will need to have a column added explicitly tagging each plant (row) with its portfolio ID; in the Excel sheet we're given, the portfolio name is listed above each group of rows, rather than in each row. I'm hoping that Tableau will have a way to plot a line based on the intercept and slope data given, but if not I'll also need to have Excel generate sample points along each line.

## Michael Cohen - Feb 05, 2011 06:39:34 pm

Bah. Well, the good news is that after a lot of twiddling with Tableau (and adding a data column for cumulative capacity available at less than the marginal cost of the current plant, so Tableau would know where to plot the start of each plant's line segment) I was able to plot a nice supply curve. It's color coded so that each power plant line segment can be matched up to the portfolio it comes from. It's just what I wanted.

However, I then tried to add the demand curves for various days and hours. Since Tableau doesn't seem to have a facility for plotting lines with equations, I created a spreadsheet with calculated values for the end points of each demand curve. (Since the demand curves are just straight lines, I only need two data points for each curve.) A glance at the supply curve showed me that the supply price will range up to about $70/MWh, so I used 0 and 70 as my price end points, and calculated the demand quantities at each end point from the linear equations given for each hour in the data set. In addition to calculating a midpoint estimate for each hour, I also calculated a low estimate (midpoint - 6%) and a high estimate (midpoint + 6%) so I could plot a ~95% confidence interval for the demand as well.

I set up my demand data set in these columns:

Day (1-2) | Hour (1-4) | Range (low, mid, high) | Price (0 or 70) | Quantity | Label (just a concatenation to label each line, like "D1H1, low")

That way I can easily filter to curves for specific days, hours, and ranges.

The demand curves were straightforward to plot in Tableau, but trying to combine the supply and demand charts on the same set of axes has been a non-starter. Tableau is plotting information about power plant rows in my supply curve, and raw price/quantity points on my demand curve. Even though I know the end results are in the same units, any time I try to drag demand data into a supply chart axis, Tableau complains that it can't blend the measures because no join method has been defined. But I don't think I can really "join" these two data sets in a relational sense -- they're not connected attributes of related items, they're two unrelated data sets that I happen to want to plot together.

It seems that I'll need to either do some serious Excel munging to make it so that the supply and demand values can be expressed in the same data table (which I think would involve reducing the supply data to a bunch of price/quantity end points, which is awkward) or I'll need to use a different tool.

## Michael Cohen - Feb 08, 2011 11:02:47 pm

I decided to swallow my pride and just combine the supply and demand curves using an image editor (Pixelmator) since that was the fastest way to answer the question and I wanted to have something to work with before our first bids were due. The result is below - click for a larger version. If this were intended for public use, I would spend time cleaning up the color scheme, overlaps, etc., but the current version gets the job done.

One insight that came directly out of this graphic is that the "Big Gas" portfolio (red) may try to exercise significant market power in hour 3 (H3) because they control much of the marginal capacity near the demand forecast for that hour, and the price rises fairly steeply in that area. We used this knowledge to tweak our bids for that hour. (We have the "Bay Views" portfolio (blue) for the practice round.)

Now that I have a system (albeit a little clumsy) for helping us figure out how to bid in the practice rounds, it's time to turn my attention to the more complex problem of visualizing how much each portfolio is worth for the auction next week.

## Michael Cohen - Feb 12, 2011 01:57:15 am

Much more success with Tableau this time around.

In order to answer the question of how much we should bid for each portfolio, I set up a Monte Carlo simulation using Excel and Crystal Ball. The simulation is based on the provided supply and demand data for year 1 of the game. The simulation of bidding behavior is fairly naive (nobody exercises serious market power) but it's good enough to get a conservative idea of how each portfolio will do. The simulation runs 1000 games, varying demand randomly for each hour within the distributions specified by the game rules. Thus, it spits out 1000 simulated profit values for each portfolio.

I did need to transform this data to make it Tableau-friendly. Crystal Ball initially generated it with the trial numbers (1-1000) running down the first column and the portfolio names across the remaining columns, with the profit value for each portfolio-run in the cells beneath. This caused Tableau to interpret each portfolio as its own measure, which was not what I wanted. I used a pivot table in Excel to move the portfolio names into a column next to a column of profit numbers, so Tableau could see the portfolio name as a dimension and the profit numbers as a measure.

I could tell from Crystal Ball's initial summary that the distributions of simulated profits had quite a "long tail" on the high end, because in the rare event that electricity demand is very high, that tends to result in a windfall for all teams. (This is also why the mean is so much higher than the median -- when you beat the median, you can beat it by hundreds of thousands of dollars; when you don't make the median, you tend not to miss it by much). Since the distribution was "interesting" (but not the most relevant data for coming up with a specific bid) I decided to make it visible in the background of my visualization. This thread was helpful in figuring out how to get the plot I wanted, especially the file QuantileSample.twbx that was attached. I played around with a couple other ways of showing the distribution (e.g., binned histograms) but this version seemed to give the best intuition for the data while allowing the mean and quantile lines to stand out as the dominant elements, which I wanted since they're most important for boiling the data down to a single bid number.

After some experimentation, I decided to label just the mean and median lines. These are the two most likely candidates for an "expected value" and labeling the other quantile lines made things too cluttered. If my team agrees that we'd rather be conservative and use, say, the 25th percentile, I can always update the chart.

This visualization can't outright answer the question "what is the maximum bid we should make for each portfolio?" because the answer depends strongly on our risk tolerance (not to mention our faith in the simulation!) which is a subjective decision. However, I think the chart gives a good sense of the possible outcomes, relative values of the portfolios, and interesting extreme scenarios that we'll want to keep in mind as we play the game.