A3-SethHorrigan
From CS294-10 Visualization Fa08
Data Domain
Choosing a data domain has been very difficult for me. I gradually narrowed the choice down to two options: CMU StatLib's Breakfast Cereal Data and the Daedalus Project's (http://www.nickyee.com/daedalus/) questionnaire response set that I used in assignment 2. On the one hand, I really like breakfast cereals and would love to have an interactive visualization of their properties to explore. On the other hand, the data set is hardly new and visualizing the MMO data is much more likely to be useful to MMO researchers like Nick Yee; plus, I have gotten to explore the data using statistical methods, but being able to visually interact with it would be just as, if not more, interesting. In the end, I have chosen to attempt a useful visualization of the Daedalus Project's questionnaire responses.
The data represents answers to 62 questions by 3,250 distinct MMO players:
variable name | variable type | variable label
Age | numeric | R Age
other | text | The game I play the most of currently is (other)
Hours | numeric | I spend about x hours each week playing the game
percents | interest | How interested are you in the precise numbers and percentages underlying the gam
optimize | importance | How important is it to you that your character is as optimized as possible for t
template | frequency | How often do you use a character builder or a template to plan out your characte
collecting | howmuch | How much do you enjoy collecting distinctive objects or clothing that have no fu
customize | howmuch | How much time do you spend customizing your character during character creation?
exploreworld | howmuch | How much do you enjoy exploring the world just for the sake of exploring it?
findstuff | howmuch | How much do you enjoy finding quests, NPCs or locations that most people do not
group1 | grouppref | Would you rather be grouped or soloing?
group2 | importance | How important is it to you that your character can solo well?
group3 | howmuch | How much do you enjoy working with others in a group?
leadgroup | howmuch | How much do you enjoy leading a group?
lookunique | importance | How important is it to you that your character looks different from other charac
matchcolor | importance | How important is it to you that your character's armor / outfit matches in color
recognition | importance | How important is it to you to be well-known in the game?
takecharge | frequency | How often do you take charge of things when grouped?
agitated | agitated | How agitated do you get if the servers go down unexpectedly?
complain | frequency | How often do your friends or family members complain about your game play behavi
happiness | amount | How much of your happiness in life currently is derived from playing the game?
limittime | difficult | How difficult would it be for you to limit your playing time?
socialsuffer | suffer | Have your personal relationships suffered because of your game-play?
spendtime | toomuch | Do you spend more time than you think you should playing the game?
worksuffer | suffer | Has your work/school performance suffered because of your game play?
level | importance2 | Leveling up your character as fast as possible.
acquire | importance2 | Acquiring rare items that most players will never have.
powerful | importance2 | Becoming powerful.
accumulate | importance2 | Accumulating resources, items or money.
mechanics | importance2 | Knowing as much about the game mechanics and rules as possible.
selfsufficient | importance2 | Having a self-sufficient character.
immersed | importance2 | Being immersed in a fantasy world.
escaping | importance2 | Escaping from the real world.
help | enjoy | Helping other players.
know | enjoy | Getting to know other players.
chatting | enjoy | Chatting with other players.
competing | enjoy | Competing with other players.
killing | enjoy | Dominating/killing other players.
explore | enjoy | Exploring every map or zone in the world.
cguild | enjoy | Being part of a friendly, casual guild.
sguild | enjoy | Being part of a serious, raid/loot-oriented guild.
roles | enjoy | Trying out new roles and personalities with your characters.
annoy | enjoy | Doing things that annoy other players.
converse | frequency | How often do you find yourself having meaningful conversations with other player
personal | frequency | How often do you talk to your online friends about your personal issues?
support | frequency | How often have your online friends offered you support when you had a real life
stories | frequency | How often do you make up stories and histories for your characters?
roleplay | frequency | How often do you role-play your character?
avoid | frequency | How often do you play so you can avoid thinking about some of your real-life problems
relax | frequency | How often do you play to relax from the day's work?
provoke | frequency | How often do you purposefully try to provoke or irritate other players?
gender | gender | R Gender
game | game | The game I play the most of currently is
buy | buy | Have you ever bought virtual currency or items
flirt | flirt | Have you ever flirted with someone in the game that you didn't know in RL?
hours10 | hours10 | Have you ever played continuously for 10 hours
hours6 | hours6 | Have you ever played continuously for 6 hours
macro | macro | Have you ever used a third-party macro in an MMORPG
oppgen | oppgen | Has your main character ever been a character of the opposite gender?
raid4 | raid4 | Have you ever been in a raid / dungeon group that lasted for at least 4 hours
raid8 | raid8 | Have you ever been in a raid / dungeon group that lasted for at least 8 hours
tryquit | tryquit | Have you tried to quit the game but was unsuccessful?
Visualization Techniques and Storyboard
First off, I needed to decide how this data could be visualized interestingly. There are two aspects to the data that stood out immediately:
- How does one variable relate to another one (or two or three)?
- How can you visualize more information about a specific respondent?
Design and Justification
The design looks a bit like HomeFinder/FilmFinder and a bit like SpotFire (Why re-invent the wheel? Use familiar design...). In order to support both of the interactions I described above, I have two different types of interaction with the visualization software. A scatterplot works well for the data set because we have a large number of individuals who responded to a large number of questions. The scatterplot allows us to represent each individual on a specific set of questions with a dot, allowing us to see both details on the individual and an overall picture of the system. In the case where I show a large number of bar graphs (individual drill-down), I do so because each one represents just one dimension of variables (hence bar graphs), but displaying them together allows the advantages of small multiples (or not-so-small multiples, in this case 62), while preserving the information about the one individual in comparison to all the others on any given question.
The sketches/storyboard do not address one major concern: how does this overall design scale to 62 distinct variables? The sketches show only a handful of variables. I am still trying to figure this out. In the worst case, I will just implement this for the top 10, or 15, or 20 variables of interest to me. I am biting off a massive chunk to chew here - I just have to hope I don't choke on it.
Investigate relationships among variables
First, the visualization provides two axes. This will be implemented for this project. Each axis allows you to select any one of the 62 variables to plot on it. Thus any variable can be compared with any other in a scatterplot.
Initial sketch, axes design
Second, the visualization provides constraints. This is the last step and probably will not be done in this project. By selecting a specific set of constraints, the viewer can visually emphasize the subset of points that match the specified conditions. The others remain visible but are colored to indicate that they do not meet some of the specified conditions, as in Attribute Explorer.
Initial sketch, filtering
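The constraint idea above can be sketched in plain Java. This is only an illustration under my own assumptions, not code from the project: the class `ConstraintSketch`, the `Range` constraint type, and the gray-level mapping are all hypothetical, and constraints are assumed to be inclusive ranges on numerically coded answers.

```java
import java.util.Arrays;
import java.util.List;

class ConstraintSketch {
    /** Hypothetical constraint: an inclusive range on one variable (column index). */
    static final class Range {
        final int column;
        final int min;
        final int max;

        Range(int column, int min, int max) {
            this.column = column;
            this.min = min;
            this.max = max;
        }

        boolean accepts(int[] row) {
            int v = row[column];
            return v >= min && v <= max;
        }
    }

    /** Number of constraints this respondent fails; 0 means a full match. */
    static int misses(int[] row, List<Range> constraints) {
        int n = 0;
        for (Range r : constraints) {
            if (!r.accepts(row)) {
                n++;
            }
        }
        return n;
    }

    /** Assumed color rule: full matches are black (0); more misses give a lighter gray. */
    static int grayLevel(int misses, int total) {
        if (total == 0) {
            return 0;
        }
        return 200 * misses / total;
    }

    public static void main(String[] args) {
        // Hypothetical respondent: column 0 = Age, column 1 = Hours.
        int[] respondent = {25, 10};
        List<Range> constraints = Arrays.asList(new Range(0, 18, 30), new Range(1, 20, 40));
        int m = misses(respondent, constraints);
        System.out.println("misses=" + m + " gray=" + grayLevel(m, constraints.size()));
    }
}
```

Counting misses rather than using a plain match/no-match flag is what lets near-matches stay visually distinct from points that fail every condition.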
Investigate a specific respondent
First the visualization will allow you to click once on a respondent in the scatterplot. Doing so highlights that respondent and you can then change the variables in the plot and still see where the respondent falls (because he remains highlighted). This may be done in this project if time allows.
Initial sketch, highlighting
Second, the visualization allows you to double-click to drill down (details-on-demand), whereupon histograms of all of the variables will be shown, and the column containing the respondent will be emphasized. This is analogous to double-clicking on a film in FilmFinder except that it provides much more detail, and it does not maintain the same display with an overlay (much as I would like it to). Actually, the optimal design would be to show a reduced-size version of the graphs in the corner, then allow the user to make that full screen if he wished, but I do not yet know if I can do that. This is almost a separate program and probably will not be done in this project.
Initial sketch, drill down
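For the drill-down, each histogram needs two things: the count per answer and the column that holds the selected respondent. A minimal sketch, assuming answers are coded as integers from 0 to numBins - 1 and that a negative code means no response; `HistogramSketch` and its method names are hypothetical.

```java
import java.util.Arrays;

class HistogramSketch {
    /** Counts responses per code 0..numBins-1; out-of-range codes (e.g. -1) are skipped. */
    static int[] counts(int[] responses, int numBins) {
        int[] c = new int[numBins];
        for (int r : responses) {
            if (r >= 0 && r < numBins) {
                c[r]++;
            }
        }
        return c;
    }

    /** Index of the column to emphasize for one respondent, or -1 if they did not answer. */
    static int emphasizedBin(int[] responses, int respondent, int numBins) {
        int r = responses[respondent];
        return (r >= 0 && r < numBins) ? r : -1;
    }

    public static void main(String[] args) {
        // Hypothetical answers to one question from five respondents.
        int[] responses = {0, 2, 2, -1, 1};
        System.out.println(Arrays.toString(counts(responses, 3)));
        System.out.println("emphasize column " + emphasizedBin(responses, 1, 3));
    }
}
```

Repeating this per variable would give the full set of small multiples, with one emphasized column in each.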
Implementation
Overview and Variation from Storyboards
This program visualizes 61 variables collected from 3,250 respondents in phase 17 of Nick Yee's Daedalus project. 62 variables were collected but one of them, "other", was merely a clarification of which game the respondents played most often. This has been excluded from the visualization at this stage. Otherwise, the only major change from the storyboard is how the variables are selected. In the storyboard, 62 buttons were placed on both the x and the y axis to control which two variables are being viewed; however, space and design constraints at this stage necessitated using two drop-down selection boxes instead (one for the x and one for the y). The usability of this is somewhat reduced because both are along the x axis, which increases the cognitive load of mapping (putting one on the y axis reduced graphing space too much), and because this does not allow tooltips that describe the variable in more detail, so the user must map between the one-word variable name and the actual question it represents.
The visualization allows you to plot any two of the variables against each other in a scatterplot. Since many of the variables are categorical, simply plotting them on their respective axes would not be very informative (one point appears the same as 600 points), so I introduce random jitter in these cases to allow the user to see the density of responses.
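As an illustration of the jitter idea, here is a minimal sketch in plain Java. It assumes uniform jitter of a fixed half-width around each category code; the class name `JitterSketch`, the width of 0.3, and the sample codes are my own choices for the example and are not taken from the program.

```java
import java.util.Random;

class JitterSketch {
    /** Offsets a categorical code by a uniform random amount in [-width, +width). */
    static double jitter(int code, double width, Random rng) {
        return code + (rng.nextDouble() * 2.0 - 1.0) * width;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so the layout is repeatable
        int[] responses = {3, 3, 3, 1, 5}; // hypothetical answer codes
        for (int r : responses) {
            System.out.printf("%d -> %.3f%n", r, jitter(r, 0.3, rng));
        }
    }
}
```

Keeping the width below 0.5 means neighboring categories never overlap, so each cloud of points still reads as one answer.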
Also, the main reason for providing this as a scatterplot is to investigate the variables as a sum of individual contributors, so I provide facilities to track specific points across variables via brushing (click to highlight in red). As you roll over a point, a tooltip describing its exact values also appears: non-responses to specific questions are represented as -1. The ordinal values show the numeric representation of the response rather than its textual equivalent.
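The bookkeeping behind brushing and the rollover text can be sketched without any toolkit code. The sketch below is hypothetical: `BrushingSketch`, the toggle-on-second-click behavior, and the tooltip wording are assumptions for illustration. Only the -1 marker for non-responses comes from the description above.

```java
import java.util.HashSet;
import java.util.Set;

class BrushingSketch {
    static final int NO_RESPONSE = -1; // marker for a skipped question

    // Row indices of highlighted respondents; kept separate from the axis
    // selection so the highlight survives a change of variables.
    private final Set<Integer> highlighted = new HashSet<>();

    /** Assumed behavior: a click highlights a respondent, a second click clears it. */
    void click(int respondent) {
        if (!highlighted.add(respondent)) {
            highlighted.remove(respondent);
        }
    }

    boolean isHighlighted(int respondent) {
        return highlighted.contains(respondent);
    }

    /** Builds rollover text from the two plotted values, shown as numeric codes. */
    static String tooltip(String xName, int x, String yName, int y) {
        return xName + ": " + x + ", " + yName + ": " + y;
    }

    public static void main(String[] args) {
        BrushingSketch brush = new BrushingSketch();
        brush.click(7);
        System.out.println("respondent 7 highlighted: " + brush.isHighlighted(7));
        System.out.println(tooltip("Age", 25, "Hours", NO_RESPONSE));
    }
}
```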
Although I am quite happy with the program as it stands now, I must admit that it provides very little functionality that could not be achieved with Tableau, and it is harder to change the data set underlying the program. The ability to highlight a specific point and follow it through different variable comparisons can be quite useful, but the visualization is limited to two dimensions (x and y position) and to scatterplots. On the other hand, now that this is available, it will be quite easy to extend it to provide much of the functionality outlined in the storyboards for future phases of the project (e.g. drill-down on a specific respondent, filter variables and color them like Attribute Explorer).
Executable
In most operating systems, you can run the jar file by double-clicking (or the equivalent action). Alternatively, open a command line, navigate to the directory with the jar file, and type "javaw -jar Daedalus.jar".
Source Code
Source code is provided in two different forms:
- If you have Prefuse installed in Eclipse already, download this file.
- If you do not have Prefuse installed in Eclipse already, download this file which includes Prefuse instead. (Note: link fixed now).
To run the program,
- open Eclipse
- select File -> Import Project
- select General -> Existing Projects into Workspace (then click "Next>")
- click "Select archive file" then browse to find the zipped file
- once the project has been added, find Daedalus.java in src in the visualization project, select it and press "Alt+Shift+x,j" to run the program
Development Process
Not counting the initial storyboarding and the time spent creating this page (that is, just counting the time spent installing Prefuse and coding/testing the program) I spent 32 hours developing this program. The bulk of the time went to working around the limitations of Prefuse. A large portion of that time was spent on getting the axes labeled and sized correctly. I had both ordinal and continuous variables. For Prefuse to plot the ordinal data correctly, I needed to give it the numeric representations of the data, but for it to be useful for the user, it needed to be the labeled values. In order to achieve this, I had to read through and understand the entirety of AxisLayout and AxisLabelLayout then create subclasses of them that would work as I needed (overriding parts as necessary). Also, RendererFactory could not provide the necessary functionality so I had to subclass and modify it as well. Thus, much of the time was spent extending the functionality of Prefuse to allow me to do what I needed.
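The underlying problem, numeric codes for positioning and text labels for display, can be shown independently of Prefuse. The sketch below does not use or reproduce the AxisLayout and AxisLabelLayout subclasses described above; `OrdinalAxisSketch`, its methods, and the three-point scale are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OrdinalAxisSketch {
    // Maps each numeric code to the text shown at its tick mark.
    private final Map<Integer, String> labels = new LinkedHashMap<>();

    OrdinalAxisSketch add(int code, String label) {
        labels.put(code, label);
        return this;
    }

    /** Text for the tick at this code; falls back to the number if unlabeled. */
    String tickLabel(int code) {
        return labels.getOrDefault(code, Integer.toString(code));
    }

    /** Fraction along the axis (0 at min, 1 at max) used to place a point or tick. */
    static double position(double value, double min, double max) {
        return (value - min) / (max - min);
    }

    public static void main(String[] args) {
        // Hypothetical frequency scale; the real response labels are not listed here.
        OrdinalAxisSketch axis = new OrdinalAxisSketch()
                .add(1, "Never")
                .add(2, "Sometimes")
                .add(3, "Always");
        for (int code = 1; code <= 3; code++) {
            System.out.println(position(code, 1, 3) + " " + axis.tickLabel(code));
        }
    }
}
```

Points are placed from the numeric code alone, and the label map is consulted only when drawing ticks, which is the separation the subclassing was meant to achieve.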
Additionally, Java's standard label components do not draw vertical text, but, horizontally, the full text of the questions is far too long to be practical when labeling the y-axis, so I had to find a way to generate vertical text. Happily, someone had written a class called VTextIcon, which I read through and incorporated into the program to generate vertical text.
The actual functional part of the program - graphing the points, allowing users to change the variables graphed, and highlighting specific points - took very little time at all. Much of it was already present as needed within Prefuse.
Seth Horrigan - Oct 13, 2008 03:01:06 pm
Resizing the window queues update operations, which, when combined with random jitter, makes for an interesting animation. The queued updates could be fixed, but the effect was too entertaining to remove, so I left it as it is.
