A3-SethHorrigan

From CS294-10 Visualization Fa08

Data Domain

Choosing a data domain has been very difficult for me. I gradually narrowed the choice down to two options: CMU StatLib's Breakfast Cereal Data and the Daedalus Project's (http://www.nickyee.com/daedalus/) questionnaire response set that I used in assignment 2. On the one hand, I really like breakfast cereals and would love to have an interactive visualization of their properties to explore. On the other hand, the data set is hardly new and visualizing the MMO data is much more likely to be useful to MMO researchers like Nick Yee; plus, I have gotten to explore the data using statistical methods, but being able to visually interact with it would be just as, if not more, interesting. In the end, I have chosen to attempt a useful visualization of the Daedalus Project's questionnaire responses.


The data represents answers to 62 questions by 3,250 distinct MMO players:


variable name   variable type  variable label

Age             numeric        R Age
other           text           The game I play the most of currently is (other)
Hours           numeric        I spend about x hours each week playing the game
percents        interest       How interested are you in the precise numbers and percentages underlying the gam
optimize        importance     How important is it to you that your character is as optimized as possible for t
template        frequency      How often do you use a character builder or a template to plan out your characte
collecting      howmuch        How much do you enjoy collecting distinctive objects or clothing that have no fu
customize       howmuch        How much time do you spend customizing your character during character creation?
exploreworld    howmuch        How much do you enjoy exploring the world just for the sake of exploring it?
findstuff       howmuch        How much do you enjoy finding quests, NPCs or locations that most people do not
group1          grouppref      Would you rather be grouped or soloing?
group2          importance     How important is it to you that your character can solo well?
group3          howmuch        How much do you enjoy working with others in a group?
leadgroup       howmuch        How much do you enjoy leading a group?
lookunique      importance     How important is it to you that your character looks different from other charac
matchcolor      importance     How important is it to you that your character's armor / outfit matches in color
recognition     importance     How important is it to you to be well-known in the game?
takecharge      frequency      How often do you take charge of things when grouped?
agitated        agitated       How agitated do you get if the servers go down unexpectedly?
complain        frequency      How often do your friends or family members complain about your game play behavi
happiness       amount         How much of your happiness in life currently is derived from playing the game?
limittime       difficult      How difficult would it be for you to limit your playing time?
socialsuffer    suffer         Have your personal relationships suffered because of your game-play?
spendtime       toomuch        Do you spend more time than you think you should playing the game?
worksuffer      suffer         Has your work/school performance suffered because of your game play?
level           importance2    Leveling up your character as fast as possible.
acquire         importance2    Acquiring rare items that most players will never have.
powerful        importance2    Becoming powerful.
accumulate      importance2    Accumulating resources, items or money.
mechanics       importance2    Knowing as much about the game mechanics and rules as possible.
selfsufficient  importance2    Having a self-sufficient character.
immersed        importance2    Being immersed in a fantasy world.
escaping        importance2    Escaping from the real world.
help            enjoy          Helping other players.
know            enjoy          Getting to know other players.
chatting        enjoy          Chatting with other players.
competing       enjoy          Competing with other players.
killing         enjoy          Dominating/killing other players.
explore         enjoy          Exploring every map or zone in the world.
cguild          enjoy          Being part of a friendly, casual guild.
sguild          enjoy          Being part of a serious, raid/loot-oriented guild.
roles           enjoy          Trying out new roles and personalities with your characters.
annoy           enjoy          Doing things that annoy other players.
converse        frequency      How often do you find yourself having meaningful conversations with other player
personal        frequency      How often do you talk to your online friends about your personal issues?
support         frequency      How often have your online friends offered you support when you had a real life
stories         frequency      How often do you make up stories and histories for your characters?
roleplay        frequency      How often do you role-play your character?
avoid           frequency      How often do you play so you can avoid thinking about some of your real-life problems
relax           frequency      How often do you play to relax from the day's work?
provoke         frequency      How often do you purposefully try to provoke or irritate other players?
gender          gender         R Gender
game            game           The game I play the most of currently is
buy             buy            Have you ever bought virtual currency or items
flirt           flirt          Have you ever flirted with someone in the game that you didn't know in RL?
hours10         hours10        Have you ever played continuously for 10 hours
hours6          hours6         Have you ever played continuously for 6 hours
macro           macro          Have you ever used a third-party macro in an MMORPG
oppgen          oppgen         Has your main character ever been a character of the opposite gender?
raid4           raid4          Have you ever been in a raid / dungeon group that lasted for at least 4 hours
raid8           raid8          Have you ever been in a raid / dungeon group that lasted for at least 8 hours
tryquit         tryquit        Have you tried to quit the game but was unsuccessful?

Visualization Techniques and Storyboard

First off, I needed to decide how this data could be visualized interestingly. There are two aspects to the data that stood out immediately:

  1. How does one variable relate to another one (or two or three)?
  2. How can you visualize more information about a specific respondent?

Design and Justification

The design looks a bit like HomeFinder/FilmFinder and a bit like SpotFire (why reinvent the wheel when a familiar design works?). To support both of the questions I described above, the visualization software offers two different types of interaction. A scatterplot works well for the data set because we have a large number of individuals who responded to a large number of questions. The scatterplot represents each individual's answers to a specific pair of questions with a dot, allowing us to see both details on the individual and an overall picture of the system. For the individual drill-down I show a large number of bar graphs because each one represents just one variable (hence bar graphs); displaying them together gives the advantages of small multiples (or not-so-small multiples, in this case 62) while preserving the information about the one individual in comparison to all the others on any given question.

The sketches/storyboard do not address one major concern: how does this overall design scale to 62 distinct variables? The sketches show only a handful of variables. I am still trying to figure this out. In the worst case, I will just implement this for the top 10, or 15, or 20 variables of interest to me. I am biting off a massive chunk to chew here - I just have to hope I don't choke on it.


Investigate relationships among variables

First, the visualization provides two axes. This will be implemented for this project. Each axis allows you to select any one of the 62 variables to plot on it. Thus any variable can be compared against any other in a scatterplot.

Initial sketch, axes design


Second, the visualization provides constraints. This is the last step and probably will not be done in this project. By selecting a specific set of constraints, the viewer can visually emphasize the subset of points that match the specified conditions. The others remain visible but are colored to indicate that they fail some of the specified conditions, as in Attribute Explorer.

Initial sketch, filtering
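
As a rough illustration of the constraint idea, the following minimal sketch counts how many range constraints each respondent fails, so that points could be colored by that count in the spirit of Attribute Explorer. The class name, the method name, the array layout, and the sample rows are all assumptions made for illustration; this is not code from the project, which did not implement filtering.

```java
/** Hypothetical sketch: counts how many range constraints each respondent fails. */
final class ConstraintSketch {

    /**
     * data[r][v] is respondent r's numeric answer to variable v.
     * lo[v] and hi[v] give the inclusive range the viewer selected for variable v.
     * Returns, per respondent, the number of constraints that respondent fails.
     */
    static int[] failedCounts(int[][] data, int[] lo, int[] hi) {
        int[] failed = new int[data.length];
        for (int r = 0; r < data.length; r++) {
            for (int v = 0; v < lo.length; v++) {
                if (data[r][v] < lo[v] || data[r][v] > hi[v]) {
                    failed[r]++;
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Made-up rows of {Age, relax}; constraints: Age in [18, 30], relax in [2, 5].
        int[][] data = {{25, 3}, {40, 1}, {17, 5}};
        int[] failed = failedCounts(data, new int[] {18, 2}, new int[] {30, 5});
        for (int r = 0; r < failed.length; r++) {
            // Zero failures: draw emphasized; more failures: draw progressively fainter.
            System.out.println("respondent " + r + " fails " + failed[r] + " constraint(s)");
        }
    }
}
```

A renderer could then map a count of zero to the emphasized color and higher counts to fainter shades, keeping every point visible.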



Investigate a specific respondent

First, the visualization will allow you to click once on a respondent in the scatterplot. Doing so highlights that respondent, and you can then change the variables in the plot and still see where the respondent falls (because the point remains highlighted). This may be done in this project if time allows.

Initial sketch, highlighting


Second, the visualization allows you to double-click to drill down (details-on-demand), whereupon histograms of all of the variables will be shown, and the column containing the respondent will be emphasized. This is analogous to double-clicking on a film in FilmFinder, except that it provides much more detail and does not maintain the same display with an overlay (much as I would like it to). Actually, the optimal design would be to show a reduced-size version of the graphs in the corner, then allow the user to make that full screen if he wished, but I do not yet know if I can do that. This is almost a separate program and probably will not be done in this project.

Initial sketch, drill down
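
To make the drill-down idea concrete, here is a minimal sketch of the data behind one such histogram: a count per response code, plus the column that contains the selected respondent. The class and method names and the sample answers are hypothetical, and the sketch assumes response codes run from 0 to numCodes - 1 with -1 marking a non-response.

```java
/** Hypothetical sketch: one histogram of the drill-down view for a single variable. */
final class DrillDownSketch {

    /** Counts answers per code 0..numCodes-1; codes outside that range (such as -1) are skipped. */
    static int[] histogram(int[] answers, int numCodes) {
        int[] counts = new int[numCodes];
        for (int code : answers) {
            if (code >= 0 && code < numCodes) {
                counts[code]++;
            }
        }
        return counts;
    }

    /** Returns the column to emphasize for a respondent, or -1 if that respondent did not answer. */
    static int emphasizedColumn(int[] answers, int respondent, int numCodes) {
        int code = answers[respondent];
        return (code >= 0 && code < numCodes) ? code : -1;
    }

    public static void main(String[] args) {
        int[] answers = {1, 2, 2, 4, -1, 2}; // made-up responses to one question
        int[] counts = histogram(answers, 5);
        int emphasized = emphasizedColumn(answers, 3, 5);
        for (int code = 0; code < counts.length; code++) {
            String marker = (code == emphasized) ? "  <- selected respondent" : "";
            System.out.println(code + ": " + counts[code] + marker);
        }
    }
}
```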


Implementation

Overview and Variation from Storyboards

This program visualizes 61 variables collected from 3,250 respondents in phase 17 of Nick Yee's Daedalus Project. 62 variables were collected, but one of them, "other", merely clarified which game the respondent played most often, so it has been excluded from the visualization at this stage. Otherwise, the only major change from the storyboard is how the variables are selected. In the storyboard, 62 buttons were placed on both the x and the y axis to control which two variables are being viewed; however, space and design constraints at this stage necessitated using two drop-down selection boxes instead (one for the x and one for the y). This somewhat reduces usability for two reasons. First, both boxes sit along the x axis, which increases the cognitive load of mapping each box to its axis (putting one on the y axis reduced graphing space too much). Second, the drop-downs do not allow tooltips that describe each variable in more detail, so the user must map between the one-word variable name and the actual question it represents.


The visualization allows you to plot any two of the variables against each other in a scatterplot. Since many of the variables are categorical, simply plotting them on their respective axes would not be very informative (one point appears the same as 600 points), so I introduce random jitter in these cases to allow the user to see the density of responses.
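
The jitter idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: the class name, the method signature, the jitter width, and the use of a fixed seed are all hypothetical choices.

```java
import java.util.Random;

/** Hypothetical helper: spreads categorical codes so that overlapping points become visible. */
final class JitterSketch {

    /**
     * Returns a copy of the codes with uniform noise in [-width/2, +width/2) added to each,
     * so that many respondents sharing one answer form a visible cloud instead of one dot.
     */
    static double[] jitter(int[] codes, double width, long seed) {
        Random rng = new Random(seed); // a fixed seed reproduces the same layout on every call
        double[] out = new double[codes.length];
        for (int i = 0; i < codes.length; i++) {
            out[i] = codes[i] + (rng.nextDouble() - 0.5) * width;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] responses = {1, 1, 1, 2, 5, 5}; // made-up categorical answers
        for (double position : jitter(responses, 0.6, 42L)) {
            System.out.println(position);
        }
    }
}
```

Keeping the width below 1 prevents neighboring categories from overlapping, and reusing the seed is one way to keep points from moving between redraws.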

Also, the main reason for providing this as a scatterplot is to investigate the variables as a sum of individual contributors, so I provide facilities to track specific points across variables via brushing (click to highlight in red). As you roll over a point, a tooltip describing its exact values also appears: non-responses to specific questions are represented as -1. The ordinal values show the numeric representation of the response rather than its textual equivalent.
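
One way to think about the brushing and tooltip behavior described above is as a small piece of state keyed by respondent row, independent of which variables are currently plotted. The sketch below is an assumption-laden illustration: the class, its methods, and the tooltip format are hypothetical and do not come from the program's source.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: highlight state that survives changes of the plotted variables. */
final class BrushSketch {
    private final Set<Integer> highlighted = new HashSet<>();

    /** Toggles the highlight for a respondent row; returns true if the row is now highlighted. */
    boolean toggle(int row) {
        if (highlighted.remove(row)) {
            return false; // was highlighted, now cleared
        }
        highlighted.add(row);
        return true;
    }

    boolean isHighlighted(int row) {
        return highlighted.contains(row);
    }

    /** Builds tooltip text from the raw numeric codes; a non-response stays as -1. */
    static String tooltip(String xName, int x, String yName, int y) {
        return xName + " = " + x + ", " + yName + " = " + y;
    }

    public static void main(String[] args) {
        BrushSketch brush = new BrushSketch();
        brush.toggle(7); // user clicks respondent 7
        // After the axes change, the same row index is still highlighted.
        System.out.println(brush.isHighlighted(7));
        System.out.println(tooltip("Age", 24, "relax", -1));
    }
}
```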


Although I am quite happy with the program as it stands now, I must admit that it provides very little functionality that could not be achieved with Tableau, and it is harder to change the data set underlying the program. The ability to highlight a specific point and follow it through different variable comparisons can be quite useful, but the visualization is limited to two dimensions (x and y position) and to scatterplots. On the other hand, now that this is available, it will be quite easy to extend it to provide much of the functionality outlined in the storyboards for future phases of the project (e.g. drill-down on a specific respondent, filter variables and color them like Attribute Explorer).

Executable

In most operating systems, you can run the jar file by double-clicking it (or the equivalent action). Alternatively, open a command line, navigate to the directory with the jar file, and type "javaw -jar Daedalus.jar".

Daedalus Java executable

Source Code

Source code is provided in two different forms:

  1. If you have Prefuse installed in Eclipse already, download this file.
  2. If you do not have Prefuse installed in Eclipse already, download this file which includes Prefuse instead. (Note: link fixed now).


To run the program,

  • open Eclipse
  • select File -> Import Project
  • select General -> Existing Projects into Workspace (then click "Next>")
  • click "Select archive file" then browse to find the zipped file
  • once the project has been added, find Daedalus.java in src in the visualization project, select it and press "Alt+Shift+x,j" to run the program

Development Process

Not counting the initial storyboarding and the time spent creating this page (that is, just counting the time spent installing Prefuse and coding/testing the program) I spent 32 hours developing this program. The bulk of the time went to working around the limitations of Prefuse. A large portion of that time was spent on getting the axes labeled and sized correctly. I had both ordinal and continuous variables. For Prefuse to plot the ordinal data correctly, I needed to give it the numeric representations of the data, but for it to be useful for the user, it needed to be the labeled values. In order to achieve this, I had to read through and understand the entirety of AxisLayout and AxisLabelLayout then create subclasses of them that would work as I needed (overriding parts as necessary). Also, RendererFactory could not provide the necessary functionality so I had to subclass and modify it as well. Thus, much of the time was spent extending the functionality of Prefuse to allow me to do what I needed.
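
The core of the ordinal-axis problem, plotting by numeric code while labeling with text, can be shown without any Prefuse classes. The following is a minimal sketch under stated assumptions: the class name, the firstCode parameter, and the five frequency labels are hypothetical, since the actual label sets and the subclassed layouts are not shown on this page.

```java
/** Hypothetical sketch: maps the numeric code of an ordinal answer to its axis label. */
final class OrdinalLabels {
    private final int firstCode;
    private final String[] labels;

    /** firstCode is the numeric code of the first label; later labels follow consecutively. */
    OrdinalLabels(int firstCode, String... labels) {
        this.firstCode = firstCode;
        this.labels = labels.clone();
    }

    /** Returns the text label for a code, a placeholder for -1, or the raw number otherwise. */
    String labelFor(int code) {
        int index = code - firstCode;
        if (index >= 0 && index < labels.length) {
            return labels[index];
        }
        return (code == -1) ? "(no response)" : String.valueOf(code);
    }

    public static void main(String[] args) {
        // Assumed label set for a "frequency" question; the real wording may differ.
        OrdinalLabels frequency =
                new OrdinalLabels(1, "Never", "Rarely", "Sometimes", "Often", "Always");
        for (int code = 1; code <= 5; code++) {
            System.out.println(code + " -> " + frequency.labelFor(code));
        }
    }
}
```

The layout keeps positioning points by the numeric code, and only the tick labels pass through a lookup like this one.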


Additionally, Java's standard labels do not support vertical text, but, horizontally, the full text of the questions is far too long to be practical when labeling the y-axis, so I had to find a way to generate vertical text. Happily, someone had written a class called VTextIcon that I read through and incorporated into the program, allowing me to generate vertical text.


The actual functional part of the program - graphing the points, allowing users to change the variables graphed, and highlighting specific points - took very little time at all. Much of it was already present as needed within Prefuse.

Seth Horrigan - Oct 13, 2008 03:01:06 pm

Resizing the window queues update operations, which, when combined with random jitter, makes for an interesting animation. The queued updates could be fixed, but it was too entertaining to leave it as it is.


