Quantitative Evaluation

From CS160 User Interfaces Sp09

Lecture on Mar 11, 2009

Slides

Readings

Saung Li - Mar 09, 2009 02:07:18 pm

What are some of the ways we could use the experimental methods described in the readings in designing graphical user interfaces? Perhaps we could use experimentation to figure out whether a component of the interface is efficient or favored by users. The independent variable could be the different things the component does, and the dependent variable could be the users' reaction. Other variables should be controlled (which increases the precision of the results but reduces generalizability), randomized (which increases generalizability but reduces precision), or randomized with constraints, so that the results are not mixed up with confounding variables, which change systematically along with the independent variable and make it impossible to tell which of the two affected the dependent variable. Biased selection of subjects (picking them non-randomly), maturation of subjects (aging may affect how they act), history (events that may occur during the experiment), mortality (many subjects of one group may drop out, skewing the results), testing (the test itself may change their behavior), interactions with selection (the varying effects of nonequivalent groups), and statistical regression (the regression effect) may threaten internal validity by giving rise to confounding variables. To actually start the experiments, one needs operational definitions of the independent and dependent variables so that the procedures and definitions are consistent across experiments. The dependent variable needs to be reliable and valid, so that the experiment actually measures what we are looking for. Besides a single dependent variable (which may not describe everything we are looking for), multiple dependent variables can be measured and combined into a composite dependent variable that captures the overall effect of the interface. Physiological and behavioral measures are ways to indirectly measure things like emotion, learning, and intelligence, but may not be very precise.
We could use these techniques to design experiments that test user interfaces, or to determine what users want so that we can tailor the interface to them, by gathering data, drawing sound conclusions, and changing the interface accordingly. I think these methods provide a great way to run experiments that yield results not contaminated by confounding variables. These techniques are often mentioned in statistics, where gathering data and drawing accurate conclusions from it are vital.
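
As a rough illustration of "randomized with constraints", here is a minimal Python sketch that shuffles participants and then deals them out so every condition gets the same number of people (give or take one). The function name `assign_balanced`, the participant IDs, and the condition names `layout_A` and `layout_B` are all made up for this example; the readings do not prescribe any particular procedure.

```python
import random

def assign_balanced(participants, conditions, seed=None):
    """Randomly assign participants to conditions, keeping group sizes equal (+/- 1)."""
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)              # random order removes experimenter choice
    assignment = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        # Dealing round-robin is the constraint: no group can end up much larger.
        assignment[conditions[i % len(conditions)]].append(person)
    return assignment

# Hypothetical participants and interface variants.
participants = [f"P{i}" for i in range(1, 9)]
groups = assign_balanced(participants, ["layout_A", "layout_B"], seed=160)
for condition, members in groups.items():
    print(condition, members)
```

Pure randomization (a coin flip per person) could put six people in one group and two in the other by chance; the round-robin step trades a little randomness for balanced groups.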

Kevin Huey - Mar 10, 2009 07:09:06 pm

Oho it is psychology class! (as I pointed out a few weeks ago)

I feel like the most difficult part of conducting tests is assessing their validity. How exactly do we know if we've correctly kept our independent variables uninfluenced by bias? How are we to carefully compare results from different testers when each person is different and comes from a different background? How do we rationalize the data and form conclusions without making farfetched assumptions? For that matter, how do we keep our own bias towards our prototypes out of the test results? It kind of feels like psychology of the brain, measuring the degree of unbiased judgment we can form. If we as observers can push aside our personal attachments to our prototypes, we can better decide which tasks the testers will perform (and not just the easy ones that we know will "validify" our design).

Chang Su - Mar 10, 2009 09:14:36 pm

Reading about psychology is always interesting. (Incidentally I am reading a book titled Stumbling On Happiness by Daniel Gilbert of Harvard University, less of an instruction and more of a discourse on psychological topics.) But coming back to relevance to our course, I think the reading provides a formal framework for evaluating the usability of our design prototype (and the final product). Recall the table of average times for computer interface actions from the previous reading, and one naturally arrives at a fair observable dependent variable: the time a test user spends on a particular task. The hypothesized independent variable is of course the ease (or difficulty) of the user interface. A good control variable could be the aforementioned table of average times. As long as random variables such as environmental conditions are managed well, I suppose the findings would be a reasonable measurement of design excellence.

Alan Young - Mar 10, 2009 10:32:08 pm

Martin's Chapter 2 article on "how to do experiments" focused on a lot of terms that are familiar to people who have taken science classes and have learned how to perform experiments. His mention of external validity did get me thinking about how psychologists strike a balance between results that are too tightly controlled to generalize and results that are too uncontrolled to be meaningful. (I found out that he goes over exactly this in Chapter 7.) I did find statistical regression to be an interesting topic and something I had not thought about before. The idea is that people who scored low on a test are likely, when retested, to score higher on average and closer to the mean. This is because a low first score partly reflects error variables, such as feeling unwell, that happened to push it below the person's true score. The retest's error component is not biased away from the true score in the same way. Operational definitions determine what the independent variable is and how it is differentiated from other variables. Physical scientists readily agree upon operational definitions such as mass and units, while behavioral scientists have a much harder time because the operations can be very subjective. Chapter 7 is about how to approach the problem of creating variables for an experiment. Reliability and validity are two distinct things, and both are important. I was interested in the discussion of physiological measures and how fMRI works. By measuring how much blood is entering a specific part of the brain, it can be inferred that a lot of neurons are firing and that part of the brain is processing the action in question. Thus, we can use fMRI to obtain a mapping from a part of the brain to a cognitive process, and this information can be used to improve the adjustment of variables.
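
To convince myself of the regression effect, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: each simulated person has a fixed "true score", and each test adds independent random error (the numbers 70 and 10 are arbitrary). No real data is involved.

```python
import random
import statistics

def simulate_retest(n=2000, noise_sd=10.0, seed=0):
    """Return (mean first score of the low group, their mean retest score, overall mean)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(70, 10) for _ in range(n)]
    # Each observed score = true score + independent error on that occasion.
    test1 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    # Select roughly the bottom 10% of scorers on the first test.
    cutoff = sorted(test1)[n // 10]
    low = [i for i in range(n) if test1[i] <= cutoff]
    low_first = statistics.mean([test1[i] for i in low])
    low_retest = statistics.mean([test2[i] for i in low])
    return low_first, low_retest, statistics.mean(test1)

low_first, low_retest, overall = simulate_retest()
print(f"low group, test 1: {low_first:.1f}")
print(f"low group, retest: {low_retest:.1f}")
print(f"everyone,  test 1: {overall:.1f}")
```

The low group's retest average lands between its first average and the overall mean even though nothing was done to those people between tests, which is why picking subjects for an extreme score can masquerade as a treatment effect.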

Chao Michael Zhang - Mar 10, 2009 09:59:07 pm

The example of the star tracing in the readings was very interesting in that it opened my eyes to how experiments with vague purposes could have an infinite number of subjective ways to measure. The purpose of the star tracing is to measure how much a person improves from the first time they trace the star in a mirror to the 10th time. However, the person who devised the experiment probably thought that how accurately the star was traced was the only measurement to base improvement on. This omits the possibility that other variables came into play, such as the user becoming frustrated with not being able to follow the star accurately and thus slowing down their tracing significantly. If this were the case, the measure of improvement has to include the variable of time.

When conducting experiments for our project's user interface, we will need to keep in mind what this reading warns us of, and make sure we account for all the variables that could possibly affect the outcome of our experiment. This means we need to both obtain the measurements of correct variables, and interpret the measurements correctly.
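
One way to fold accuracy and time into a single number, sketched in Python: standardize each measure (so errors and seconds are on a comparable scale) and add them. The equal weighting and the trial data below are assumptions made up for this example, not something taken from the reading.

```python
import statistics

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def composite_scores(errors, seconds):
    """Sum of standardized errors and standardized time; lower means better."""
    return [ze + zt for ze, zt in zip(z_scores(errors), z_scores(seconds))]

# Hypothetical star-tracing results for five successive trials.
errors = [30, 22, 18, 12, 10]    # times the pencil left the star outline
seconds = [40, 55, 50, 45, 38]   # time taken to finish the tracing

scores = composite_scores(errors, seconds)
for trial, score in enumerate(scores, start=1):
    print(f"trial {trial}: composite = {score:+.2f}")
```

In this made-up data the second trial has fewer errors than the first but takes much longer, so looking at errors alone would overstate the improvement; the composite score reflects both.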

Siddharth Shah - Mar 11, 2009 12:15:04 am

I LIKED CHAPTER TWO!! It was easy reading, but it still made several important points. You really have to pay attention to all the threats to internal validity that he mentioned; self-selection is a big one for my group project because the people who are likely to use our study game are the people who really WANT to do better, and they will have different results from the sloths. I thought statistical regression was pretty cool (but still a bad thing overall), and it was not something I'd thought about.

Chapter 7 was pretty decent as well. I liked the three methods of determining test reliability; I think the split-half method is probably more widely used in the academic setting than are the other two methods. I thought the experiment about tracing a star using a mirror was a really good way of demonstrating the flaws of single-dependent-variable designs.
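
For anyone curious what split-half reliability looks like in practice, here is a minimal Python sketch. It splits a test into odd- and even-numbered items, correlates the two half scores across participants, and then applies the Spearman-Brown formula, 2r / (1 + r), to estimate the reliability of the full-length test. The odd/even split, the Spearman-Brown step, and the response data are my own additions for illustration; I am not claiming this is exactly how the chapter presents it.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """item_scores: one list of per-item scores for each participant."""
    first_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r = pearson(first_half, second_half)
    corrected = 2 * r / (1 + r)   # Spearman-Brown estimate for the full test
    return r, corrected

# Hypothetical data: 5 participants, 6 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]
half_r, full_r = split_half_reliability(responses)
print(f"half-test correlation: {half_r:.2f}, corrected: {full_r:.2f}")
```

The correction is there because each half is only half as long as the real test, and shorter tests are less reliable.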

Mark Dhillon - Mar 11, 2009 12:12:55 am

Good call Kevin, the experiment variable breakdown is straight out of a psychology class. I found the section on "mortality" to be interesting, particularly the idea itself. I never considered the effect of people dropping out of different levels of a particular independent variable to be a real issue. Having recently taken a psychology course where we had to design an experiment and define independent and dependent variables, along with the range and any potential confounds, nothing else in this reading really jumped out at me as useful.

Stephanie Shih - Mar 11, 2009 01:03:37 am

The reading made a good point about pointing out confounding variables, and how once control of everything else had fallen into place, then the variable would as well. This time's reading was kind of like psychology. Mixed with statistics. It was also relatively straight to the point, without using a lot of extra language to dress it up, thus making it easier to understand.

Cuong Ngo - Mar 11, 2009 01:08:13 am

The chapters are relevant to our previous readings in the sense that psychology plays an important role in designing user interfaces. Dr. Martin discusses what constitutes a psychology experiment and how to conduct it. I find it pretty interesting how variables (independent, dependent, control, random, confounding) can affect the outcome of an experiment. For instance, suppose we want to know how efficiently the user can perform a task with our interface under certain environments, then we can introduce some control variables such as lighting conditions, temperature, noise level and so on. As the author put it, our ultimate goal is to "ensure that stimulus characteristics are consistent and that responses are carefully measured."

Denise Ngai - Mar 11, 2009 01:07:26 am

Reading Chapter 2 made me feel like I was prepping for a science fair back in middle school (LOL) mixed in with a little bit of CS70 (random vars, etc.). I suppose it is useful in our case, however. We will be conducting "experiments" on users/participants with our prototypes; therefore, it is useful to consider aspects such as independent variables, random variables, mortality, and most importantly, internal validity. I would personally say that in our particular experiment, mortality and selection would be the top threats to internal validity, especially because it would be easy to pick participants whom we know are computer-savvy or familiar with our designs, and it would also be easy for our participants to drop-out and get lazy to participate in our testing if they somehow find something better to do with their time other than to help us out. :(

It's interesting to read about these familiar scientific terms used with regards to psychology and human interaction rather than with regards to the number of plant specimens one uses in an experiment or the amount of sunlight available to a specimen, etc. It is helpful to consider these factors for use with our prototype experimental analysis in order to better understand the results of our experiment in scientific/psychological terms.

Szu-Chun Mao - Mar 11, 2009 01:35:39 am

This reading is very informative, especially since I do not have any background in psychology. It is well organized and clearly demonstrates how to perform experiments using scientific methods. In the section on choosing an independent variable, I like the idea of a pilot experiment, where we conduct an informal experiment to iron out the small bugs. We get to change our independent variable as needed during the experiment. The section on choosing a dependent variable mentioned using operational definitions, so that our attempts to measure things like aggressiveness would not be too subjective. I enjoyed the reading; however, I don’t think it is as relevant to our current project as it seems.

Gregory Leshner - Mar 11, 2009 02:01:51 am

As we are spending a lot of time this week creating the user interface and working with users, I have found myself thinking a lot about the basic premise of our game and whether it fulfills its purpose. We are building a game that is supposed to be a study aid, aka a learning game. I think it would be very interesting to design a study around the game and see if it has the intended effect. While that is an obvious application at the macro level, I am sure these techniques could also be applied at the micro level, to how well a user navigates a page based on the placement/size/color of buttons, drop-down boxes, etc. The rub is that reliable testing takes a lot of effort and discipline in order to get publishable results. Anecdotal evidence is so much easier ... and sometimes just as convincing.  :)

Rohan Dhaimade - Mar 11, 2009 02:13:59 am

The first reading, chapter 2, was pretty boring for me as I've taken statistics before and most of the topics covered in the reading are basic things to consider when doing any statistical experiment or even analyzing one. The second reading, though, was more interesting as it presented criteria for choosing independent and dependent variables. The sections on validity and reliability were much more comprehensive than anything I've covered. I think one of the reliability measures, test-retest, is completely invalid. Experience is always built up during the first test, and this original test should affect the outcome of the retest no matter what the situation, unless you modify the test somehow, including the input and possibly even the criterion.

William Cho - Mar 11, 2009 12:34:04 am

Although I've already taken a psychology course here that covered these topics, I still learned quite a bit from these chapters. It reminded me of all the different factors and variables that can affect an experiment. The example scenarios were a nice touch. It seems that selecting the participants is the most applicable part of this reading to our class. Even though we are only selecting a few people to interview for our group project, I guess we should still try to choose a representative trio in order to better find faulty design issues that generalize to more people. Yeah. I'll try to keep some of these ideas in mind during our group's user study.

Sean Kim - Mar 11, 2009 02:29:52 am

When doing quantitative evaluation, why does it matter how to do psychological experiments? This was my first question, but after reading the articles, I can see how difficult good and precise quantitative evaluation is. There are lots of different variables that can affect the result of an experiment: independent variables, dependent variables, control variables, random variables, and confounding variables. The confounding variable in particular is closely related to internal validity. It can arise from history, maturation, selection, mortality, testing, statistical regression, and interactions with selection. Therefore, when getting a quantitative evaluation from experiments, it is important to choose the variables so that the result is less affected by unrelated causes.

Carolchen - Mar 11, 2009 02:28:32 am

I thought Chapter 2 did a good job of matter-of-factly describing the experimental method and what goes into selecting different types of variables in an experiment. It was a good introduction to conducting scientific experiments. For example, knowing how the fewer control variables you use, the more generalizable the study is, can come in handy. My eyes glazed over while reading the end of Chapter 7. While I found it very applicable to learn how to select variables, select productive ranges, and make operational definitions, the reading lost me at the sections on single, multiple, and composite dependent variables.

Moonway Lin - Mar 11, 2009 02:43:22 am

I've taken psychology courses, so this material is all review for me. It's interesting, though, how the scientific method is always presented as a standard overview among Psych 101 textbooks' introductory chapters, and yet many textbooks written for the hard sciences never bother to teach or even refer to it. After we've implemented our interactive prototypes, I think it will be beneficial for us to test them on actual users according to the format discussed here. In return, we'll be able to know whether our programs are truly effective, and if not, how we can possibly improve them. The problem is that the effectiveness of a program cannot be easily measured or quantified.

Victor Lum - Mar 11, 2009 02:59:09 am

Man, this class is feeling more and more like a psych class. I guess what I found interesting is that psych experiments shouldn't be too tightly controlled, which is different from what we were taught in our science classes. Also, there seem to be a lot of threats to validity, some that aren't so obvious. I guess this could be applied to the assignment in that we could experiment with our users, and see what parts of our user interface work well. But then again, that kind of just sounds like user testing.

Shoeb Omar - Mar 11, 2009 02:53:28 am

I've taken psych classes before but I've never had this in-depth an intro to these concepts. At first I thought I was going to be bored out of my mind but the reading was surprisingly enjoyable. I never knew about things like having multiple or composite dependent variables and the concepts were interesting. I also found it delightful to read from a textbook for once--nice easy arguments, organized, and it had pictures. Super easy to follow and thus nice to read.

Timofey Titov - Mar 11, 2009 03:12:10 am

While people think of this as being part of psychology, it also falls under statistics! The headline gives it away. We're interested in gathering meaningful quantitative data. However, since we're dealing with subjects, psychology helps us minimize the effect of confounding variables. I guess this is foreshadowing the Pilot Study that we will be doing soon. We have seen a lot of this material in previous readings. For example, not telling the user how to complete a task beforehand helps us collect unbiased data.

Derek Liu - Mar 11, 2009 03:04:58 am

These readings show that user interfaces and psychology are closely related. We have to know how the brain works and how to conduct psychological experiments in the design and prototyping phases of our projects in order to create easily usable interfaces. The first reading provided a good outline of the experimental method; however, most of it was review for me, having taken a psychology class in the past. The second reading provided more of a "how-to" on creating variables and presented the many different types of variables there are.

Phiroath Chan - Mar 11, 2009 03:20:07 am

Martin did a good job of laying out the experimental method. I vaguely remember the experimental method from high school science classes or even middle school, but I haven't really done any work using it recently, so having this reading was a useful refresher. I do remember stuff about independent variables, dependent variables, and control variables from previous schooling, but Martin introduced new ideas to me this time around. First off, I like how he connected independent variables with circumstances and dependent variables with behaviors. It is easier to understand if one refers to the circumstances that cause the behavior than just calling them independent or dependent variables. I also liked his explanation of internal validity. I'm not sure if this is a new term based on UI, but the concept is new to me. Martin went into great depth about the threats to internal validity. I feel like when we introduce freedom or choice into the mix of anything, things get somewhat more complicated, and that was what I felt from the reading. There were terms in the reading that were supposed to help us pick variables, but I find myself going back to the sections to re-read what they were. Overall I think reading it again will be beneficial.

Chris Thompson - Mar 11, 2009 03:35:43 am

Ah, the first article was a throwback to CogSci 100. It details how to conduct psychology experiments. An experiment consists of independent variables (the parts of the experiment that change from trial to trial but are determined by the experimenter before the user sees them), dependent variables (the users' responses to the situation created by the independent variables), and control variables (things that are kept consistent throughout all experiments conducted). This article also includes random variables, which we didn't discuss in CogSci 100 (but have frequently in CS courses). They are the variables that are decided by fate, be it a coin toss or a die roll or a computer simulation. The second reading adds a little more focus to these ideas, helping the experimenter(s) to choose effective yet reasonable parameters, and to ensure that they validly describe what we're trying to discern. Most of it is fairly common-sense advice, but it's good to brush up on from time to time, since experimenters can often make grand assumptions based on their (expert) knowledge.

Raymond Young - Mar 11, 2009 03:37:43 am

This reading was cool because it went over things that I never think about. The scientific method is hammered into all of us by our high school teachers. What isn't taught, and what I read here, are the common pitfalls of obtaining results from an experiment. There are more than one might imagine at first. It's a very delicate matter to make sure that nothing in your experiment setup is creating a bias. I loved the Pepsi vs. Coke example. It is definitely helpful material for testing our interfaces without any biases. The reading also went over how to execute a repeatable experiment, meaning that the details of experiment execution are chosen and defined carefully so that the results can be verified and/or contradicted based on the conditions the experiment was run in. An example of where things can go wrong if things aren't defined carefully and documented as such: suppose two parents are discussing whether or not their children are affected by violent TV, and one parent says their child is and the other says their child isn't. What they don't know is that the parent of the child who "isn't affected" thinks that football is violent TV, and the other parent thinks that anything short of Dawn of the Dead is nonviolent. Simple example, but it can be applied to our UI testing and documenting in many ways, including: having our users express what they're feeling/thinking about the interface in real time (while we take notes/record) and having that user explain exactly what they mean if they use terminology that can be interpreted in different ways, and using consistent and concretely defined terminology (e.g. reaction time: significantly faster = at least 0.3 seconds less lag, rather than leaving "faster" undefined).
Much of this ties into our previous readings about obtaining information from the test-user's mind and interpreting their behaviour, and verifying your understanding of what they are saying by explicitly asking and explaining to them how you are perceiving what they are explaining about your GUI. All of this makes testing sound more fun because I have confidence about good ways to go about testing.
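
To make the "significantly faster" definition concrete, here is a tiny Python sketch of it written down as a rule that anyone on the team would apply the same way. The 0.3 second threshold is the one from my example above; the function name and the timing data are made up, and "significantly" here means "meets our agreed threshold", not statistical significance.

```python
import statistics

FASTER_THRESHOLD_S = 0.3  # agreed definition: at least 0.3 s less lag on average

def is_significantly_faster(baseline_times, new_times, threshold=FASTER_THRESHOLD_S):
    """True if the new design's mean reaction time beats the baseline by >= threshold."""
    improvement = statistics.mean(baseline_times) - statistics.mean(new_times)
    return improvement >= threshold

# Hypothetical reaction times in seconds from two versions of a prototype.
old_design = [1.6, 1.4, 1.5, 1.5]
new_design = [1.0, 1.1, 0.9, 1.0]
print(is_significantly_faster(old_design, new_design))  # improvement of 0.5 s
```

Writing the definition down like this means two note-takers looking at the same timings cannot disagree about whether a design was "faster".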

David Burban - Mar 11, 2009 03:41:31 am

I'm hoping that we do not have to apply these readings to our low-fidelity prototype assignment. I felt that Ch7 was too specific in the selection of the variables, since I felt that Ch2 seemed to cover them quite well for our needs. I don't really see how this is going to help us with our group projects, since we're doing user studies, not user experiments.

Timothy Yung - Mar 11, 2009 03:55:07 am

The chapter 2 reading seemed like a review of how to conduct experiments and how to sample from high school statistics. However, the concepts of internal and external validity were new to me. I found external validity to be interesting because the article treated the differences in the participants (i.e. their stress level, genes, etc.) as random variables that allowed the data to represent a population (as opposed to damaging the dependent variables). Also, the explanation of internal validity and the list and explanation of possible confounding variables were quite interesting.

Chunwei Lai - Mar 11, 2009 04:12:59 am

It's not often to see psychology mixed in with computer science and unfortunately we don't see it here. This is mostly about learning about experiments the appropriate way which is not as helpful. I don't really believe that the vast majority of programmers conduct experiments with end-users directly but rather indirectly through data analysis (such as tracking usage). The material would have been more interesting if it wasn't presented in the same format as it always has been presented. Regardless, this type of information is still good to know -- I just don't think this kind of depth is really necessary.

Kevin Nakahara - Mar 11, 2009 04:25:57 am

Like most of the other posters here, I found that the readings were more about the process of experimentation than psychology itself. However, this is not a bad thing. The first reading was a nice refresher on how to conduct experiments taking randomization, confounding, etc. into account. I think that for most of us CS-oriented people, who may not have taken science courses for a reaaally long time, it's important to understand how to extensively survey things, and how to find ways to apply experimentation techniques to other fields. The first reading brought up a particular point that I had never thought of before: how experimenters do not want to control all the variables in an experiment, so that it can be more relevant to real-life scenarios. Oftentimes in certain fields *cough* economics *cough* too many things are held ceteris paribus, without regard to how many variables can affect a system. Taking the randomness of the environment into account certainly makes sense to me.

Alexei Baboulevitch - Mar 11, 2009 03:31:17 am

According to Chapter 2, if the experiment has too many control variables, the results will be ungeneralizable. On the other hand, if there are too few control variables, the experiment will probably end up with a lot of confounding variables, which will make the results equally useless. This means that there has to be a fine balance between control variables and random variables in order for the experiment to be meaningful.

In many ways, iterative interface design is very similar to the described methods for psychology experiments. The dependent variable is the ability of the user to perform certain tasks. We could use the quantitative measures discussed earlier or a more descriptive heuristic. The independent variable is mainly the UI design itself, with the range being a small adjustment to the interface for every iteration. For this reason, it's very important to acknowledge confounding variables in UI design: just because a user does a task well doesn't mean that good design is necessarily responsible. It's especially important to notice if something in the UI - a status indicator, for instance - is making the task easier for just that instance and not others.

Bernardo de Seabra - Mar 11, 2009 04:51:37 am

In chapter 2 of the reading the author goes to great lengths to explain how to properly conduct experiments that hold validity. From the reading I can see how carefully each parameter of an experiment has to be studied, planned and executed. The author mentions the different variables present in an experiment, such as independent, dependent, control, random (with/without constraints) and confounding. Each of these is explained and exemplified in great detail. The author then covers the internal validity of an experiment, mentioning important factors to keep in mind. Among these we can find history, maturation, selection, mortality, testing, statistical regression and interactions with selection. These show how results could very well be misinterpreted if these factors are not analyzed carefully. Having talked about all the different variables in an experiment and how important and delicate choosing the right values using the right method is, the author covers in chapter 7 how to decide which variables to manipulate and measure. He offers detailed explanations for choosing and defining both independent and dependent variables. The author further explores the sub-categories that dependent variables can fall under.

Jeffrey Patzer - Mar 11, 2009 04:51:53 am

So what I really took away from these readings, especially the first (chapter two), was that no matter what you do, there are always going to be things that you cannot control in your experiments. Since there are always going to be extenuating variables then you must try and control for them as much as possible. Try as you might, this science that you are attempting to implement is more like a good guess. Despite the news that one cannot expect to be able to ever fully realize a perfect experiment, the article offered many good tips on how to do your best to account for unforeseen variables. I think the power of this article comes from its ability to help guide the designer through a non-scientific task in a scientific way that is both humane and effective.

Nalditya Kusuma - Mar 11, 2009 04:48:10 am

The topic of the readings is quite interesting: how to do an experiment and how to be good at it. Chapter 2 talks about things to do and not to do before and during an experiment. It also talks about how to take random samples and how to validate your variables. Chapter 7 talks about how to choose the appropriate variables to manipulate for the experiment. Even though the articles were difficult for me to read, they are interesting and very informative.

Ian Hildreth - Mar 11, 2009 05:09:50 am

This reading started my brain thinking about psychological experiments and our own project, and what we can test. It's not as easy as simply stating an objective and running an experiment; the article asserts that there is a strict formal procedure with many different variables that need to be defined. I think the challenge comes in making the argument valid. One story that stuck out in the article was an experiment's conclusion that left-handed people die at a younger age, which did not take into account the social stigma attached to being left-handed for the 80-year-olds, something the 10-year-olds did not have to face. It is very hard to run an experiment without confounding variables. Overall, I liked the article. It did a very good job explaining things clearly, though I felt like it was a dumbed-down version of many scientific method breakdowns I have seen before.

71.202.240.141 - Mar 11, 2009 05:46:05 am

I was expecting articles that were related to heuristic evaluation, while these were more about setting up an experiment. The parallels to statistics, which is how we analyze the data, were interesting: we could see the same line of thinking going into creating experiments and handling the problems. The details on regression to the mean made a lot of sense and showed how taking samples of just blocks of a group can very easily be misleading. That taught me to look at even an individual result not as a set number but as a range with an error, which makes it difficult to draw any really solid conclusions. I'm not too sure how many of these apply to user interface design. It seems you would likely improve designs through iteration, not through having multiple versions that you allow the user to test.

Ling Chen - Mar 11, 2009 05:54:00 am

I always thought variables were just variables. I didn't know variables could be divided into so many specific categories until I read this week's articles. There are so many psychological theories and ideas behind the experiments. Knowing these things would probably help us when we conduct our own project testing. For example, we would know that we need to eliminate or minimize confounding variables so we don't distort the relationship between the independent and dependent variables. I also found the section on the threats to internal validity interesting. I hadn't really heard of these ideas before. Knowing them will definitely help us identify things we need to watch out for while conducting experiments with our users. Then Chapter 7 went into more detail about how to choose and manipulate variables. There are just too many variables we need to account for that could affect the result of our testing/experiment. I feel like after reading these two articles, we now have a better sense of what we need to do while testing with our users.

Prahalika Reddy - Mar 11, 2009 04:41:18 am

I thought that both of the readings this time were useful. Chapter 2's discussion of how to do an experiment was helpful as a starting point for doing tests with users. I especially thought the part about internal validity was interesting. Unfortunately, it seems to me that none of those threats are absolutely controllable by the experimenters, which makes me wonder how any experiment is completely valid. Chapter 7 went into more depth about what to actually experiment with, and which variables to make the independent ones. Again, the validity tests were very interesting to read about; they seem to be one of the most important parts of an experiment, as the experiment is useless unless you can prove it's valid. The section on dual task methodology was also interesting. It seems like a very quick and easy way to see how focused a participant is on a certain task.

Shendy Kurnia - Mar 11, 2009 05:56:34 am

I think you have to be a really, really serious UI designer to understand and implement a design methodology involving a deep study of users' psychology. The articles give in-depth information about human psychology, which I think is too much for UI designers. Anyhow, I enjoyed Chapter 2, where it explains threats to internal validity. The example about a test of employees' durability under stress at a company is interesting; it is in the section explaining one of the threats to internal validity, mortality.

Adit Dalvi - Mar 11, 2009 07:06:21 am

This reading was not very interesting to me since I don’t enjoy reading about psychology much. Also, the reading trailed on like a stats class in which you learn about the effects of different variables on your experiments. I liked the sections on the threats to validity and I think they are useful to what we’re doing for our projects when testing our interface with actual users. I understand now why you guys make us look for users who are not our peers or people we may know or people who already have an idea about what we’re doing.

Anatol Tsang - Mar 11, 2009 08:42:17 am

The first reading seems like a basic introduction to statistics terms and concepts for conducting experiments. Although the reading was rather dry, I was amused by the little caricatures they had throughout the whole chapter. I guess I did learn something: I was reminded of the different biases that arise when conducting experiments. For example, the concept of differential mortality explains that biases may occur in the data if a certain set of subjects drops out of the testing.

The second reading was an application of what we learned in the first reading. I found it somewhat more useful to our user interface class, but not by much. I think some of the methods were common-sensical. It is sometimes difficult to find something useful when some of the techniques in a field were developed with subjective definitions of success.

Meiying Li - Mar 11, 2009 09:39:04 am

I think the readings are a bit far from an HCI topic. But I found the theories introduced pretty helpful.

The first reading talks about different kinds of variables to be aware of during a psychology experiment. We have to be careful with confounding variables because they distort the experiment's results. For our prototype interviews, I think the most important threat to internal validity that we should be aware of is statistical regression. We can use a wider range of participants to limit the effect of statistical regression. But since we are doing only three interviews, we need to carefully select interviewees with backgrounds as diverse as possible.

The second reading talks about the setup of the experiments and the measurement of the experimental results. Physiological measures are more difficult than behavioral measures because they are more implicit. This is what we should take care of in our prototype interviews, too.

Sum Sum Wong - Mar 11, 2009 09:46:54 am

The first part of Chapter 2 introduced different kinds of variables, which reminded me of life back in middle school, where I learned about variables and functions :). The later part is pretty much like a statistics text, talking about regression and reliability. So I didn't really learn a lot from Chapter 2; instead, it helped me recall things that I hadn't used or thought of for a long time. Chapter 7 focused on how to choose the variables in order to conduct a good experiment. But I think it's a little hard to apply what is learned in Chapter 7 to our own projects, since the chapter is too detailed in my opinion. And at last...the cartoons are fun.

Andrew Chen - Mar 11, 2009 10:10:44 am

After reading the articles, my question about the variables discussed is: how does an experimenter go about identifying variables in the first place, regardless of how they are eventually classified? I feel that the first article is very comprehensive in its definition of different types of variables; however, the task of identifying variables seems to be in itself a confounding variable, or at least a random variable. Each experimenter can only identify so many, and when the number of experimenters increases, the number of identified variables must also increase (brainstorm style). In the lecture pace experiment example, there are only a dozen or so variables, and some of them seem rather arbitrary: "whether football won or loss." If all variables of this nature must be identified, there would definitely be more than the ones the author has listed (what about, say, college basketball?); does the selection of variables therefore depend simply on the preferences or mental capacities of the experimenters?

Anjana Dasu - Mar 11, 2009 10:04:10 am

I agree with everyone that these chapters were like reading a psych or cog sci textbook with all the key terms and whatnot, but that's not a bad thing. These chapters made me think about my group's methods for task analysis and lo-fi prototype testing. They got me wondering if the users we selected for our task analysis were "random" enough to be statistically useful. I think the discussion of randomization is really important for getting generalizable results, which is what we are really looking for when testing our game, even if we are looking at a target user group. Overall, this was a pretty easy read and could be useful if our group implemented some of the techniques more consciously. I liked the cartoons in Chapter 7 a lot :)

Aaron Hong - Mar 11, 2009 10:41:11 am

First off, I would like to say this is probably the most different and unique CS class I've taken. Frankly, it feels like a mix of a business, cognitive science, and psychology class. The reading did have some interesting insights about experimentation that I'd never thought about, although they are probably so standard that anyone who's taken an Introduction to Psychology class would already know them. The interesting insight is that I had never seen control variables as being bad. I think from a hard science point of view (biology, physics, chemistry), the more control variables you have, the better. However, control variables in psychology limit external validity and limit the generalization of experiments. In the case of people, the power to generalize is important (otherwise the results would only apply to one or a few people).

Alexander Cho - Mar 11, 2009 10:42:20 am

I learned most of these concepts during my statistics class in high school. These different variables (independent, dependent, control, random, etc.) are all something to consider when conducting experiments. Too often, experiments are run with confounding variables, and false statistics get thrown around everywhere. The points about designing the experiment, choosing the right dependent variables, and making it replicable are all agreeable. I wonder how strictly we must, in practice, control the environment when experimenting with our test users for our group project.

Sean Ahrens - Mar 11, 2009 10:49:13 am

I like this psychological take on user interfaces. The readings were intriguing, and provided some interesting formal models for evaluating user interfaces. I learned a lot about control variables, experiment validity, generalization, and metric identification. In particular, I was impressed by the idea that you need to have just the right amount of control variables. This was a new concept to me; I had thought that the more control variables, the better. If there are too many, the experiment cannot be generalized; if there are too few, the results will end up with too many confounding variables.

I also liked the idea of focusing on metrics of evaluation -- for example, the average time to perform a task. By comparing an experiment of this nature to the average times table provided in our previous reading, we can relate this dependent variable to the independent variable of ease of use. While narrow, this experiment is focused and can provide a remarkable benefit to the UI designer.
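As a rough illustration of using average task time as a dependent variable, here is a minimal sketch that compares two interface designs. Everything in it is an assumption for illustration: the timings are made-up placeholder data, the names `design_a`, `design_b`, and `welch_t` are hypothetical, and Welch's t statistic is just one common way to compare two means, not a method prescribed by the readings.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Standard error of the difference between the two means.
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / se


# Hypothetical task completion times in seconds, one value per participant.
design_a = [12.1, 10.4, 11.8, 13.0, 12.5]
design_b = [9.2, 8.7, 10.1, 9.5, 8.9]

print("mean A:", statistics.mean(design_a))
print("mean B:", statistics.mean(design_b))
print("Welch t:", welch_t(design_a, design_b))
```

A larger absolute t value suggests the difference in average time is less likely to be noise, but with only a handful of participants the result should be treated as a hint rather than a conclusion.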


