Qualitative Evaluation

From CS 160 User Interfaces Sp10

  • Heuristic Evaluation. Nielsen. (Read the 5 linked pages under the heading: Jakob Nielsen's Online Writings on Heuristic Evaluation).
  • How To Do Experiments. Doing Psychology Experiments. Chap 2. Martin. (in preparation for Wednesday)



Alexander Sydell - 3/5/2010 14:36:01

I liked Nielsen's idea of holding a debriefing session with all of the evaluators, observers, and designers after the evaluation sessions are complete. The evaluators may come up with creative solutions to usability problems in the middle of their evaluations, but would not be able to express them, since during the sessions they are only looking for problems related to usability principles. A debriefing session would allow them, as well as the observers or even the designers themselves, to offer solutions to the problems they found. I think this would make heuristic evaluation more useful because, while it mainly stays within the guidelines of the given design principles, it also allows the evaluators to express other positive and negative aspects of the interface which could be important to keep or fix.

Jason Wu - 3/5/2010 22:21:28

The Nielsen writings on heuristic evaluation were very informative. While the ten usability heuristics he listed seem somewhat obvious, I can already think of a few aspects of my third and fourth programming assignments that I could have improved if I had been thinking about these heuristics at the time. Although Nielsen only talks about heuristics in the context of evaluating user interfaces, I think every UI designer should keep this list handy and refer to it at every step of the design cycle to ensure that target users will actually be comfortable using the interface. Also, I really liked Nielsen's idea to extend the heuristic evaluation method by conducting a debriefing session after the last evaluation session. In a user testing debriefing session, the design team must come up with severity ratings and possible fixes themselves; this can be difficult since the designers have been working hard on the prototype for some time and may not be able to think of alternatives to the current implementation. A heuristic evaluation debriefing session, on the other hand, includes both evaluators and members of the design team. The evaluators, who are not attached to the current design, may be able to offer solutions and suggestions that the design team would not have come up with on their own.

The Martin reading got me thinking about the ability to generalize results. While the contextual inquiry was not an experiment, my group and I did use the results of the interviews to come up with a list of tasks that our app can accomplish as well as a rough idea of the interface. However, since we interviewed only a limited number of potential users in similar circumstances (all young, living in the Bay Area, etc.), I wonder if we can actually generalize our inquiry results to our entire target user group.

Calvin Lin - 3/6/2010 22:08:13

The impression I got after reading about heuristic evaluation is that it is basically the second best option after real-user testing. To me, Nielsen was basically describing a system where a small number of people evaluate the interface on their own, and by having multiple opinions and perspectives, hopefully most of the issues will be spotted. A problem with heuristic evaluation is that a company is most likely relying on people who are not in the actual situations the interface is designed for. These evaluators cannot fully understand the tasks to be performed, nor do they have the mindset of a real user. Even if the designers and evaluators become knowledgeable about the domain, their understanding of how users would want to do things is likely built on assumptions and perceptions.

However, I could imagine positives coming out of this method. Because the ones who are evaluating the interface are not actual users, they could potentially provide a new, fresh, or outside-the-box perspective. This could lead to ideas and observations that a real user may not notice since the real user is likely used to a certain system and way of doing things, and so the way he/she evaluates an interface would be in comparison to expectations and what is already familiar. Someone who is knowledgeable of the domain but hasn’t developed habits and expectations could pave the way for new ways of doing things, instead of just progressing on what is already out there.

Vidya Ramesh - 3/6/2010 22:15:19

In the website readings concerning heuristic evaluation, the author of the various tutorials brought up the point that three to five usability experts are necessary for a good heuristic evaluation. While the number seems reasonable in theory, I would assume it would be extremely difficult to find three free usability experts even in a large company, given that most technology companies don't seem to believe in usability engineering. The author made another point about the need to compare the interface against a set of defined heuristics. I think it might be better to come up with a set of specialized, but still fairly general, heuristics that compile different elements from the various lists of heuristics available. A general list, while useful, would not reveal all the problems associated with the characteristics of the interface the designers want to emphasize. In the chapter by Martin, I was unsure of what he meant by randomization with constraints. The definition he gives, controlling parts of the event assignment while randomizing other parts, doesn't seem very methodical or precise.

Wei Wu - 3/7/2010 11:21:36

The introduction of heuristic evaluation as a means of usability testing makes me wonder whether heuristic or wizard-of-oz testing is more useful for the purposes of our projects in the class. While the first reading argues that the two methods complement each other in finding distinct sets of usability problems, I am inclined to say that wizard of oz testing is more productive for our project since we have been pushed to design apps for such a specific user group. The tasks our applications strive to accomplish are aimed at basketball coaches, so heuristic evaluators unfamiliar with coaching basketball are likely to overlook the usability problems that are associated with the tasks.

At the same time, many of the coaches we select as our users do not own an iPhone and are innately unfamiliar with the widgets of an iPhone. This may make it hard for us to catch the problems that surround our use of the iPhone UI since our users may be confused by elements that we assume a person who owns an iPhone would automatically know how to use. In this sense, heuristic evaluation with iPhone users as evaluators may help to catch these technical problems. But, as the reading points out, it is harder to look for missing dialogue elements and other technical details with paper mock-ups so perhaps heuristic testing is better left for when we have a running system prototype.

Eric Fung - 3/7/2010 13:48:05

The readings tout heuristic evaluation as a cost-effective and easy-to-learn method for addressing the biggest usability concerns with just a small number of testers. Because there is a given list of heuristics to follow, it is easier for nonexperts in interface design to apply the method, if asked. I think this flexibility in whom you can ask to do heuristic evaluation is one of its strongest benefits, as you get a diverse set of viewpoints from a relatively small number of people.

I think the alternation between heuristic evaluation and user testing can be compared to the differences between beta builds/release candidates and alpha/nightly builds. To be cost-effective, you would apply heuristic evaluation internally to clean up the first 'public' prototype, so it can be more representative of a usable product for your actual users. Then you can gauge on a larger scale (with more actual users) how well you hit your mark, without the interruptions of an unstable prototype. It allows you to evaluate your application with two different focuses: do you want to find major design flaws, or touch up rough edges?

Kathryn Skorpil - 3/7/2010 19:36:58

I agree that "Aesthetic and minimalist design" is generally a good guideline for user interfaces. I remember taking an art class one semester in which they recommended that we create a website for the class. Most of the people in the class were actual art majors, while the others were not. Many of the people in the class created very elaborate and detailed web pages that looked like a piece of art from afar, but navigating through the pages was awful. Simple things, like knowing which element on the website was an actual button versus just an image or text, made it difficult to know where to go next. These websites could have been much simpler and still had the same artistic appeal they were going for.

Aneesh Goel - 3/7/2010 19:40:39

The heuristic evaluation method is an interesting alternative means of testing, but having QA engineers or UI specialists perform the tasks seems to introduce problems of its own. The papers correctly point out that some problems will be invisible to those without domain expertise, and suggest that subsequent user testing will fix that; however, they don't discuss the rate of false positives: information that looks misleading or incomplete but would be clear to someone in the target user base, or extra material that is useful to the domain expert but appears to be clutter to the UI expert. Since the developers have often researched the subject and worked with domain experts during their original design, they would be closer to domain expertise than the UI specialist but not experts themselves, making it difficult to distinguish between sound advice and false positives.

One fix to that problem could be parallel user and heuristic evaluation, but doing user evaluations without first taking into account advice from heuristic evaluation cuts into the cost-efficiency. Having an observer present to clarify things to the UI/QA expert also helps, but risks the trap of explaining valid concerns away as not a problem for normal users and influencing the UI specialist's thought process. Another approach that might work is to implement the advice that comes from the heuristic evaluation, and query a domain expert on issues that might be false positives; this is cheaper and less time-intensive than a full user test, but still "wastes" users for other tests, and an opinion on design issues without actually working with and testing the design can easily be inaccurate. Altogether, heuristic evaluations are clearly a valuable tool, but like any other evaluation should be taken with a grain of salt and be treated as another part of one's arsenal, not a magic bullet.

Charlie Hsu - 3/7/2010 21:45:47

The heuristic evaluation readings introduced excellent ways for us to look at our own interfaces cheaply and effectively. Users are indeed difficult to find and schedule, as we are experiencing now with our Lo-Fi Prototype assignment. Evaluators of design heuristics, however, are all around us in a class centered on UI design and development. The ten example usability heuristics are things every student in this class can recognize: we don't need target users like dance groups (for our group's interface) to check these heuristics. Furthermore, we can easily grab a group of 3-5 students to perform heuristic evaluation and maximize its return on cost.

The Martin experimentation reading brought up some useful points that we should keep in mind for the Wizard-of-Oz testing we'll be doing for our Lo-Fi Prototype assignment. Our dependent variable might be the time it takes a user to complete a task, while the choice of task would be an independent variable. The most difficult thing might be maintaining external validity. Since we'll only have around 3 participants, we need to try to get as diverse a sample of dance group coordinators as possible so that we can generalize our results well.

Daniel Ritchie - 3/7/2010 23:50:36

Heuristic evaluation sounds potentially quite useful. However, I can see a few potential problems with it. First and foremost, giving the evaluator a list of "things to look out for" could very well make him or her hypervigilant. Indeed, Nielsen's own studies suggest that evaluators typically find far more minor flaws than major ones. Furthermore, the example Nielsen gives of the type of flaw typically caught is one of "typeface inconsistency." If evaluators typically catch things only this minor, then heuristic evaluation might not be terribly useful for early-stage designs (where these details are not, and should not be, specified).

I was also a bit concerned that a couple of Nielsen's heuristics seem to contradict Raskin's guidelines for human interfaces. In particular, Nielsen mentions confirmation dialogues and (limited) user customizability as good things, whereas Raskin makes convincing arguments against them.

Yu Li - 3/7/2010 23:54:22

Heuristic evaluation is an engineering method used to find usability problems in a user interface design. Unlike traditional user testing, heuristic evaluation allows the observer to answer questions posed by the evaluator and give hints in order to better assess the usability of the program. Heuristic evaluation is centered on discovering the usability of the program, not on finding the mistakes users make when interacting with the program, as in traditional user testing. Furthermore, a small set of evaluators is ideal for heuristic evaluation; anywhere from 3 to 5 is a good number. The reason is that poor results are achieved with only a single evaluator, since a single person cannot find all the usability problems.

Jungmin Yun - 3/7/2010 23:57:05

This reading about heuristic evaluation provides measurable methods. Heuristic evaluation is concise, so it seems easy to apply to a practical application. I think the ten recommended heuristics are especially useful and let us quickly check an interface design. The optional reading focuses on the point that although user interaction analysis is very useful for designing and debugging a user interface, the potential users will have a slightly different set of problems. On the other hand, cognitive walkthroughs also won't find every issue with a user interface, but a combination of these two should significantly improve the end result. This reading also indirectly explains how to apply GOMS analysis to the most frequently used actions and how to do a back-of-the-envelope calculation on a more complicated user interface. I think these articles are really informative and useful, and they will be helpful for the Lo-fi prototype.

Daniel Nguyen - 3/8/2010 0:25:10

When Nielsen discusses the types of usability problems found by heuristic evaluation, he states that heuristic evaluators often overlook major problems, then goes on to speak of the benefits of alternating heuristic evaluation with user testing. However, Nielsen fails to identify whether the major problems mentioned are likely to be found by user testing. It seems to me that if user testing cannot find these major problems, then a third method of testing that could fill in this last gap should be mentioned in the reading. Otherwise, it may be worth it to focus more heavily on user testing during the early stages of design, and on heuristic evaluation later in the design process, rather than alternating at every stage. In this way, major issues can be identified and corrected early, when there is still lots of time and money left to be spent, rather than risking not finding a major issue until the 3rd or 4th stage of user testing, which may be at the end of the overall design process.

Annette Trujillo - 3/8/2010 0:49:32

I think the idea of random selection of participants for a test is a great idea that our group should use when we do our lo-fi testing on the three individuals. It is very important that we pick at random; that way we get more of a variety and a feel for the rest of the population. This way our testing will not be biased toward intermediate skaters, beginner skaters, younger or older skaters, etc. Each of these groups of skaters will most likely react slightly differently from the rest in how they use the app, but within each group the members will react similarly, which is why random selection is a must for our group's lo-fi testing. Also, I think before doing the user testing based on our lo-fi prototype, we should have been required to do a heuristic evaluation of a different group's lo-fi prototype. This would have eliminated bugs, so that when we test our lo-fi prototype on users, we can find other, deeper errors in the interface.
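As a minimal sketch of the random selection idea above, Python's `random.sample` draws participants without replacement from a candidate pool. The pool of skater names here is invented for illustration:

```python
import random

# Hypothetical pool of candidate skaters (names are made up).
candidates = ["skater_%02d" % i for i in range(1, 21)]

random.seed(42)  # fixed seed only so the demo draw is reproducible
participants = random.sample(candidates, 3)  # pick 3 without replacement
print(participants)
```

Sampling without replacement guarantees three distinct participants, which is what an unbiased small-scale test needs.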

Richard Lan - 3/8/2010 1:29:51

Heuristic evaluation is a good way to find usability issues in an interface without having to conduct user tests. Because the focus is on user-interface interaction, rather than the interface's aesthetics, it is possible to conduct a heuristic evaluation using a low-fidelity paper prototype. The difference between a user test and a heuristic evaluation is that the evaluator is usually knowledgeable about accepted rules of thumb for good interface design, and is able to evaluate the interface with respect to those guidelines. All of the guidelines for good user interfaces focus on how the system may be designed to make it easy for people to use. Because there are so many criteria, different people will find different usability problems in the same system, so it is best to have multiple people conducting heuristic evaluations on an interface. At the same time, these guidelines also make assumptions about the domain in which the interface is implemented. Depending on the system, available hardware, and available (screen) space, it may not be possible to meet all of the criteria for good design. The interface of an iPhone is vastly different from that of an IBM mainframe, so the guidelines do not always have the same relevance in all situations. Furthermore, the sheer complexity of systems such as multi-function displays in modern glass cockpits makes the implementation of "aesthetic and minimalist" designs extremely difficult.

The rapid spread of heuristic evaluation as a method of usability engineering was due to its ability to produce massive amounts of useful information for design engineers. It is also usually a cheap process compared to the projected benefits of conducting a usability examination. Moreover, the process outlined in the article is flexible enough to allow for moderate to significant variations on its main theme. Because of this, evaluators don't necessarily have to be professional user interface specialists, and short training sessions can give people the necessary skills to evaluate interfaces.

David Zeng - 3/8/2010 1:50:17

Heuristic evaluation is a very interesting way of doing testing, and the reading clearly points out the pros and cons of using the technique. One additional difficulty that I believe the reading does not quite mention is the evaluator's ability to correctly use the technique. If someone were not familiar with this method, then even with the proper rules in front of them, they could still potentially misinterpret them, leading to finding "errors" that don't actually exist.

Although I have experience with performing experiments, there were still many new terms that were unfamiliar to me. Even though I did not know what they were called, I had previously grouped them all into the "confounding variables" category. One new group of problems is that the selection of your users, or performing the experiment repeatedly, can itself lead to error. This is especially useful to know because I often work with data and data analysis, and it is important to be able to identify potential variables that could skew the results.

Chris Wood - 3/8/2010 9:07:04

I enjoyed this week's readings, but I did take issue with a few things. First of all, when trying to find problems in an interface, the readings say to sample many users because one user may not find all usability problems. This intuitively makes sense, but it seems hard to gauge whether fixing a usability problem for one person may create usability problems for another. The practice of an observer helping the test subject navigate the interface may lead to human misunderstandings that do not reflect the merits of the UI. The diminishing returns from additional test subjects make sense to me, as an economics major and all. Finding minor problems is difficult because users often do not recognize the inconveniences they put up with. Testing in a way that gets good results from people is of course difficult because of the inextricable complexity of their lives. Applying the scientific method to UI evaluation seems problematic. How does one select a control?

Hugh Oh - 3/8/2010 9:49:51

The heuristic guidelines presented in the readings are extremely helpful when producing a user interface. One of the points was on error prevention and having the interface hide or prevent as many errors as possible. However, this could make things extremely difficult to design and/or be detrimental to a user's experience. If an issue is being silently fixed by the computer but might affect the user's experience later, it could leave the user in the dark about what is going on.

Tomomasa Terazaki - 3/8/2010 9:53:36

The readings were on testing and experiments, which we will need for the next group project on the low-fidelity prototype. The article mainly focused on how experimenters should prepare before the experiment. Where and when the experiment takes place is very important, because doing an experiment at the top of a cliff is completely different from doing one on a regular sidewalk. The experimenters need to choose the situation that best suits the users who will actually end up using whatever is being tested. Another important thing to check in an experiment is how the participants' behavior changes through the test. Something that seemed like a problem may seem easier the second time, and as the participants learn how to use it, completion times speed up. These are all factors that are important for a good test.

The online article focused more on what the experimenters should be careful about once the actual experiment starts, through heuristic evaluation. Obviously the most important thing in a test is to find where the problems and errors are, but the article suggests what the experimenter should focus on in order not just to fix the problems but to make the interface better and easier to use than before. The important things to watch for are whether the participant knows what is going on whenever he/she does something on the device, whether errors are handled when he/she accidentally pushes the wrong choice, and whether the interface is clear enough that the user knows what to do even without someone explaining it to him/her. When the experimenters do find a problem, it is important to check how often the problem occurs (frequency), how big of a problem it is (impact), and whether the problem frustrates the user if it happens multiple times (persistence). It is also important to rate each problem so the experimenters know the priority for fixing it. If a problem has low frequency, impact, and persistence, the programmers can leave it until the last second, since it should not have much impact on the program or the users. When I am doing my experiment, it is important to check every move the user makes without discomforting him/her, and to find out whether using the device is fun or frustrating, because a useful application that frustrates the user is not good.

Linsey Hansen - 3/8/2010 11:11:54

So... since I apparently did this reading by accident for last Wednesday and commented on Heuristic Evaluation, I'll just comment on the 'How to Do Experiments' reading. I really did like the parts about randomization, and I definitely agree that without it data will be less accurate. It actually made me wish that we had time to test our application prototype on multiple troops. Aside from the fact that the original two troops we interviewed had boys of different ages, different personalities, and from different regions, they were also run differently, and this reading reminded me that this could have an impact on their experience with our prototype and how they might 'think aloud' for us. Plus, we won't exactly have the option of choosing many "random" participants (but then this is just supposed to be a small-scale test with only a few people). So, while any data we get from this will definitely help us with refining our application, as the reading said, it might not be as general as we would like it to be. So yeah, this reading was definitely helpful.

Dan Lynch - 3/8/2010 12:08:25

Heuristic evaluation is the topic of the reading from the internet. Some interesting aspects are worth mentioning. First, a single individual will never be able to find all usability problems in a given interface; multiple evaluators are required. Some research showed that the hardest-to-find usability problems are often found by evaluators who do not find many other problems, so it is useful to have many people on the testing floor. However, a graph showed a logarithmic-like function depicting the relation between the efficacy of the evaluation and the number of evaluators: beyond a certain number (the author states 3-5), returns diminish. The author then discusses the differences between traditional and heuristic evaluation processes. The main difference is that the evaluator can ask questions during the testing that will be answered. In addition, a cost-benefit analysis was done on heuristic evaluation, which demonstrates a model for optimizing the output from a heuristic evaluation. This optimization is applied to the number of evaluators required for the evaluation to be successful.
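The logarithmic-like curve described above comes from the Nielsen-Landauer model, which treats each evaluator as independently finding a fixed proportion of the problems. The sketch below uses 31% as the single-evaluator hit rate (roughly the average Nielsen reports) and an illustrative total of 100 problems; both numbers are assumptions for the demo, not data from any particular study of ours:

```python
# Nielsen-Landauer model: found(i) = N * (1 - (1 - L)**i), where N is
# the total number of usability problems and L is the proportion a
# single evaluator finds. N=100 and L=0.31 are illustrative values.

def problems_found(i, total_problems=100, hit_rate=0.31):
    """Expected number of problems found by i independent evaluators."""
    return total_problems * (1 - (1 - hit_rate) ** i)

for i in (1, 3, 5, 10):
    print(i, round(problems_found(i), 1))
```

Under these assumptions, five evaluators already find roughly 85% of the problems, which is why the readings recommend stopping at 3-5 rather than paying for more.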

The reading also discusses the 10 usability heuristics. There are two that I would like to mention. The first is visibility of system status. This is very important, especially when using modes. For example, when in a 3D program, you want to know when you have just switched into some mode (and possibly how you switched), so you can switch back if it was unintended. The second is recognition rather than recall. That is analogous to having icons with pictures instead of a command-line interface for creating 3D objects. However, if the command-line interface has a help feature, over time it may be faster, because of the time it takes to click on an icon versus typing on the keyboard.

One of the last topics was severity ratings. Severity ratings can be associated with a given problem and quantified by frequency, impact, and persistence. Market impact is also of importance. This gives us a way to evaluate problems in UI design.
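One hypothetical way to fold the three factors into a single rating is a simple average on a 0-4 scale. Nielsen does not prescribe this exact formula; the equal weighting and the scale here are assumptions for illustration only:

```python
# Hypothetical severity score combining frequency, impact, and
# persistence, each rated on a 0-4 scale (assumed, not from Nielsen).

def severity(frequency, impact, persistence):
    """Average the three factors into one 0-4 severity score."""
    for v in (frequency, impact, persistence):
        if not 0 <= v <= 4:
            raise ValueError("ratings must be on a 0-4 scale")
    return (frequency + impact + persistence) / 3

# A frequent, high-impact, persistent problem outranks a cosmetic one.
assert severity(4, 4, 4) > severity(1, 0, 0)
```

In practice a team might weight impact more heavily, or add a market-impact term; the point is only that a numeric score makes problems sortable by fix priority.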

The last reading was about doing psychology experiments. The first portion of the article discusses the math: randomization and random variables in the context of experimentation. Randomization with constraints is where you completely control certain variables and then completely randomize others. This practice can help reveal results from experiments. However, not all variables can be completely controlled, so the notion of external validity is needed; it represents the generalizability of an experimental finding. This is important when discovering some new topic for the first time: how do you know when it's a law? Repeatability and replication.
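Randomization with constraints can be sketched concretely: fix part of the assignment (here, every participant sees every task exactly once) while randomizing the rest (the order of presentation). The task and participant names below are invented for illustration:

```python
import random

# Randomization with constraints, in the sense Martin describes:
# constrained part  - each participant completes every task once
# randomized part   - the order in which the tasks are presented
tasks = ["add_event", "edit_roster", "share_schedule"]
participants = ["P1", "P2", "P3"]

random.seed(0)  # seeded only to make the demo reproducible
schedule = {}
for p in participants:
    order = tasks[:]       # constraint: all tasks, exactly once each
    random.shuffle(order)  # randomize: presentation order
    schedule[p] = order

for p, order in schedule.items():
    print(p, order)
```

Shuffling per participant spreads learning and fatigue effects across task positions instead of letting them pile up on whichever task always comes last.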

Matt Vaznaian - 3/8/2010 13:24:49

What defines a "usability problem"? It seems that the users are the ones who define how severe usability problems are, since they are the ones using the products. I feel that a major usability issue arises when all tested users report the same thing as a problem. For example, if the majority of users feel that a particular dialogue is confusing and needs to be changed, then it should in fact be changed for the benefit of all users. Usability problems are subjective in that sense, because there are no right or wrong usability features; it depends on the class of users. I am fascinated by the discussion of typeface inconsistency as a minor usability problem. I understand that a complete change in text font can throw a user off, but I have a hard time believing that a small change has any effect on the time it takes users to process the information being presented, despite what formal tests show.

Divya Banesh - 3/8/2010 13:51:02

I really enjoyed reading Nielsen's writing on heuristic evaluation. I agree with him that just one user cannot be used to evaluate the interface and that a group of users is best. I also think that users should be picked carefully according to the needs/target group of the application. For example, users that fit the target group of the application should be selected so that there is still a wide range of the types of people in the group. If the application is a personal trainer application, users of different educational backgrounds, different levels of training, different genders, age levels, etc. should be considered.

Nathaniel Baldwin - 3/8/2010 14:13:11

While the Martin article was a great introduction to psychological testing methods and pitfalls - clearly and simply written, and fairly exhaustive for its brevity - I don't have too much to say about it, as it was review for me. The heuristic evaluation materials, though, were a bit more interesting. Definitely the thing that struck me the most was, from the "10 heuristics" list, the following: "Accelerators -- unseen by the novice user -- may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions." I found myself nodding along emphatically and thinking "Take that, Raskin!" The list also reminded me how often I wish that less-than-robust programs that I use had some sort of undo/redo functionality. One question I had about the first document we read was, "is five enough?" Looking at the data from the example they start with (which used 19 evaluators), it seems to me that if they had picked the "wrong" 5 people (e.g. people that were too similarly successful/unsuccessful) they would have missed out on a number of problems, but the "right" 5 people would've given a much better variety. Understandably, in the real world, cost is an issue, but I was left wondering if there should be some attempt to pick people of a variety of skill levels.

Michael Cao - 3/8/2010 14:32:47

In today's reading, I found it interesting that in a heuristic evaluation the evaluator is allowed to ask any question they want, while in traditional user testing the user isn't. I thought the whole point of not answering questions in user testing was to reveal any problems the user interface may have that might have confused the user. However, an evaluator in a heuristic evaluation won't be able to focus on any of these problems, because they are allowed to ask any questions they want. I also found it interesting that the evaluators are not actually using the systems they are evaluating, whereas users do. I feel that, because of this, users testing the system would be able to find many more problems and bugs than heuristic evaluators would.

Brian Chin - 3/8/2010 14:34:17

I found this week's reading on heuristic evaluation interesting, but perhaps not too useful. I feel that traditional user testing is more useful in determining the problems with an interface, for a variety of reasons. For one thing, user testing involves people who have never used the interface before, with limited to no help. This gives a realistic picture of the public's reaction when the product actually goes to market. In heuristic evaluation, it seems that much more help would be given, perhaps skewing the data. Another problem is that in heuristic evaluations the evaluators are supposed to look for features that violate usability guidelines, which I feel is too strict a standard. In traditional user testing, features that violate these guidelines would be fine as long as the users didn't have problems with the violations, a criterion that I feel is much more flexible and useful. Finally, the ability to measure the severity of usability problems is compromised in heuristic evaluations, where severity is measured by a questionnaire after the test. That data would seem to be not as accurate as watching users test the interface and seeing how much they struggle in figuring out each feature.

Wei Yeh - 3/8/2010 14:39:31

Nielsen's "10 Heuristics for User Interface Design" is a very useful guideline for evaluating the quality of a user interface. Although an interface that follows all of these principles is not guaranteed to be a good interface, an interface lacking one of these is almost certainly one that needs improvement. Thus, these 10 heuristics are very useful for identifying areas of improvement. However, these heuristics do very little to help the designer figure out exactly how to improve on the interface. It would be great if Nielsen could expand on this and provide suggestions on ways to solve these usability problems.

Alexis He - 3/8/2010 14:41:29

On Jakob Nielsen's list of "ten heuristics for usable interface design", I found it enlightening that the "undo" and "redo" functions made it near the top of the list at number three. I agree that these are important functions to include, but rarely do websites have this feature. The only ones I can recall are Google's Gmail and Calendar, where each action taken is highlighted at the top with an "undo" button. I've found it very useful and not obstructive at all, which leads me to wonder why more websites don't implement this.

Jessica Cen - 3/8/2010 14:51:25

I agree with Nielsen when he points out that the minor usability problems encountered in a heuristic evaluation are still relevant. As a user, I find it frustrating when I encounter irregularities in an interface, such as the inconsistent typography example.

These types of distractions can also affect the way the user completes the task. For example, if the program pops up error messages in dialog boxes, then I as the user will start relating error messages with dialog boxes. However, if similar error messages also show up somewhere else, like in the console, then there is a chance that I overlook them, since I am focused on receiving error messages in a dialog box. Therefore, consistency is important, since that is how the user is guided when using the program.

Peter So - 3/8/2010 15:15:32

Heuristic evaluation is an optimization process: it is used to find the best way to present information to a target audience by seeking out problems through evaluator feedback. The 10 heuristics are essentially a sanity check for design. I feel it was important to see the breakdown of project development costs, as it offered an alternative to the standard "cookie cutter" recommendation of a minimum of 3 focus group sessions: by identifying the value of each session, it gives criteria for deciding when data from just one session is enough. This reading helped put many of the recommendations in class into perspective, as you can associate costs with developing different features of a design.

Saba Khalilnaji - 3/8/2010 15:51:12

Sparing users is one of the great benefits of heuristic evaluation. There's no point in using target users to test the usability of an interface when they could instead be used to test its effectiveness. When you have several people evaluating an interface, severity ratings are a very effective means of collaborating on the individual findings. I can see problems arising when individuals report issues that might not be problems at all, but rather matters of opinion. For example, a user might call a missing feature a usability problem: is it a problem or just a missing feature? What if the absence of that feature greatly reduces the capabilities of the interface? Is it still a usability problem, or just a missing feature, perhaps addable in another version? Companies that don't spend time evaluating their software before release really hurt themselves, because its cost-effectiveness advantages are too good to pass up!

Sally Ahn - 3/8/2010 15:57:45

At first, I felt doubtful about heuristic evaluation because it seemed to contradict the focus of user testing. However, after further reading, I see that there is indeed a lot that can be learned through heuristic evaluators that might be missed in user testing. As the reading mentions, iterating between the two methods can yield better results. The reading mentions that 3-5 evaluators seem to result in higher benefits-to-cost ratio, but it mentions little about the evaluators themselves. Would evaluators with more experience be better at spotting usability problems? Or would they become accustomed to looking for certain "patterns" of problems and thus tend to miss a more unique problem? I would like to know more about how such evaluators would be chosen.

Owen Lin - 3/8/2010 16:19:27

Using heuristic evaluation seems like a really good way to focus lo-fi prototype test sessions. By giving the tester a set of guidelines at the end of the session, the tester can hone in on UI problems and is able to explain why those problems are indeed problems. The ten recommended heuristics seem like a perfect way to judge whether your application fits within the general standard of usability, which is something my group and I can work with to try to create the most user-friendly interface possible. I think a thorough way to test a prototype would be to have a lo-fi prototype ready, have users run through it and execute certain functions without paying attention to heuristics to identify general, glaring problems, and then have a second run-through with the heuristic guidelines in mind to pinpoint more specific problems. At the end, the user would rate each problem they found in terms of severity. After all this, we could evaluate all of the users' responses, revise the interface accordingly, and test all over again.

Jeffrey Bair - 3/8/2010 16:31:26

An interesting thing that Nielsen mentions is that heuristic evaluation sessions are different from traditional user testing. What I initially thought to do with people trying out our app was to just let them try it and see what problems they run into. However, Nielsen brings up an interesting point: people are unwilling to risk doing the wrong thing and want to do as little as possible in order to show that they aren't dumb, which makes sense. Also, in terms of manpower, it only takes a few people (around 5) to find most of the usability flaws. From the analysis of the number of evaluators, we can see that it doesn't take many people to find most of the problems in a user interface. I wonder, however, if it scales correctly, since a simple iPhone app such as Hello World could easily be checked by maybe even one person, but trying to check Windows Vista with just 5 evaluators doesn't seem like a good idea.

Spencer Fang - 3/8/2010 16:33:46

In the list of Ten Usability Heuristics, one of the sections states that accelerators are unseen by the novice user, and may speed up interaction for expert users. I believe GTK has an interesting implementation of GUI accelerators. If a user hovers over a menu item such as "New File" for a few seconds, the accelerator shortcut can be redefined by pressing any keystroke combination. This offers great flexibility, but allows the user to create an interface that is non-standard and will not be available if the user moves to a different workstation.

Conor McLaughlin - 3/8/2010 16:43:07

Not much to say about the Martin reading besides it being a nice breakdown of the basics of scientific experimentation. Nielsen, however, makes a compelling argument for the use of heuristics. Often personal evaluations of UIs have taken place without a defined list of what and how the interface should be evaluated, so to offer structure to the process is a simple but successful move. The data on how little overlap occurs between evaluators' findings was also very interesting. It shows the concrete need for a team in order to design clean and successful interfaces. I'm also glad Nielsen addressed the need for user testing and how heuristics serve as a supplement to it. My group is currently having trouble finding unbiased users to conduct tests with, so this offers a nice intermediate point where iteration upon the design can still take place.

Arpad Kovacs - 3/8/2010 16:46:23

Heuristic evaluation seems to be the 'lo-fi prototyping' approach to usability testing: quick, cheap, but effective. I thought that the list of 10 heuristics was extremely useful; the theme here seems to be providing the user with relevant information that provides the illusion of control, without any intimidating, complex decisions. I especially like the recommendation to visibly show system status; many users feel that their computers/electronic gadgets/programs are black boxes, and would probably feel more comfortable and in control if they could understand what is going on inside. I thought that the heuristic evaluation two-pass method of a broad overview followed by exploration of specific usability issues is quite clever. In many traditional testing situations, participants are only given one chance with the interface, which does not capture the fact that in real life people try again when they make a mistake. I also found the cost-benefit analysis intriguing; I was surprised that the rate of return diminishes so quickly after only about 5 evaluators.

The How to Do Experiments chapter was less interesting, possibly because I've been exposed to the scientific method for so long that it just seems common sense now, although it was nice to see a few concise definitions of these familiar concepts. Since we are not conducting a long-term experiment on hundreds of users, I don't think that we have to worry about statistical regression to the mean, mortality, or maturation; however, the passage on threats to internal validity seems quite relevant to user selection for low-fidelity prototype testing. Due to the small number of users we are dealing with, we need to be especially careful of how each user's past experiences shape their current perceptions, as well as biases due to self-selection (e.g. volunteer users are probably more enthusiastic and eager to learn how to use a new interface, so expect the typical user to be more frustrated/less cooperative). Finally, this seems like a good reminder that we need to conduct the testing in a contextual, and reasonably routine, situation (think master-apprentice model) to avoid sensitizing participants, even though in this case we are not actively trying to confirm a particular hypothesis.

Brandon Liu - 3/8/2010 16:47:23

Nielsen describes ten heuristics for usability, but doesn't really talk about the compromises between each of them.

1. Visibility of system status
2. Match between system and real world
3. User control and freedom
4. Consistency and standards
5. Error prevention
6. Recognition rather than recall
7. Flexibility and efficiency of use
8. Aesthetic and minimalist design
9. Help users recognize, diagnose, and recover from errors
10. Help and documentation

Sometimes these heuristics can contradict each other. For example, combining visibility of system status with a minimalist design is a challenge the designer faces. Also, keeping words and actions consistent with standards can be difficult if the designer incorporates "accelerators" to make the interface more flexible and efficient. In practice, it seems that heuristics (1), (6), and (8) are the most difficult to integrate, especially keeping a minimal UI alongside (1) and (6). Finally, error prevention, error recognition, and help and documentation are often ignored compared to the other heuristics; I think the hope is that these features will not be necessary if everything else works, but some of the time it's just because designers need to ship a "minimum marketable product".

Long Do - 3/8/2010 16:47:28

I'm confused by the difference between user testing and heuristic testing. Is the only difference that in user testing the observer is not allowed to help whatsoever? It seems that as we make several iterations of our project, we will have to have several heuristic evaluations as well, until the near-final product, where we would do user testing. This might be difficult if our target users are hard to reach or the functions hard or costly to reproduce. If the model we're testing is still lo-fi, like a paper model, should the experimenter explain what the picture is showing, or should the tester just play with the model and try to understand it on their own? The latter might be more difficult, since the lo-fi model would not be as well-polished and self-explanatory.

Joe Cadena - 3/8/2010 16:54:43

Comparing low-fidelity prototyping with heuristic evaluation raises a couple of questions for me. First, I understand that finding usability problems during heuristic evaluation is hard to do using paper models, which implies a working interface is recommended. If true, can the same users from the low-fidelity prototyping tests also be used for the heuristic evaluation? Second, if only one of the testing methods were feasible due to time constraints, available funding, etc., which method is preferred, and would a mixture of the two be useful?

Mikhail Shashkov - 3/8/2010 16:57:51

One of the ten heuristics is to provide help and documentation. I was wondering if there have been any ingenious solutions to this problem for mobile devices, which not only provide minimal space within which to place instructions but also thrive on the notion of simple and quick use that doesn't require instructions. Basically, what is the best way to make this heuristic relevant for mobile applications, especially more complicated ones with lots of functionality that basically REQUIRE instructions in order to learn everything they can do?

Richard Heng - 3/8/2010 17:03:55

This seems like it would be an effective system. The only problem I see would be reconciling conflicting heuristics. For example, visibility of system status and aesthetic, minimalist design could be in conflict: the system might be large and complex, and it would not be possible to show its status without sacrificing some simplicity. I would imagine this would force multiple iterations to converge on the maximally usable interface.

Raymond Lee - 3/8/2010 17:11:31

I believe these usability test guidelines are intuitive to follow, and it does seem it would save money and time. The included charts would likely be helpful in staging a scaled usability test in regards to number of evaluators, etc.

The (averaged) severity ratings yield a concise and quick method of determining the most damaging UI problems, and it seems like it would be easy for evaluators to generate this rating data after sufficient time with the interface.
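The averaging step is easy to mechanize. As a small sketch (the example data and function names here are my own, not from the reading), here is how a team might combine per-evaluator severity scores on Nielsen's 0-4 scale and rank the problems by damage:

```python
# Hypothetical sketch: combine per-evaluator severity ratings
# (Nielsen's 0-4 scale) and rank usability problems, worst first.
def mean_severity(ratings):
    """Average one problem's severity ratings across evaluators."""
    return sum(ratings) / len(ratings)

def rank_problems(ratings_by_problem):
    """Return (problem, ratings) pairs sorted by mean severity, descending."""
    return sorted(ratings_by_problem.items(),
                  key=lambda item: mean_severity(item[1]),
                  reverse=True)

# Each problem maps to one rating per evaluator (invented example data).
ratings = {
    "inconsistent typography": [1, 2, 1],
    "no undo after delete": [4, 3, 4],
    "jargon in error message": [3, 2, 2],
}
for problem, scores in rank_problems(ratings):
    print(problem, round(mean_severity(scores), 2))
```

Nielsen's point about averaging is that individual ratings are noisy, so the team acts on the mean across several evaluators rather than on any single evaluator's score.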

Andrew Finch - 3/8/2010 17:14:21

Heuristic evaluation can be a very effective and relatively easy method of rating and refining a user interface. The Nielsen article lists the ten most recommended heuristics to be examined while testing an interface, which do seem relatively appropriate, but also grossly generalized. The last of Nielsen's ten recommended heuristics was documentation and help. I believe this feature can be important in rare cases, but in my experience I have found that most often UIs are better without any documentation. Reading documentation is a tedious process and is often never done at all by users, and it is very difficult to create documentation that is clear and effective for most users. Even for interfaces that have a lot of functionality and are generally complex, the interface should make it very obvious how a certain task is to be performed and should never rely on any instructions or help being available. The description of this heuristic in the article does touch on this, but I don't even think documentation should be a heuristic.

Andrey Lukatsky - 3/8/2010 17:22:59

Although Nielsen states that a large portion of people were using usability inspection methods after being introduced to them for the first time, it seems they are not as widespread as he would like, judging simply from the fact that I hadn't heard of them until this class. As I was reading, I thought it would be very interesting if someone created a service that did these evaluations for clients' products (perhaps QA companies are already doing this). Then I began to wonder whether it would be better for a company's own employees to do these evaluations vs. outsourcing them. Perhaps we could discuss the two alternatives in class.

Victoria Chiu - 3/8/2010 17:28:50

We want to be able to control factors besides independent variable so we are sure that the results are only affected by the independent variables. But at the same time, we do not want to control too many factors so that we can generalize the results of the experiment. The generalizability of the results of an experiment is called external validity. If we control all the circumstances, the external validity will be very small. We want to make the circumstances vary, but not biased. Therefore, we make these variables random.

Angela Juang - 3/8/2010 17:29:53

Are heuristic evaluations done by outside users made any less necessary if the producers of the interface are part of the target audience as well? It seems that if the people who are making an interface are part of the target audience, they are already performing heuristic evaluations on their interface by themselves without having to consult outside users. Therefore, I'd be tempted to think that groups like this find it less necessary to ask a large number of other users for input. However, I've heard that this doesn't actually make any difference because whenever something is your own idea, you can't criticize it as completely as someone who's not involved with the project can. I'm not sure which of these opinions is correct, but in any case, getting as many users to evaluate the interface as possible is always a good idea. It's better to get too much data than too little.

Bryan Trinh - 3/8/2010 17:30:01

The Heuristic Evaluation reading gives an overview of an alternative method of testing usability based on specific design principles, or heuristics. I think this is a direct and straightforward way of testing certain aspects of your UI in a granular way. Using this method, observers can focus their attention on optimizing the design principles by receiving direct feedback on these attributes. In contrast, a typical usability test requires that the evaluator discover how to use the UI without aid. They both have their place, though, and by switching between the two in an iterative process, the UI can evolve to optimize the heuristics as well as pinpoint major design flaws.

In my experience, by continually going through these heuristics during the design process, the designer can already find potential problems in the UI.

Wilson Chau - 3/8/2010 17:31:43

The readings we did for this lecture had a lot to do with the assignments that are due soon. The Martin reading was more related to our lo-fi prototyping assignment; it was about psychology experiments and all the different variables and methods involved. In a sense, for our lo-fi prototyping we are also doing psychology experiments: we are testing to see how users interact with our interface, and using this we will try to make it better.

The Nielsen reading was about heuristic evaluation, and this is exactly what we are doing for our other assignment. This reading was really good for helping me understand the process and breaking it down into parts. It went through some terms that will be useful in my assignment.

Esther Cho - 3/8/2010 17:33:26

I noticed a huge factor in heuristic evaluation is money. Of course, industries don't want to invest in something that would lose money (the article addresses these concerns by bringing up user satisfaction and better reviews), but do the factors change when money isn't a concern? I also noticed that the article claims that both heuristic evaluation and user testing are necessary. There wasn't much detail on what user testing is, though I'm guessing that's what most companies already use for usability testing, but I was wondering if there are experimental results on the effectiveness of user testing like those given for heuristic evaluation?

Vinson Chuong - 3/8/2010 17:37:30

Heuristic evaluation is described as a cheap way to gain insight into usability issues in lieu of or as a supplement to user testing. Lo-fi prototyping is a cheap way to produce an interface that can be evaluated. It seems that we now have a very cheap way to usefully iterate a design with only the design team. Just how much can we get out of lo-fi prototypes and heuristics alone? How much would we lose by replacing a few user testing iterations with heuristic evaluation iterations? Although the readings have established the effectiveness of heuristic evaluation, we don't really have much information on the relative effectiveness of these methods we've been given. I think it's worth discussing all of these methods in context with each other so we can get a feel for which methods to use in which instances to obtain the information we want.

Mohsen Rezaei - 3/8/2010 17:46:24

In designing a user interface, there are variables that a designer should be concerned about when testing or changing the design. The first and most important is testing the interface itself. This step matters because as a system gets tested in different environments with different users, problems show up that the programmer or designer never thought of. After that, shuffling and randomizing the design helps the designer see reactions to different types of design for a specific topic/area: one user might be happy with one type of interface and another with another type. If we choose a set of users for testing the design, it is recommended to use the same users in different environments, times, or places; this will reveal a lot of the hidden properties and/or dark sides of the design. Furthermore, using users who are connected to the design's domain is important, because users who are not would not feel comfortable or in place during testing, and might not understand what the interface wants them to do since they've never been in that situation. Age is another aspect to consider for users connected to the area of study: we can see how users' reactions toward the same tasks and topics change as they age. An example is the claim in the reading that left-handed people die earlier than right-handed people; although this might not be a true fact, it's important to keep such comparisons in mind when testing the design.

Geoffrey Wing - 3/8/2010 17:47:43

I first heard the term "heuristics" in CS188: Artificial Intelligence. We would use heuristics to find the best route through the Pacman course. Here, the concept generally still applies. We can use heuristics to help us find the optimal user interface - users should have a simple time navigating through their tasks.

Of the ten heuristics for user interface design, I feel like "help and documentation" is often undervalued by software designers. I often find myself looking to the help section for programs, when I do not know what a feature does or how to accomplish a given task (this is a UI problem as well). More often than not, the help documentation gives minimal information, and I am then usually forced to google my problem.

Long Chen - 3/8/2010 17:48:43

Heuristic evaluation is a method that should be applied iteratively in any design process, since so many problems can be solved ahead of time. The writing said that any number beyond 4-5 evaluators will not yield that much more information. I wonder if the formula presented could be expanded to include the complexity of each design, such as the number of interactive points or number of pages/screens. I would naturally imagine a simple "hello world"-esque program would not need as many evaluators as a much more complex BART application.
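The formula in question is Nielsen's simple model of problem discovery: if an interface has N total usability problems and a single evaluator finds a proportion L of them, then i independent evaluators find about N(1 - (1 - L)^i). A quick sketch of the diminishing-returns curve (the value L = 0.31 is the average Nielsen reports across his case studies; the function name is my own):

```python
# Nielsen's model: i independent evaluators, each finding a proportion L
# of all problems, together find about N * (1 - (1 - L)**i) problems.
# L ~ 0.31 is averaged over Nielsen's case studies.
def proportion_found(i, L=0.31):
    """Expected fraction of all usability problems found by i evaluators."""
    return 1 - (1 - L) ** i

for i in (1, 3, 5, 10, 15):
    print(i, round(proportion_found(i), 2))
```

With these numbers the curve flattens quickly: one evaluator finds about a third of the problems, five find about 84%, and fifteen are needed to approach all of them, which matches the 4-5 evaluator recommendation above. Design complexity could enter the model the way Long suggests, e.g. by lowering L for interfaces with more screens or interaction points.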

Jordan Klink - 3/8/2010 17:49:58

The reading primarily had one main effect on me, and that is simply the importance of testing when developing a product. Just today in my software engineering lecture we were discussing the importance of testing and different methods of implementing tests. At face value, qualitative evaluation is simply another method of testing a product, albeit a very effective one. Getting an outside perspective is a huge advantage, since you receive analysis from someone unfamiliar with your product, just as your customers will (initially) be unfamiliar with it. The sooner this is done, the better off you are: you'll be able to focus on the problems earlier in the design process and thus design a better product later on.

bobbylee - 3/8/2010 17:57:50

In his essay, Jakob Nielsen says it is better to alternate between user testing and heuristic evaluation iteratively, since that will cover both usability problems and domain-specific problems. My point of view is that if heuristic evaluation is good at locating major usability problems, we don't really have to do it iteratively. Probably we just need one heuristic evaluation at the beginning, followed by a series of user tests if resources are limited. I believe that if a product is designed for a specific group, it is more important to get that group's feedback as many times as possible, since the major usability problems were already pinpointed in the first heuristic evaluation.

Kyle Conroy - 3/8/2010 17:58:10

While qualitative and heuristic evaluation are effective techniques for diagnosing user interface designs, these methods seem aimed at larger companies with spare resources to allocate to them. I can't see a small startup (fewer than seven people) having either the resources or the time to conduct reviews such as these. I think time is the biggest problem, as many startups are focused on staying agile and creating value via features and/or support. I feel there should be a quick and dirty method that, while less effective, could be executed within an hour's time span, without taking up precious resources.

Weizhi Li - 3/8/2010 17:59:10

In his article, Nielsen describes heuristic evaluation, a technique used to evaluate usability. It can help a designer evaluate a user interface and check the prototype against existing products. I think it is important to note that industry constantly uses this kind of evaluation because of the wide technical user base and the rapid pace of development. For instance, many software products go through a beta testing phase, where users report any problems they have with the product, and many products have sequels that use feedback from consumers to improve on the older version.

Jeffrey Doker - 3/8/2010 17:59:20

These readings were fairly straightforward. I was surprised at how data-driven the heuristic websites were, given that the content seemed so, uh, heuristic. Knowing how many test users vs heuristic testers to use will be useful for the upcoming paper prototyping though.

The experimental design paper was kind of surprising to see, simply because I hadn't before thought of our iPhone app development as being experiment based. I am curious to apply some of these techniques in how we test our prototype (i.e. presenting different modifications to different test users based on a controlled experimental design).

Darren Kwong - 3/8/2010 18:00:02

Learning about heuristic evaluation would have been beneficial in the design of some programming assignments. It appears that most respondents in the survey used user testing. I'd like to see more info and statistics on user testing, like those given for heuristic evaluation. I'm also unclear on the difference between heuristic evaluation and user testing: is it a distinction between looking for usability issues and doing normal usage testing? How will our assignment on heuristic evaluation differ from user testing?

Jonathan Hirschberg - 3/8/2010 18:06:39

We're all performing our own psychology experiments with our interface designs. We have to select users and administer two tests: first we follow them around and observe them as they do their work, and then we tell them to use our interface and see if they understand how to use it. But in order to be generalizable to the larger population, the sample has to be random; otherwise there may be threats to validity. For example, since participation is voluntary, the users who volunteer for the test might be different from those who do not. And if you end up not using the same users for both the contextual inquiry and the prototype testing, could it be argued that it's a mortality problem? Not in the sense that you'd get different percentages of success and failure if you don't count those who dropped out, like in the confrontation example in the reading, but rather in the sense that a different user, whose words and thoughts were not used to inform the design, ends up using the design. The people we interviewed in the contextual inquiry are not the ones who end up using the prototype. What are the ramifications? Also, when conducting heuristic evaluations, we need to be careful of other sampling issues. Each look at the interface may be biased in some way, and everyone has a different take on the subject. So maybe the selection of examiners needs to be random, or at least all need to do it independently so that they won't be influenced by the others or be susceptible to groupthink.

Richard Mar - 3/8/2010 18:12:03

The reading on how to do experiments reminds me of the X-lab here at Berkeley. The X-lab performs social experiments, but these differ from psychological experiments in a very important way. X-lab experiments try not to deceive the participants in any way, and the primary motivating factor for participants is to earn as much money as possible.

Boaz Avital - 3/10/2010 2:02:30

I really liked the suggestion to have users testing your application to run through given tasks twice. The first time a user does something may be rough, but if they're extremely comfortable by the second time it says a lot about your application and its learning curve. Conversely, if the second time is almost or just as bad, you know you have a lot of changes to make. Running through it once won't give as much information.
