PilotStudy-Group:BluJay
From CS160 User Interfaces Fa06
Blujay Flashcards Pilot Usability Study
Latest working package: FlashCards.zip (refer to README.txt for instructions)
Introduction (5 points)
"Functionally, its great!"
"I dont use flashcards often however if there
was a program like this available then I would
be more inclined to study with flashcards"
-Freshman Engineer Jeff
System Introduction
We are evaluating the second iteration of the BluJay digital flashcard system. Our target user group is students who study subjects in which flashcards are frequently used as an effective study tool, such as chemistry, math, biology, and various foreign languages. During our research on flashcard usage, we found that many regular flashcard users recognize how difficult it is to organize and study their flashcards efficiently. This system is designed to address several of these difficulties with paper flashcards by improving organization, allowing a randomized study order, and keeping track of the user's comfort level with each flashcard in the deck. The system is also designed to leave the basic paper flashcard workflow as unchanged as possible, so that users will not feel they are making any sacrifices by using our software and will never be worse off than if they were using traditional paper flashcards.
Purpose and Rationale of the Experiment
The purpose of this experiment is to study how users interact with our system and determine what interface modifications we can make to increase user comfort and improve efficiency. Specifically, we have implemented three primary aspects of the program and want to evaluate them with people unfamiliar with our system. We would like to evaluate how easily they learn to use the system, as well as which features they feel are missing or burdensome. We intend to do this by monitoring the subjects as they produce flashcards using the Anoto pen on our paper interface. Afterwards, we will observe how well they can manipulate the cards in the browse mode and then how effective they find the study portion of the system.
Implementation and Improvements (15 points)
The largest new feature we have added to our prototype is the fully implemented pen-and-paper digital interface. In the interactive prototype that we presented last week, the import function was not fully operational. Now the streaming feature is fully operational. This worked out much better than our initial vision for the device, which used batch stroke processing. There are two major reasons why the streaming solution is superior. The first is that producing even a reasonable number of unique flashcards would be a major hurdle with the R3 toolkit. The toolkit is fundamentally designed to produce unique patterns on a single page. It is not inherently good at setting up a batch solution with an open-ended number of flashcard patterns and corresponding XML reference documents. The second is that streaming greatly improves the feel of making digital flashcards. With our streaming solution there is no ambiguity about whether the card was correctly recorded or whether it looks right. The import screen shows the flashcard you are working on and shows exactly which strokes the pen has read. This means that if there is a problem with the pen or the Bluetooth connection, the problem will be clear immediately and will not be a surprise later. Even if the import fails to start (a worst-case system failure), the flashcards and pen still work in their original capacity as plain pen and paper.
In addition to the basic stroke capture, the JPEG storage and image manipulation tools are now fully implemented. In the previous version, we had fields where the user could describe their flashcard flipping style; however, we had no code to process this preference so that the software would interpret the cards correctly. In this most recent prototype we have finished the image processing tools, so that the flashcard can be flipped horizontally or vertically and the strokes will be displayed correctly on the computer screen.
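The flipping preference can be thought of as a coordinate transform on the captured strokes. The sketch below is one possible way to express that, not the BluJay implementation: it assumes strokes are lists of (x, y) points in card coordinates, and the names `rotate_180` and `orient_back_side` are hypothetical. A card flipped over its top edge has its back written upside down relative to a card turned like a book page, so the two styles differ by a 180-degree rotation.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def rotate_180(strokes: List[Stroke], width: float, height: float) -> List[Stroke]:
    """Rotate every stroke point 180 degrees about the centre of the card."""
    return [[(width - x, height - y) for (x, y) in stroke] for stroke in strokes]


def orient_back_side(strokes: List[Stroke], width: float, height: float,
                     flip_style: str) -> List[Stroke]:
    """Return back-side strokes oriented for on-screen display.

    Assumption: the printed pattern is laid out for a horizontal flip
    (turning the card like a book page), so only vertically flipped
    cards need to be corrected.
    """
    if flip_style == "horizontal":
        return [list(stroke) for stroke in strokes]
    if flip_style == "vertical":
        return rotate_180(strokes, width, height)
    raise ValueError(f"unknown flip style: {flip_style!r}")
```

Which flip style counts as the uncorrected default depends on how the cards are printed; the choice made here is only for illustration.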
In the previous iteration of the interactive prototype, much of the underlying card handling was not implemented and only the skin was in place. Now the user interaction window is much more fully connected to the underlying flashcard engine. For example, in the browse panel, most functions, including delete, copy, move, and set familiarity level, are fully functional. Also, in the interactive prototype we employed placeholders to represent the flashcards in the system; now those placeholders are replaced by real flashcard data that can be browsed by category. The browse features, together with the completed import, have made this a usable program for organizing flashcards.
An additional feature that we implemented in the most recent version of the software is the study portion. In the previous version the interface looked complete; however, most of the controls did not serve any function. Now the study controls, such as next, flip, and the different card viewing options (front/back/both sides), are fully functional, although the familiarity settings on the study panel were not functional for the testing. Completed features include a 'return to options screen' button in the study mode. Previously there was no way to get back to the options screen once the 'study' had started. Another feature is the disabling of traversal buttons that don't apply. This behavior helps the user determine when they've reached the end of a list, so that the user doesn't have to keep clicking 'next' when next doesn't exist!
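The traversal-button behavior can be sketched independently of any GUI toolkit. This is a minimal illustration, assuming a cursor over an in-memory list; the class name `CardList` and its methods are hypothetical and are not taken from the BluJay source.

```python
class CardList:
    """Minimal stand-in for a list of cards with a cursor (hypothetical)."""

    def __init__(self, cards):
        if not cards:
            raise ValueError("a study list needs at least one card")
        self.cards = list(cards)
        self.index = 0

    def has_next(self):
        return self.index < len(self.cards) - 1

    def has_previous(self):
        return self.index > 0

    def next(self):
        # Stay on the last card if there is nothing further to show.
        if self.has_next():
            self.index += 1
        return self.cards[self.index]

    def previous(self):
        # Stay on the first card if there is nothing earlier to show.
        if self.has_previous():
            self.index -= 1
        return self.cards[self.index]

    def button_states(self):
        """Which traversal buttons the interface should leave enabled."""
        return {"next": self.has_next(), "previous": self.has_previous()}
```

A study panel could call `button_states()` after every move and enable or disable its buttons accordingly.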
The current version of the software also fixes several problems with error recovery. In the interactive prototype, there was no way to go back and alter certain preferences: once certain preferences were set in the study mode, they would remain in effect until the software was restarted. In this version, clicking on the study tab reinitializes the study preferences to their defaults and allows the user to set up the study from scratch.
One aspect of this prototype that remains Wizard of Oz is updating the flashcard list. After the user has finished the browse tasks (for example, after moving the math flashcards out of the GRE folder), one member of the group quickly restarts the program so that the file list is reloaded. We only learned about this minor problem during the pilot testing, when we saw the math cards still presented in the same group as the GRE cards. We will fix this in the next version of the software.
Method (10 points)
Participants
All the participants were selected from the Unit 2 dormitory, and many of them were recruited in the Unit 2 study lounges. We required our pilot testing users to have at least some experience with making and studying flashcards.
CHC: Male, CS major, Freshman, Has used flash cards for SAT & biology, Interviewed for lo-fi testing.
CF: Female, Architecture major, Freshman, Uses flash cards to study for various classes, First time user.
JL: Male, Natural Resource major looking to switch to Engineering, Freshman, Sometimes uses flash cards for studying class material, First time user.
RC: Male, EECS major, Freshman, Used flash cards for SAT in high school but has barely used them since, First time interviewee.
Apparatus
The equipment we used was a Dell laptop computer running Windows XP. For the interface we used the integrated trackpad and Nokia's Anoto digital pen (Model SU-16). We connected the pen to the computer wirelessly using a Bluetooth dongle. We provided each subject with two plain cards to use as a control for their flashcard making process, and 9 BluJay flash cards (8 needed, 1 as a spare). The experiments were conducted in the Unit 2 study lounge where the users normally study. We employed a digital stopwatch to record the time it took the different subjects to complete a flashcard.
Tasks
- Create & Import (difficult)
The first task we asked of our subjects was to produce a sample set of flashcards. In this pilot study we wanted flashcard production to focus on the interface and not on actually studying, so we provided a list describing the flashcards that we wanted them to make (see appendix). During this task we asked the users to use the Anoto digital pen to copy the information from the list onto the flashcards as quickly as they felt comfortable. This task also requires the user to select a destination directory for the flashcards, as well as to specify how they would like to flip the card. The task is finished when the user has clicked the finish import button.
- Organization: Move/Set familiarity level (medium)
In the second task, we want the users to use the browse part of the program to do organizational work on the flashcards that they have just finished making. Most of the cards will already be in the correct folder if they selected the correct folder in the import menu. However, we also included two flashcards that do not belong in the group: the users were asked to make two math flashcards, which were miscategorized into the GRE group. The user must select these flashcards and move them to the math folder. They do this by selecting the two math cards, changing the selection in the menu at the bottom from GRE to Math, and then clicking the transfer button to change the directory. From this screen the user is also asked to set their comfort level for each flashcard.
- Study Flashcards (easy)
In the third task, the users are asked to study the GRE cards through the software interface by selecting the folder of interest, choosing the words of the desired difficulty level, and traversing the list with a few basic buttons such as previous, next, and flip (to see the other side). In addition, users can modify the difficulty levels of cards as they master new information; they can also choose the display mode so that the reverse side or both sides can be studied first.
Procedure
As a first part of the experiment, we asked each subject to read and sign our informed consent document which stated the basic tasks we would ask them to do and also informed them of their rights as a subject in our experiment.
Each subject was given a short tutorial on how to use the digital pen and the features of the software. We then handed the subject a paper with basic goals for the experiment, and the vocabulary list that we chose for them to use to make the flashcards. We offered them the opportunity to look through the list of objectives for the experiment and to ask questions of us before starting.
- Flashcard production
During this phase of the experiment we kept a stopwatch running and timed how long it took the subject to produce each of the assigned flashcards. The first two were on regular paper to be used as controls to measure the time it would take with regular pen and paper to produce one flashcard. Afterwards, we asked them to use the Anoto pen on our digital paper to produce the remaining eight flashcards. Our instructions for our subject were to produce the cards as quickly as they felt comfortable. The goal was to allow us to gain some statistical insight from a limited number of trials. Without providing compensation to the users, we felt it was inappropriate for us to ask them to give us more than 20-30 minutes of their time, and as a result we asked them to produce fewer flashcards than what would have been ideal. At the end of producing each digital flashcard we made sure the subject properly checked the import box before starting a new card. During this flashcard production part of the experiment we also made the subject produce two flashcards which would be put in the wrong directory so we could evaluate how effectively they would correct this error in the next screen.
- BluJay Browse
When the user was operating in the program's browse mode, we asked them to perform several operations which were described on the instruction sheet. These included setting the familiarity level of the different flashcards as well as moving two misclassified cards to the correct directory. As a first step we asked the users to move the two math flashcards out of the GRE directory and into the math directory. This was done by marking the checkbox at the corner of each math flashcard and then changing the directory below to math. Then they pressed the move button to move the files to the math directory. We also asked them to set familiarity levels for the cards from the browse screen. (Please refer to the appendix for detailed instructions.)
- BluJay study
When the user had finished all the browse tasks we asked them to use the study mode. In this mode we gave very sparse instructions and allowed the subject to choose how best to study the material. We felt that studying is an individual process and we would gain very little by asking a subject to study by a procedure list.
As the user performed all of these tasks, we used a stopwatch to take timestamped notes about the rate the user went through the import procedure and the browse feature. We also looked for places where the subject behaved unexpectedly, or the program behaved in an undesirable way. We encouraged the subjects to speak out loud about what they were thinking and what they were trying to do as they did different actions within the software.
At the end of the experiment, we asked the user to rate the application in terms of visual presentation and ease of use, and asked them for additional thoughts and feedback on the interface. If they had performed any unusual operations during the experiment, we asked them to clarify their thoughts and actions to help us figure out what went wrong.
Test Measures (5 points)
Qualitative analysis:
This test was designed to provide qualitative feedback about the system in addition to statistics. Therefore, we asked the user to speak aloud during the various tasks, and we wrote down useful comments and ideas for how to change the software. We were particularly interested in whether or not they would understand the button names and follow the anticipated flow of the program. We also took note of how the program responded to large deviations from our anticipated flow through the software. We were alert for signs that the program was behaving unexpectedly as a result of a user mistake, or bringing up error messages. If the users found an error, we would then try to determine a way of guiding the user toward using the software correctly instead of bringing up an error dialog.
Quantitative analysis:
One of the key constraints for our system, which we feel is critical to its viability as a study tool, is that producing flashcards with the digital pen-paper interface should take no longer than it would on regular paper. While it would be possible to make this system without the pen at all, we feel that inputting flashcard content into a computer directly by a conventional method would significantly slow the flashcard production process. In our test we would like to measure how quickly our users can make flashcards on our system and compare this with the time it takes them to make flashcards on regular notecards. To do this we propose a controlled experiment that examines how quickly users can make flashcards with regular paper and compares this to how quickly they can make flashcards on our system, both initially and after a small amount of practice.
Our procedure is as follows:
- Present the subject with a list of flashcards that they are to make.
- Allow the subject time to look over the full list.
- Time the subject as he/she makes the first two flashcards on regular paper.
- Time how long it takes for the subject to produce each flashcard thereafter with the Anoto system.
- Examine whether the time to make a digital flashcard approaches the time to make a regular flashcard.
This experiment will help us understand several important aspects of how users improve with our system. We will learn the initial learning curve users show for our system, and we will also be able to predict how effective they get at flashcard production with practice.
To control this study we will ask each of our subjects to make several cards that have been chosen to be fairly equal in difficulty, and that also provide important raw data for the following software tests.
Results (10 points)
Import
Key Issues
- All users were able to successfully import the 8 cards
- Two users began writing on digital card before starting the import panel at the beginning
- Two users wrote on the back side of the card first, then quickly discovered their mistake
- Most users struggled with the DONE button at some point during the importing task (due to the shift of the pen)
Measurements & Analysis
- Time it took for the flashcard users to produce a regular paper flashcard:
mean: 18.25 seconds
standard deviation: 3.84 seconds
- Time it took to complete a digital flashcard:
mean: 31.34 seconds
standard deviation: 12 seconds
- Time spent delayed on trying to tap the done button:
mean: 5.34 seconds
standard deviation: 8.5 seconds
- Time spent making digital flashcards, omitting time with done button delay:
mean: 26 seconds
standard deviation: 7.8 seconds
- Student's t-test comparing the time to make a regular card and a digital card
In the box-and-whisker plots, box A represents the dependent variable of this experiment, i.e. the time to make the flashcard. Box B represents the time taken to make a regular flashcard.
The left-hand graph shows the relationship between the average total time to make a digital flashcard and the time to make a regular one. The middle graph shows the difference we see when we subtract the time spent fumbling for the done button. The right-hand graph shows the results for the last two flashcards made, presumably when the subject is most practiced.
Null Hypothesis: There is no difference between making a digital card and a paper card
t = 3.01, corresponding to a probability of 0.46%, meaning that we reject the null hypothesis that there is no statistical difference between these two activities. Producing flashcards with our system is slower than making regular paper flashcards.
Null Hypothesis: If we exclude problems with the done button, then making digital flashcards takes the same time as making regular cards
t = 2.69, corresponding to a probability of 1.1%, meaning that we reject the null hypothesis. Even without "done" button delays, making a digital flashcard is slower.
Null Hypothesis: After the user has practiced with the digital pen, there will be no time difference between the digital and regular flashcards. (last two digital flashcards took the same time as the regular paper cards)
t = 2.17, corresponding to a probability of 4.8%. This is on the cusp of being unable to reject the null hypothesis. 5% is normally the cutoff point; however, we had few users and did not have enough time to complete the practice. From the trends in the data, we believe that a user who has practiced using the digital pen may be able to produce digital flashcards at a rate which is essentially equal to the rate at which they can produce regular flashcards.
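As a rough cross-check, the first two t values can be approximately recomputed from the reported means and standard deviations with a pooled two-sample Student's t statistic. The sample sizes are an assumption, since the results do not state them: 8 paper cards (4 subjects with 2 cards each) and 32 digital cards (4 subjects with 8 cards each), inferred from the procedure. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Assumed sample sizes: 32 digital cards and 8 paper cards (not stated in the report).
N_DIGITAL, N_PAPER = 32, 8

# All digital cards vs. paper cards.
t_all, df = pooled_t(31.34, 12.0, N_DIGITAL, 18.25, 3.84, N_PAPER)

# Digital cards with the done-button delay removed vs. paper cards.
t_no_done, _ = pooled_t(26.0, 7.8, N_DIGITAL, 18.25, 3.84, N_PAPER)

print(f"t (all digital cards)   = {t_all:.2f}, df = {df}")
print(f"t (done delay excluded) = {t_no_done:.2f}, df = {df}")
```

Under these assumptions the statistics come out near 3.02 and 2.71, close to the reported 3.01 and 2.69; the small gap is presumably due to rounding in the published summary figures. The third test cannot be rechecked this way because the mean and standard deviation for the last two cards are not reported.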
In this graph, the data from our limited number of subjects and with limited numbers of trials is insufficient to conclude that the subjects are definitely improving with practice. However, the trend of our data is that the mean time to produce a flashcard does diminish as they gain experience with the Anoto technology. The trend suggests that they will become better but will eventually plateau as the flashcard production rate becomes bottlenecked by the speed they can write.
Organization
Key Issues
- Two users clicked the MOVE button before selecting a destination folder: cards were moved to "unknown destination" (deleted).
- Two users expected the settings to change immediately after they selected the desired rating without clicking SET COMFORT LEVEL button.
- Many users had to scroll up/down multiple times due to the large size of our "thumbnails".
Measurements & Analysis
- Number of clicks to complete MOVING task specified:
Minimal/correct number: 5
Avg number of clicks: 4.5 / Stdev: 1.66
Avg number of clicks different from correct: 1.5 / Stdev: 0.866
- Number of clicks to complete SETTING COMFORT LEVEL task specified:
Minimal/correct number: 12
Avg number of clicks: 16 / Stdev: 4.74
Avg number of clicks different from correct: 4 / Stdev: 4.74
- Time (sec) spent on the browse panel:
Avg: 135.25 sec
Stdev: 32.9 sec
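The "clicks different from correct" measure can be read as the absolute difference between each user's click count and the minimal count. The sketch below shows that calculation. The per-user counts are hypothetical, not the recorded data: the report does not list raw counts, and these four values were chosen only because they happen to give the same summary figures as the MOVE task when a population standard deviation is used. The function name `summarize` is also hypothetical.

```python
import statistics

MINIMAL_CLICKS = 5  # minimal/correct count reported for the MOVE task

# Hypothetical per-user click counts, NOT the recorded data.
clicks = [3, 3, 5, 7]


def summarize(counts, minimal):
    """Mean and population stdev of the counts and of their distance from minimal."""
    diffs = [abs(c - minimal) for c in counts]
    return {
        "mean": statistics.mean(counts),
        "stdev": statistics.pstdev(counts),
        "mean_diff": statistics.mean(diffs),
        "stdev_diff": statistics.pstdev(diffs),
    }


summary = summarize(clicks, MINIMAL_CLICKS)
print(summary)
```

An average below the minimal count is possible because a user who skips a required step, such as choosing a destination folder, finishes with fewer clicks than the correct sequence needs.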
Study
Since the study panel was rather straightforward, all the users were able to traverse the cards they made without any obvious difficulties. Thus no measurements other than key observations were taken from this task. We omitted recording any values for this task because studying is an inherently individual activity and we could not think of a reasonable way to pilot this experiment. To test the efficacy of this study style, we would need large populations of students preparing for a similar exam, for example one population that used the software to study for the SAT and one that did not.
Overall
- Rating on visual presentation of the application (1-5):
mean: 3.625
std dev: 0.41
- Rating on ease of use (1-5):
mean: 4
std dev: 0
Discussion (15 points)
Discussion of Statistics
The results of the statistical analysis show that making digital flashcards is not inherently as quick and easy as making them on normal paper, so we have not yet met that goal. However, while there is a statistically significant difference between making the cards on digital paper and on normal paper, we cannot conclude that producing digital flashcards is functionally slower than producing normal flashcards. In this task we asked the users to produce flashcards from a list that we provided, and even after a limited amount of practice, the subjects were 20% slower producing the digital flashcards than the regular flashcards. However, if they were making flashcards for a class or an exam, writing the flashcard might be only a very small percentage of the time spent on each item that they intended to study. For example, if the user's goal was to learn 20 chemistry structures, they might spend four minutes researching each structure and deciding what to write on the card. Once the card is finished, they may plan to study it 10-20 times for a total of 10 minutes. In such a case, spending an additional 4 seconds on flashcard production would be inconsequential in the overall study process, and this slight delay would be unlikely to cause the user to shun the digital flashcards.
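The chemistry example can be turned into a quick back-of-the-envelope calculation. The figures are the ones given in the paragraph above plus the measured mean paper-card time; the per-card framing and the function name `overhead_share` are assumptions made for illustration.

```python
RESEARCH_S = 4 * 60       # four minutes researching each structure
STUDY_S = 10 * 60         # ten minutes of total study per card
PAPER_CARD_S = 18.25      # measured mean time to write a paper card
EXTRA_DIGITAL_S = 4.0     # additional writing time for a digital card


def overhead_share(extra, research, study, writing):
    """Extra digital writing time as a fraction of the paper-based time per card."""
    return extra / (research + study + writing)


share = overhead_share(EXTRA_DIGITAL_S, RESEARCH_S, STUDY_S, PAPER_CARD_S)
print(f"Extra time per card: {share:.2%} of the total")
```

With these numbers the extra writing time is under half a percent of the time spent on each card.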
Furthermore, we did not ask the subject to produce so many flashcards that they became an experienced software user. Thus, there was a degree of novelty to watch the strokes appear on the screen, which caused them to make the flashcards more slowly. Using the plot of time to produce the card vs. practice, we attempted to show that the time to make cards would decline according to a power law, and would eventually asymptote at a level similar to regular flashcards.
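One common way to fit a power law of practice is a least-squares line through the log-transformed data. The sketch below is only an illustration of that technique, not the analysis behind the plot: the timings are synthetic values generated from a known curve so that the fit can be checked, and `fit_power_law` is a hypothetical helper.

```python
import math


def fit_power_law(trials, times):
    """Fit times ~ a * trials**(-b) by least squares on log-log values."""
    xs = [math.log(n) for n in trials]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic timings from 40 * n**-0.3, used purely to check the fit.
trials = list(range(1, 9))
times = [40.0 * n ** -0.3 for n in trials]

a, b = fit_power_law(trials, times)
print(f"a = {a:.2f} s, b = {b:.2f}")
```

A pure power law keeps falling toward zero rather than levelling off, so modelling the plateau near the paper-card time would need an added constant floor term, which this sketch leaves out.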
Improvements to Testing
One issue that occurred with the pilot study is that the participants had trouble following directions. Once the program was presented to them, they kind of “took off” and did their own thing with the interface. On the one hand, this shows that they are getting into the program and that they are excited about it, however, on the other hand, it means that they are likely to do the wrong thing. For example, a few of the participants wrote on the wrong side of the flash cards (or were about to). Because of the limitations of computer interpretation of flash card data, “sidedness” has become relevant. However, some of the participants didn’t realize this right away.
Part of the problem may be due to lack of physical directions to follow. In this experiment, we gave directions mainly through vocal commands and gesturing to help guide the testers. However, this approach is more casual than having a set of paper directions. It’s possible that given a clear set of numbered directions on paper, the tester may be inclined to follow it more exactly. This way we can prevent errors from occurring such as the one when a user clicked the “done import” button prematurely, because he thought he had to change which folder to import to in the middle of a session, when in fact that task was not specified.
Another improvement we could make to the experiment is to take more data (and to specify more accurately which data to take). Instead of just taking down the times for “important events,” it would have been useful to record exactly when events started and stopped. Some of the data taken is subject to a bit of guesswork because the clock was always running. When a user gets stuck because the Anoto pen is not functioning, or because they misunderstood the directions, the time accrued does not reflect the interface design.
Additionally, instead of taking only temporal data, it would also be useful to take event-driven data, such as the number of clicks necessary to perform a task. One of the trends we noticed was that a user who got the task done more quickly also did not click around as much. Thus we think it would be interesting to see the correlation between the number of events needed, the KLM model, and the actual time taken to perform a task. The additional information may also help guide us in designing specific interface elements.
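Recording explicit start and stop times together with click counts could be done with a small logger like the one sketched here. It is a minimal illustration under assumed names (`TaskLog`, `start`, `click`, `stop`); nothing like it exists in the current prototype as far as this report describes.

```python
import time


class TaskLog:
    """Records start/stop times and click counts per task (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the log can be tested with a fake clock
        self._open = {}       # tasks that have started but not yet stopped
        self.records = []     # finished tasks, in completion order

    def start(self, task):
        self._open[task] = {"start": self._clock(), "clicks": 0}

    def click(self, task):
        self._open[task]["clicks"] += 1

    def stop(self, task):
        entry = self._open.pop(task)
        record = {
            "task": task,
            "seconds": self._clock() - entry["start"],
            "clicks": entry["clicks"],
        }
        self.records.append(record)
        return record
```

Pausing the log while the pen is malfunctioning would be a natural extension, so that hardware trouble is not counted against the interface.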
For a serious study, we would probably consider adding a more in-depth questionnaire at the end. For this study we only included a two-question “on a scale of one to five” verbal questionnaire. This is useful because it qualifies the quantitative data from the tester’s viewpoint. In the moment, it’s not always easy to tell what they think of the interface, however adding this section will help with that.
Improvements to the Design
Import
The major hotspot in the importing section (task 1) was the ‘done’ field in the paper interface. This button tells the program that the user is done making the current flash card and is ready to move on to the next sheet of paper (physically and in the program). However, flash cards have inherent size limitations, so we could only afford to make the ‘done’ button so big. The button proved difficult to hit in some instances because the bulky pen’s built-in camera is slightly offset from the center of the pen tip. Add this to the already quirky (at best) behavior of a new technology, and the conditions are ripe for frustration. The testers were able to use the done button in 1 click for most of the flash cards; however, for some of the cards, the done button took dozens of clicks before the program would register it. One possible fix to this problem is to sacrifice a bit of space and make the active region for the ‘done’ button bigger. This will hopefully give the pen more dot patterns to potentially recognize. Another possible fix is to have a ‘done’ button on the electronic interface in addition to the paper one. A possible downside is that it may be confusing for beginners, but once they become familiar with the system, this addition shouldn’t be confusing.
Number of times needed to click the done button varied
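Enlarging the active region amounts to a more forgiving hit test. The sketch below assumes the pen reports tap positions as (x, y) card coordinates and that the done button is a rectangle; the `Region` class, the `is_done_tap` function, and the units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in card coordinates (units are arbitrary here)."""
    left: float
    top: float
    right: float
    bottom: float

    def expanded(self, margin: float) -> "Region":
        """Return a copy grown by `margin` on every side."""
        return Region(self.left - margin, self.top - margin,
                      self.right + margin, self.bottom + margin)

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def is_done_tap(x: float, y: float, done_region: Region, margin: float = 0.0) -> bool:
    """Treat a tap as 'done' if it lands inside the button or within the margin."""
    return done_region.expanded(margin).contains(x, y)
```

A margin like this would presumably only help if the dot pattern is also printed in the enlarged area, which is the space trade-off described above.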
Browse
One of the main problems encountered in this section of the program is when moving flash cards to a different folder. Many of the participants clicked the ‘move’ button without first selecting a destination. This is bad because not only did this action lead to a wrong result, it led to an undefined result. The team wasn’t sure what exactly happened to flash cards that got moved to a ‘blank’ folder option. One of the reasons this may be happening is that when the program is run in Windows (as opposed to Mac), the drop down selection menu always shows up as blank when it isn’t currently being highlighted. The user never has a chance to realize that they have to select a folder option unless they are already thinking about it. Aside from that strange bug, possible fixes to this problem are to either not let the user click the ‘move’ button until a valid folder has been selected or to have a default folder pre-selected.
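The first fix, refusing the move until a valid folder is chosen, comes down to a guard in front of the move operation. The sketch below assumes folders are held as a mapping from folder name to a list of cards; `can_move` and `move_cards` are hypothetical names, and an interface would use `can_move` to decide whether the ‘move’ button is enabled.

```python
def can_move(folders, destination, selected):
    """True only when a real destination folder and at least one card are chosen."""
    return destination in folders and len(selected) > 0


def move_cards(folders, source, destination, selected):
    """Move the selected cards, refusing blank or unknown destinations."""
    if not can_move(folders, destination, selected):
        raise ValueError("select at least one card and a valid destination folder")
    for card in selected:
        folders[source].remove(card)
        folders[destination].append(card)
```

With this guard a blank selection can no longer send cards to an undefined location; the alternative fix, a pre-selected default folder, would instead make sure `destination` is always valid.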
Another problem people had with the drop-down menu system was when changing the comfort level. Some users didn’t realize that in addition to setting the drop down menu, they also had to click the “set comfort level” button. This could be due to the increase in recent web interfaces that don’t require button clicks. However, we modeled our section off of an e-mail interface, most of which require 2 actions. One of the reasons for this is to prevent accidentally setting the comfort level by selecting the wrong item in the drop-down menu. We noticed this happened at least once with our testers. Strangely, it was the same person that didn’t realize he/she had to click the button for the action to take effect. The user simply selected twice from the menu without realizing the implications had the action actually been consequential. A possible modification is to use radio buttons in the browse section, because we use radio buttons to set the comfort level in the study mode.
Another observation was that users had a tendency to drag the scroll bar frequently to find certain cards. This could be changed by enlarging the default window size, or else by decreasing the size of the thumbnails so that more cards could be displayed at the same time.
Study
In the options screen, once again, the testers didn’t seem to think it was necessary to select from all 3 option areas. For example, one user tried to double-click on the folder selection menu expecting a reaction. However, even if double-clicking was implemented, further selection would still be required! One of the users selected a folder, selected which cards to study from and proceeded to skip over the study method and jump straight to the “Go” button. One possible fix for this type of behavior is to set defaults so the user can jump straight into study with only 1 or 2 clicks. However, given that the testers tended to do things quickly, setting this default may rush them into a study mode that was undesired. If the program doesn’t let the user progress, as it does now, they have to stay on the same screen to select exactly what they want.
Another area of confusion seemed to be the difference between the “flip” button and the “study from” radio button set. One flips the current card temporarily; the other flips the entire card set. Once we explained the difference, the users had no trouble with the interface; however, upon first inspection they did not realize the difference.
Paper Interface
One concern over the paper interface is that the test subjects seemed to have trouble distinguishing the front side of the card from the back side. There are a number of ways that this could be corrected:
- Have horizontal lines on the back side so that the card appears like a traditional index card
- Put a "front" and "back" watermark in the active regions
- Have different color regions for front and back
Another issue regarding the paper interface is that it tends to be difficult to set the done button, taking an average of 5 seconds to mark the region.
Overall
- We need to make it clearer that users must click through every option, instead of just going straight to the ‘done’ button
- The program could have better aesthetic qualities to it. Right now we are using Java defaults.
- The elements within the program aren't set to resize nicely: the proportions don't scale
- Add keyboard shortcuts for some of the buttons
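As an illustration of the keyboard shortcut item, here is a minimal sketch that assumes the program is built on Swing (the report only says it uses Java defaults). It uses Swing's standard key bindings (`InputMap` and `ActionMap`); the helper class `Shortcuts`, the action names, and the key choices are hypothetical.

```java
import java.awt.event.ActionEvent;
import java.awt.event.KeyEvent;
import javax.swing.AbstractAction;
import javax.swing.ActionMap;
import javax.swing.InputMap;
import javax.swing.KeyStroke;

// Hypothetical helper: binds a key to a named action using Swing key bindings.
final class Shortcuts {
    static void bind(InputMap inputs, ActionMap actions,
                     KeyStroke key, String name, Runnable handler) {
        // The input map translates the key stroke into an action name...
        inputs.put(key, name);
        // ...and the action map translates that name into the code to run.
        actions.put(name, new AbstractAction(name) {
            @Override
            public void actionPerformed(ActionEvent e) {
                handler.run();
            }
        });
    }

    // Example bindings for the study tab; names and keys are assumptions.
    static void bindStudyKeys(InputMap inputs, ActionMap actions,
                              Runnable flip, Runnable next) {
        bind(inputs, actions, KeyStroke.getKeyStroke(KeyEvent.VK_F, 0), "flip", flip);
        bind(inputs, actions, KeyStroke.getKeyStroke(KeyEvent.VK_RIGHT, 0), "next", next);
    }
}
```

In a Swing window the two maps would typically come from the root pane, via `getInputMap(JComponent.WHEN_IN_FOCUSED_WINDOW)` and `getActionMap()`, so the shortcuts work regardless of which control has focus.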
Miscellany
- Creating a ‘new subject’ is awkward
- Allow import options to be changed while ‘hot’, e.g. changing the folder on the spot
Workload breakdown (5 points)
Yimin: 26%
Yimin contributed to the project by implementing the pen-paper interface. She resolved many of the Bluetooth problems that kept the pen from communicating with the computer. She also developed the software that converts stroke data into JPEG images and stores them on the hard disk. Her software also constructs the file structure and the text file used to organize the data tags associated with each card. During the experiment, she was in charge of setting up the application for each task and giving the users a brief tutorial of the software.
Jonathan: 26%
Jonathan worked primarily on the Browse tab of the program. This involved coding a scrollable window that displays many flashcards at once. He also built the menu structure on this tab and implemented all the changes to the data tags that can be made from this view. During the experiment, Jonathan took notes on user actions.
David: 26%
David contributed to the paper interface by designing the cards and producing them in bulk for the experiment. He also assisted in implementing the stroke capture and display portion of the software. In addition, he took a lead role in the statistical analysis of the experimental data and planned the tests given to the subjects. This included choosing the flashcard content that we wanted the subjects to write on the cards and describing the goals of the different exercises to the subjects. During the experiment he operated the stopwatch to record the time the subjects took to finish each card.
Bowen: 26%
Bowen set up some of the internal data structures and utilities to be used. These included a 'current list' object, a 'card' object, and a 'reading from file' object. Additionally, Bowen worked on the study tab, implementing the new features such as list traversal, and the option to study from both sides. During the experiment, Bowen took notes on user actions.
Appendices (5 points)
- Materials (all things you read, such as the demo script and instructions, or handed to the participant, such as task instructions)
- Raw data (i.e., entire merged critical incident logs)
Informed Consent Form
Blujay Digital Flashcard Pilot Usability Study
Please read this document and sign it if you agree to participate in the experiment.
Purpose of Study: We are examining a new type of flashcard study system which uses both paper and computer elements to improve the effectiveness of flashcard study. This research is being conducted as part of a UC Berkeley course, CS 160: Human-Computer Interfaces.
What you will be asked to do in this study: In this study you will be asked to use a digital pen to prepare some flashcards, and then use a computer software program to manipulate and study the flashcards. This will involve using a mouse and looking at a computer monitor.
Time Required: 20 minutes
Risks and Benefits: This study poses no risks to the subject. There are no direct benefits for those who choose to participate in this study. They will be helping to develop a product which could potentially improve studying efficiency.
Compensation: There will be no compensation for participating in this research.
Confidentiality: Your identity will be kept confidential to the extent provided by law. Your comments and data from this experiment will only be referenced by a subject code letter. Your name will not be used in any report.
Voluntary participation: Your participation in this study is completely voluntary. There is no penalty for not participating.
Right to withdraw from the study: You have the right to withdraw from the study at any time, and for any reason, without consequence.
Agreement:
I have read the procedure described above. I understand my rights as a subject and voluntarily agree to participate in the study.
Participant: __________________________ Date:__________
Investigator: _________________________ Date:__________
Instructions to subject: Thank you for agreeing to participate in this study. There are three tasks for you to accomplish. As you perform the tasks, please say out loud what you are thinking and trying to do. This system is still under development and needs your help so that its various problems can be addressed. If any aspect feels unclear or frustrating, please tell us.
1. We would like you to produce the 10 flashcards shown below as quickly as you feel comfortable with. The first two will be on regular paper and should be treated as you would a conventional flashcard. Cards 3-10 should be made on the Blujay flashcards provided.
2. When using the software please select the GRE directory to import the cards to, and select how you would like to flip the cards. When you click the start import button the program will record your pen strokes. When you are finished, please tap the DONE button on the rear face of the card. The program will confirm by telling you the card was imported and clear the field for the second card.
3. After clicking finish import, please go to the browse screen and use this window to move the two math cards to the math directory. Then set the comfort level for the GRE flashcards.
4. Once you are finished with the browse tab, please go to study and study the GRE flashcards you have made.
5. Please explore the interface as you perform these tasks and inform us of the parts which are effective and which parts need further development.
Flashcard list:
1 Card
- Front: decorum
- Back: sense of propriety, properness
2 Card
- Front: penchant
- Back: strong inclination, a liking.
3 Card
- Front: pragmatic
- Back: practical, favoring utility
- Sort into group: GRE
- Set comfort level: unfamiliar
4 Card
- Front: paucity
- Back: scarcity, a lacking of
- Sort into group: GRE
- Set comfort level: Familiar
5 Card
- Front: malinger
- Back: to fake illness or injury
- Sort into group: GRE
- Set comfort level: unfamiliar
6 Card
- Front: d/dx cos(5x +4)
- Back: = -5*sin(5x+4)
- Sort into group: Math
7 Card
- Front: What is the shape of a quadratic function?
- Back: Parabola
- Sort into group: Math
8 Card
- Front: equivocate
- Back: lie, mislead, conceal the truth.
- Sort into group: GRE
- Set comfort level: somewhat unfamiliar
9 Card
- Front: striated
- Back: striped, grooved, or banded
- Sort into group: GRE
- Set comfort level: unfamiliar
10 Card
- Front: precarious
- Back: uncertain, risky, dangerous.
- Sort into group: GRE
- Set comfort level: somewhat unfamiliar
Dave's note on browsing time
In addition to timing the user during flashcard production we also measured the time taken to perform the browse functions.
Subject A navigated the browse features in 1 minute 33 seconds.
Subject B navigated the browse features in 1 minute 55 seconds.
Subject C navigated the browse features in 2 minutes 56 seconds; however, she required assistance to complete the task.
Subject D navigated the browse features in 2 minutes 37 seconds.
The major reason for the difference in times is that some people found the interface more intuitive than others. Practice would likely decrease the time needed to use this feature fully.
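For reference, the four browse times above can be summarized with a short calculation. The sketch below is only an illustration in Java; the class name `BrowseTimes` and its methods are hypothetical. With these values the mean is 135.25 seconds (about 2 minutes 15 seconds), with a range of 93 to 176 seconds. With only four subjects, one of whom needed assistance, this is descriptive rather than statistically meaningful.

```java
// Browse times for subjects A, B, C, and D, converted to seconds.
final class BrowseTimes {
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    // Arithmetic mean of the given values.
    static double mean(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    // The observed times, in subject order A, B, C, D.
    static int[] observed() {
        return new int[] {
            toSeconds(1, 33), toSeconds(1, 55), toSeconds(2, 56), toSeconds(2, 37)
        };
    }
}
```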
Jonathan Notes (raw)
RC:
- Wrote on the back side of the flash card for the first card (not really clear where front or back was)
- Had some trouble marking the done button for most of the cards
- Not clear if getting better at hitting the done button after each try...
- Learned to work around bug with repaint by adjusting scroll pane size
- Had some problem trying to get into the study traversal window because didn't click on study options
- Thought interface was pretty simple
- Felt that progression through software seemed to make sense
- Liked graphics for flip in import
CHC:
- Didn't seem to have any difficulty with marking the done buttons during the import phase
- Ran into an error during "move card" task (card got deleted b/c didn't select a folder?)
- Error came up during the study traversal?
- Thinks that familiarity level feature could be changed
CF:
- thought that pen writing was cool
- not used to vibration of pen?
- had trouble with done button on flash card
- didn't have any problems with move button
- thinks that software is useful, but doesn't know why anyone would use it if traditional flash cards can be used
- thinks that tracking familiarity level may be useful
JL:
- made mistake of writing on the back on first card
- did not know that "done" button needed to be pressed after writing card rather than clicking "done import"
- had trouble with marking the "done" button on the card
- accidentally pressed "done import" button
- seemed to get better at pressing the done button
- couldn't click on "enter" for "done import"
- problem with blank cards showing up on browse
- didn't run into errors in selecting comfort level or folder to move to
- tried to double-click on "GRE" in study options tab
- couldn't study only "familiar" cards in study tab
- thinks that there should be some type of "stats" panel?
- uses flash cards, but not very often
- would be more inclined to use flash cards with a program like this
Bowen Notes (raw)
User 1 – male
Task 1
- Import screen 1: not sure whether to write first (interact w/ paper) or click first (interact w/ computer)
- Wrote on the back of the card (not paying attention?)
- Pen buzzing confusing/surprising
- 2 ways to check for feedback: the paper and the computer - seems a bit too much to keep track of
- Done button hard to acquire
- Writes on very center of card; doesn’t leave room for the bottom
- Held pen “backwards” – Anoto pen is asymmetric because of camera
- Wrote at an angle; not very vertical
- Confused about which counts as ‘front’ which counts as ‘back’ for the “parabola” card
Task 2
- Seemed very awkward when setting the comfort level, when reading from sheet
- Constant looking back and forth: would this be improved if he set it based on his own skills?
- Commented that he would rather not have to click when setting the comfort level: instead just use a drop down or other 1-action type method
Task 3
- Lots of clicking between ‘flip’ and ‘next’ – any way to streamline this?
- He didn’t realize that ‘flip’ was a non-permanent button vs. ‘study from’ radio buttons until after he tried
Misc
- Presentation: 4/5; Ease of use: 4/5
- Didn’t understand the importance of flipping orientation until we explained it
User 2 – male
Task 1
- Confused about how to use the paper interface and flipping
- Asked “does the computer do it for you?” – not sure where the handoff is between physical flipping and translating into the computer
- Looks on provided sheet first, then checks the paper, then checks the computer monitor – for visual feedback when making flash card
- Time spent looking: source of data > computer screen > paper
- Uses paper as minor glances (presumably because he is most familiar with writing vs. what the computer is doing or what the source data is)
- Didn’t have problems with the ‘done’ button until the end
- There is a slight downward shift between physical card and the GUI
- Writes quickly
Task 2
- When deleting blank cards – spent a bit of time looking for blank cards – hard to find which cards are actually completely blank vs. what has data on it
- Tried to ‘move’ the card without selecting a destination – card went nowhere
- He maximized the window, but the sections didn’t scale proportionally (some more than others) – is it obvious (or not) that our program has soft borders?
Task 3
- When the screen is maximized, the objects on the sides seem small – not obvious because they get pushed to the side
Misc
- Presentation: 4/5; Ease of use: 4/5
- Already somewhat familiar with our system: lo fi tester
- We need to have more consistent wording (semi vs. somewhat)
User 3 – female
Task 1
- Accidentally clicked ‘make new folder’
- Writes much more slowly, and has smaller words
- Doesn’t check computer as often
- Surprised by pen vibration
- Holds pen mostly vertical
- Has fewer problems with the done button
- “is this for a psychology class?”
Task 2
- Didn’t click on a button when setting comfort level
- Didn’t realize the feedback (or lack thereof) when she didn’t set the comfort level correctly (ie the displayed comfort level stayed the same)
- Didn’t realize that she should follow the comfort levels on the sheet (thought she should have done how comfortable she was to it)
- Seemed confused on this whole section
Task 3
- Only checked 2/3 of the possible options on the options screen (but all of them are required)
- Would use the ‘study from both sides’ option
Misc
- Presentation: 3/5; Ease of use: 4/5
- Thinks digital version is too pixelated; prefers her own handwriting
- Thinks program is useful, but would prefer paper
- Seems less adept / comfortable at using a computer in general
User 4 – male
Task 1
- When making purely paper flash cards, wrote them :front, front, back, back: instead of :front, back, front, back:
- Lots of switching back and forth between the paper interface and the GUI
- Used the ‘back’ of the paper interface as the front
- Went to GUI to click ‘done’ when instead the button was on the paper interface
- Trouble acquiring the ‘done’ on the paper interface
- Makes flash cards by stacking one on top of the others (instead of taking 1 at a time)
- Clicked ‘done with import’ to switch folders on the math card – didn’t follow directions
- Confusion on intent
- Writes with pen vertical; small letters; medium speed
- Pressed enter key for ‘done import’ but it wasn’t implemented
Task 2
- Used the word ‘select’ in the directions – leading?
- Didn’t rush things; did the job slower, found the correct items and moved them
Task 3
- Double clicked the 1st option on the study options panel : too fast? Didn’t read through everything
Misc
- Presentation: 4-/5; Ease of use: 4/5
- Would like better feedback on the ‘browse’ when selecting comfort levels
- Would like different view options for browse; if he could see more cards at a time
Yimin's note on browse
user move_clicks set_comfortlvl_clicks
RC 7 15 // move the cards, set the comfort level one by one first, then set the last set of cards to familiar as a group
CHC 3 13 // move without selecting destination, set comfort levels by groups, click to maximize window
CF 3 ~24 // very confused. move to nowhere also. a lot of extra clicking for setting the comfort level
JL 5 12 =) // took his time, but did everything correctly and precisely.







