From CS 160 User Interfaces Sp10
- Angela Juang worked on creating vocal tracks, writing the report, and redesigning the interface.
- Owen Lin worked on the Tune Voice view and integrating views together.
- Vinson Chuong worked on the Sing view, real-time pitch analysis, and integrating views.
- Jungmin Yun worked on the options screens.
Problem and Solution Overview
Singers who perform in groups with other singers or instrumentalists often have a hard time practicing effectively on their own, because their music depends on the parts contributed by other group members as well as on real-time feedback. They could use their practice time more efficiently if they could simulate their group members and obtain real-time feedback wherever they are and whoever they are with.
Our mission statement is: “To bridge the gap between individual and group practice by providing a set of useful tools that help singers practice alone.” The application we are designing provides a solution for these singers by letting them play back music tracks that stand in for group members and pairing this playback with visual feedback on their pitch and rhythm. By putting this application on a mobile phone, we will give singers the power to practice individually as effectively as possible wherever they are.
Task 1: Tune Voice (easy)
The first task users may want to perform with our app is to practice singing particular notes by playing them individually on a keyboard. The Tune Voice feature displays a keyboard on the screen along with a display that shows bars indicating the pitch of the notes being played on the keyboard. An arrow on the left side moves up and down with the singer's voice to show if his pitch matches the one that is being played on the keyboard. The octave of the keyboard can also be shifted up and down to allow for a greater range of notes.
To use the tune voice feature, the user can simply select "Tune Voice" from the main menu, which will show the voice tuning view. The user can then tap any key with his finger to play it, and he can also sing into the microphone to compare his pitch with the one being played if desired. The back button on the top of the screen will bring the user back to the main menu.
We consider this to be an easy task because it requires relatively few taps and is readily accessible from the main screen. It also does not require any settings to be configured other than shifting octaves if necessary.
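The relationship between a key, the octave shift, and the pitch being played can be made concrete with a small sketch. This is only an illustration under stated assumptions: it assumes equal temperament with A4 tuned to 440 Hz, and the names `NOTE_OFFSETS` and `key_frequency` are hypothetical, not taken from our prototype.

```python
# Assumed reference tuning: A4 = 440 Hz, MIDI note number 69.
A4_MIDI = 69
A4_HZ = 440.0

# Semitone offset of each key within one octave, counted from C.
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def key_frequency(note, octave):
    """Return the equal-tempered frequency in Hz of a keyboard key.

    Shifting the keyboard by one octave changes `octave` by one,
    which doubles or halves the returned frequency.
    """
    midi = 12 * (octave + 1) + NOTE_OFFSETS[note]
    return A4_HZ * 2 ** ((midi - A4_MIDI) / 12)
```

For example, `key_frequency("A", 4)` gives 440 Hz, and shifting the same key up one octave with `key_frequency("A", 5)` gives 880 Hz.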
Task 2: Configure Tracks (medium)
A more difficult task for our users is to configure the playback of each track for the songs they would like to sing. Each track has options for showing notes, showing lyrics, and changing volume. Users should be able to choose any combination of settings they wish for each track. For example, a typical scenario might be to show the lyrics and notes for the main vocal track, mute the main vocal track, and turn up the volume of the instrumentals and supporting vocals. Users should be able to easily choose a configuration such as this one for the current session.
We consider this a medium task because it requires users to configure settings for each track separately. Users must select several different options for the tracks depending on how they wish the song to be played.
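The per-track options can be pictured as one small record per track. The sketch below is a hypothetical model, not the data structure used in our prototype: the class `TrackSettings`, its field names, the track names, and the 0.0 to 1.0 volume scale are all assumptions made for illustration. It encodes the typical scenario described above.

```python
from dataclasses import dataclass


@dataclass
class TrackSettings:
    """Hypothetical per-track options: notes, lyrics, and volume."""
    name: str
    show_notes: bool = False
    show_lyrics: bool = False
    volume: float = 1.0  # assumed scale: 0.0 is muted, 1.0 is full volume

    def set_volume(self, value):
        # Clamp to the assumed 0.0-1.0 range so a slider cannot overshoot.
        self.volume = min(1.0, max(0.0, value))


def lead_practice_session():
    """Show the main vocal's notes and lyrics but mute its audio,
    and turn the instrumentals and supporting vocals all the way up."""
    lead = TrackSettings("Main vocal", show_notes=True, show_lyrics=True)
    lead.set_volume(0.0)
    backing = TrackSettings("Supporting vocals", volume=0.5)
    backing.set_volume(1.0)
    instruments = TrackSettings("Instrumentals", volume=0.5)
    instruments.set_volume(1.0)
    return [lead, backing, instruments]
```

Keeping the display flags separate from the volume is what allows a track to be muted while its notes and lyrics still scroll by.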
Task 3: Sing and Receive Feedback (hard)
One important task the user must be able to perform is to sing along with a song and get pitch feedback. The Sing view can be accessed through the main menu after choosing a song. From here, users may want to configure tracks before starting the song, as in Task 2. Otherwise, they may start the song directly by tapping the play button. Real-time feedback comes in the form of a line drawn on the screen as the notes scroll by, showing where the singer's current pitch is. If the drawn line matches the bars on the screen, the singer is in tune; if not, he may adjust his voice up or down as necessary. During playback the user may want to pause the music or change some settings; these functions are also included as part of this task. A single tap on the screen causes the playback menus to pop out. The user can pause the music from here and proceed to change playback settings or reconfigure tracks as desired.
We consider this to be a hard task because users must first have settings for the tracks configured, and verify that these have produced the desired results when playback starts. The user must evaluate what they hear and may need to pause playback to adjust either general playback options or individual track options as necessary to obtain the desired sound. In addition, while singing, the user must keep track of both the notes on the screen and his own pitch that is being drawn onto the screen as he sings.
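One way to decide whether the drawn line "matches" a note bar is to measure the distance between the sung pitch and the target pitch in cents (hundredths of a semitone). The following is a minimal sketch of that idea, not the logic in our prototype; the function names and the 25-cent tolerance are assumptions chosen for illustration.

```python
import math


def cents_off(sung_hz, target_hz):
    """Signed distance from target to sung pitch, in cents.

    Positive means the singer is above the target, negative below.
    One semitone is 100 cents; one octave is 1200 cents.
    """
    return 1200 * math.log2(sung_hz / target_hz)


def pitch_feedback(sung_hz, target_hz, tolerance_cents=25.0):
    """Classify the singer as 'sharp', 'flat', or 'in tune'.

    The default tolerance is an assumed value, not a measured one.
    """
    cents = cents_off(sung_hz, target_hz)
    if cents > tolerance_cents:
        return "sharp"
    if cents < -tolerance_cents:
        return "flat"
    return "in tune"
```

Working in cents rather than raw Hz keeps the tolerance musically uniform: a fixed difference in Hz is a much larger error on a low note than on a high one.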
Revised Interface Design
Based on the feedback we received from the users we interviewed with the lo-fi prototype, we revised the interface for a few of our views to make them easier to understand and use. The changes to selected views are described below, along with diagrams. We also designed a view that was not included in our earlier prototype because it was never needed for any of the tasks we asked users to perform.
On startup, our application displays a simple main menu that gives the user the option to either tune their voice using the keyboard function by selecting "Tune Voice," or to practice singing a song by selecting "Practice Singing." We have separated these two functions at the main menu because singers will often want to practice matching their pitch to a particular note without having to sing a song, and vice versa. This allows the user to choose right away whether he will want to practice a song or just check a note or two. A help button at the bottom of the screen gives the users a quick reminder about how to use the application.
Tune Voice View
The user navigates here from the main screen with the "Tune Voice" button. In this view, the user can press any of the keys and the corresponding note will play, as if on a real piano. The user can then tune his or her voice by ear by matching the pitch of the piano key. We have further plans for this view when time allows, including a way to shift the piano up and down (thus adjusting the octave in which the user hears the notes). We also want to use the empty space above to display the note in its bar form, with the real-time pitch indicator showing the user whether they are sharp, flat, or in tune.
Once a user chooses a song to sing, the view switches to the Sing view. Playback starts out stopped so the user can configure options if necessary before playing the song. This view features a progress bar at the bottom that shows the singer how far into the song he has gotten. The title bar at the top of the screen shows the name of the song, so it is easy to pause playback and come back to the application later without forgetting which song is currently loaded. A compact menu on the right side has buttons to play or stop the song and a slider to adjust overall playback volume. The play button resumes playback, and the stop button stops playback and returns to the beginning of the song. A "Settings" button on the title bar takes the user to a more detailed view of playback options; "Exit" takes the user back to the main screen of the application. To keep the screen from becoming cluttered, the menu in this view provides only the options that the user would want to adjust quickly during playback. This lets the singer change basic options on the fly without pausing, while more detailed options remain available on a separate screen.
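The play and stop behavior and the progress bar described here amount to a small piece of state. The sketch below models it under stated assumptions: the class `Playback`, its method names, and the `tick` update driven by elapsed seconds are hypothetical and do not reflect how our prototype is implemented.

```python
class Playback:
    """Hypothetical model of the Sing view's play/stop controls."""

    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.position_s = 0.0
        self.playing = False  # the Sing view starts with playback stopped

    def play(self):
        """Start or resume playback from the current position."""
        self.playing = True

    def stop(self):
        """Stop playback and return to the beginning of the song."""
        self.playing = False
        self.position_s = 0.0

    def tick(self, elapsed_s):
        """Advance the position while playing; halt at the end of the song."""
        if not self.playing:
            return
        self.position_s = min(self.duration_s, self.position_s + elapsed_s)
        if self.position_s >= self.duration_s:
            self.playing = False

    def progress(self):
        """Fraction of the song completed, for drawing the progress bar."""
        return self.position_s / self.duration_s
```

In this model the progress bar is just a view of `progress()`, so stopping the song resets the bar without any extra bookkeeping.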
Playback Options View
The playback options view contains options that the user would not normally want to change on the fly while singing, such as the speed of the song and the individual settings for each track. The speed can be adjusted using a simple slider. The tracks for the currently loaded song can be selected from the menu below if the singer wishes to change any of the individual options. This will take the user to an options screen that is for that particular track, as shown below.
Track Options View
After tapping a track in the table of tracks on the previous screen, the user is taken to a view with dedicated options for that track. On this screen, users can adjust the volume of the track and choose whether its notes and lyrics are displayed in the scrolling "Sing" view. In future revisions of our application, we'd like to be able to display multiple vocal note tracks in the scrolling "Sing" view, since some of our interviewees would find that helpful for harmonizing in a cappella groups and choirs.
We chose not to implement the Add Track and Add Song features in our interactive prototype because these are relatively infrequently used features that are not necessary to complete the major tasks listed above. These functions would only be accessed when adding new content, while we anticipate that the majority of the time users spend interacting with our application will be spent practicing with content that has already been loaded onto the phone. In addition, these particular views, as described above, are extremely simple (i.e., a single text box and a button to import media) and therefore less interesting from a user interface perspective to test on potential users. Furthermore, writing algorithms to process digital media is time-intensive.
After entering the "Sing View", you must manually rotate the iPhone so that the home button is on the right side. After exiting the "Sing View", you must manually rotate the iPhone back to portrait mode. We were unable to find an effective method for forcing screen orientations without resorting to undocumented APIs.
Although the track options and song options views are implemented, they do not actually affect the rest of the application. The digital media we pre-loaded into the application differs significantly from the digital media that the final application would use. If we had implemented the data-structure backbone specifically for our pre-prepared media, we would later have had to refactor the application completely. We considered this an unacceptable trade-off. However, because these views are actually implemented, we can use Wizard of Oz techniques: users configure settings in the interface, and we then move the data ourselves.
Wizard of Oz Techniques
We used Wizard of Oz techniques to substitute for generating the notes displayed on screen for individual tracks. Instead of computing and drawing the visuals from audio files on the iPhone, we pre-prepared visual tracks that match the audio files played for those songs. Depending on which song the user chooses, the corresponding visual track is displayed on the screen when the Sing view is active. We chose to use Wizard of Oz for this functionality because it let us provide an outwardly identical interface to the user, complete with the freedom to choose between different songs with different visuals, without having to produce a computationally intensive prototype.
We used the following libraries in our prototype:
- sc_listener (http://github.com/jkells/sc_listener): This library contains a class that listens to levels taken in through the iPhone's microphone and detects their frequency. We used this functionality to determine the pitch of the singer.
- Kiss FFT (http://sourceforge.net/projects/kissfft/): This library provides a simple way to compute Fourier transforms for pitch analysis.
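To illustrate the general idea of using a Fourier transform for pitch analysis, the sketch below estimates the strongest frequency in a block of audio samples. It is not the sc_listener or Kiss FFT API and not the code in our prototype: it is a self-contained pure-Python radix-2 FFT, and the function names, the 80 to 1000 Hz search range, and the "loudest bin" rule are assumptions made for illustration.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT.

    len(samples) must be a power of two.
    """
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def estimate_pitch_hz(samples, sample_rate, min_hz=80.0, max_hz=1000.0):
    """Return the center frequency of the loudest FFT bin in the search range.

    The search range is an assumed rough vocal range. The result is
    quantized to the bin spacing, sample_rate / len(samples).
    """
    n = len(samples)
    spectrum = fft(samples)
    bin_hz = sample_rate / n
    lo = max(1, math.ceil(min_hz / bin_hz))
    hi = min(n // 2 - 1, math.floor(max_hz / bin_hz))
    peak = max(range(lo, hi + 1), key=lambda k: abs(spectrum[k]))
    return peak * bin_hz
```

For a pure 440 Hz tone sampled at 8000 Hz with 2048 samples, the estimate lands within one bin (about 3.9 Hz) of 440 Hz. A real voice contains harmonics, so the loudest bin is not always the fundamental; a practical detector would need additional handling for that case.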