CS 247 - Interaction Design Studio

Milestone 5: User Study


Introduction


We will conduct a user study analyzing the effectiveness and accessibility of our long-distance brainstorming system prototype. In particular, we will focus on a few key areas.

First, we will examine users’ desire to keep a record of the discussion as it progresses. We have three main questions here. Is there a compelling need for a record of discussion? If so, is this need most acute during the brainstorming process or at a later point? And finally, what is the optimal medium to fulfill this need: topic-sorted video of the discussion, user-created text annotations, or some alternative?

Hypothesis 1: Users desire a discussion record, primarily for later review. As such, video (low friction, more useful after the brainstorm than during) is the optimal medium.


Second, we will assess users’ need to categorize or group ideas in some way. Several users expressed this in passing during our previous Wizard of Oz (WoZ) prototype tests, so we would like to explore it further. We have two questions here. Does this need exist? What organizational structure makes the most intuitive sense: grouping existing ideas into categories, creating sub-ideas for existing ideas, or some other option?

Hypothesis 2: Users desire a categorization system. Specifically, they would like to group existing ideas into categories during the discussion phase.


Third, we will assess untrained users’ ability to grasp the purpose of each phase with minimal instruction.

Hypothesis 3: Otherwise untrained users will have a robust conceptual model of each phase after a brief explanation of its purpose.


Finally, since this is a fairly early-stage study, we will be paying close attention to any usability issues which arise. We will also solicit general feedback and analyze this for trends.

Methods


We recruited six groups of three users from the pool of Stanford undergraduates. Just over half of the sample (0.56) was untrained in the design-focused brainstorming technique we are using, while the remainder (0.44) had at least some experience with it. We recruited from a variety of majors. The study was completed over several testing sessions in both an upper-class house and a four-class dorm.

Each participant in a given group was placed in a separate room, with the app pre-loaded on a computer and an experimenter there to observe them. They were given a brainstorming task: “How would you redesign Tresidder Union?” The experimenter then briefly explained the brainstorming process by reciting a standardized explanation of the three phases (brainstorming, discussion, and voting). Users were largely left alone to use the app, with the only guidance being a repeated description of each phase as it began.

We recorded all of the generated ideas and took observational notes throughout. Afterwards, we conducted interviews with each subject. The goal here was threefold: get general feedback about the app, appraise their subjective need/desire for certain additional features, and gain an understanding of their conceptual model of the application and process. To gauge their need for additional features, we asked whether they would use each addition (if they had not already brought it up on their own, which many groups did). To understand their conceptual model of the process, we simply asked them to explain it and then probed for details.


Results


First, the data on our hypotheses. Hypothesis 1 was strongly disconfirmed. While some of the groups expressed that a video-recording feature would be “cool,” none of the participants said that they would use it. However, only two of the participants said that they would not use a text note-taking feature, and two groups expressed a desire for this before we even asked (one of them while the task was still in progress). Every group concluded that it would be a useful functionality, more so for processing and summarization during the discussion than for later review.

Hypothesis 2 was confirmed. Roughly half of the groups expressed a desire for some sort of categorization feature before we had even asked, and all but one concluded that they would take advantage of this functionality (the remaining group was neutral). Some groups mentioned sub-ideas, but the general consensus was that categorization was the best route. Half of the groups specifically mentioned the categorization of similar ideas (as opposed to other means of grouping).

Hypothesis 3 was confirmed. Users definitely did require the brief explanation; we completed a pilot test without it, and they were fairly confused. But with that description, there was no further coaching or guidance necessary. Their conceptual model of the phase structure was robust and accurate.

There were several other issues which came up during the interviews. Some of these were implementational rather than conceptual. For example, every group had a member who discovered that they could continue clicking on the ideas to make the text very large in the discussion phase or to cast an absurd number of votes in the voting phase. As such, every group suggested that we prevent ideas from being pushed off the screen (by the extremely large text) and limit the number of votes that could be cast.

Roughly half of the groups also requested some type of undo or delete function. These were uniformly mentioned in passing and desired by only one member of the group.

Users had a few common problems while using the app, which we observed but they did not articulate. First, two groups had points during the ideation phase when their brainstorming slowed and no users were contributing. Second, every group had some trouble with the “click-on-the-topic-you-want-to-discuss” feature. Most of them, after discovering that clicking repeatedly on ideas would continue to increase their size, used the feature to make their favorite ideas larger. This supplanted the topic-selection functionality. The two groups which continued to use the feature as designed did so only because they each contained a member who was very serious about the process and kept the discussion topic up to date, while the other users conversed freely.


Discussion


The results of Hypothesis 1 send a strong conceptual message. Users wish to take notes as part of the brainstorming process, not for later reference. Many groups said that video recordings would be unwieldy and inconvenient to revisit. Several participants also said that being recorded would lead them to self-censor and thus stifle their creativity.

However, the need for a text note-taking feature was very strong. One group assumed that it was possible within the app and tried to find the functionality. Several others indicated the need indirectly, by pausing as if they were considering writing something down or asking us if they could add specific notes through the idea submission text box.

The data from Hypothesis 2 had similar implications. In general, these two sets of results indicate users’ desires to see their conclusions from the discussion phase reflected in the app. Since the primary “task” of the second phase is to sort through the generated ideas, users want to physically organize and make sense of the lists.

Hypothesis 3’s results are fairly straightforward: the conceptual divisions of the process are intuitive.

The consistency with which a single user of each group discovered and abused the ability to make ideas very large/cast many votes for a single idea was very interesting. We had assumed that users would take the task seriously enough to refrain from taking advantage of these unrestricted functionalities. For two-thirds of the users, this was true, but each group had one “jokester” who insisted on clicking repeatedly. We believe that the natural tendency of users to “troll” is exacerbated by the anonymity of the main panel (and, frankly, how satisfying/fun a lot of the core functionalities are).

The undo/delete functionality was mentioned rarely but consistently: only those users who had made mistakes thought to request it, and their desire for it was fleeting.

The lulls during ideation were something we had also observed during our Wizard of Oz prototyping. In this case, they were much less prevalent. The primary reason for this is most likely the shorter timeframe. But beyond that, it was clear that the format of the app encouraged rapid submission of ideas (sans discussion) much more than verbally sharing them with the group did.

The click-to-discuss feature had two major problems. First, it just didn’t fit into the flow of conversation. People do not clearly delimit discussion topics in their minds, and it is unnatural to force such attention. Second, the fact that frequently clicked/discussed ideas grew and changed color ended up outshining the core functionality. This feedback was so viscerally effective in conveying the relative importance of ideas that users wanted to adjust it manually, to better reflect the results of their discussion.


Implications


To respond to our results from Hypothesis 1, we will implement a note-taking functionality. It will be a simple text box with persistent results, associated with a specific idea. Only one note box will be displayed at once, depending on the current discussion topic. This should also address the issues users had keeping the discussion topic up-to-date; rather than having to remember to click on a topic every time they start talking about it, they will simply click on topics when they want to take notes on them.
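The note-taking behavior described above can be sketched roughly as follows. This is a minimal illustration rather than our actual implementation: the class name `NoteBoard`, its methods, and the example idea identifiers are all hypothetical. It shows one persistent note per idea, with only the note for the currently selected idea exposed at a time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NoteBoard:
    """Hypothetical store holding one persistent text note per idea."""

    notes: Dict[str, str] = field(default_factory=dict)
    current_idea: Optional[str] = None  # the idea whose note box is displayed

    def select(self, idea_id: str) -> str:
        """Clicking an idea makes it the current topic and returns its saved note."""
        self.current_idea = idea_id
        return self.notes.get(idea_id, "")

    def write(self, text: str) -> None:
        """Save text into the note box of the currently selected idea."""
        if self.current_idea is None:
            raise ValueError("no idea is selected")
        self.notes[self.current_idea] = text


# Example usage with made-up idea identifiers.
board = NoteBoard()
board.select("more-seating")
board.write("Group likes outdoor tables")
board.select("late-night-food")      # switching topics swaps the note box
print(board.select("more-seating"))  # the earlier note persists
```

Because selecting an idea is the same action as opening its note box, the current discussion topic is updated as a side effect of taking notes.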

We will also implement a categorization system in response to the confirmation of Hypothesis 2, to allow organization of ideas during the discussion phase. Hopefully, these two features will together allow users to feel as if the results of their discussions are being reflected within the app.

The results of Hypothesis 3 will guide us as we execute our plans to add contextual instructions to the app. The tests seemed to indicate that minimal additions (perhaps simply our one-sentence explanations of each phase) will be required to make the app usable without any first-time coaching.

In response to the implementational issues, we will institute a cap on the number of votes each user can give (perhaps allowing this to be customized), allow users to scroll horizontally through ideas, and make the maximum font size smaller. We will be careful to restrict the power of single users to disrupt the process in the future.
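As a rough sketch of the vote cap and the font-size limit, assuming illustrative values we have not yet settled on (`MAX_VOTES_PER_USER`, `BASE_FONT_PX`, `STEP_PX`, and `MAX_FONT_PX` are placeholders, and `VoteTally` is a hypothetical name):

```python
from collections import Counter

# All constants below are assumed values for illustration only.
MAX_VOTES_PER_USER = 5  # default cap; could be made customizable per session
BASE_FONT_PX = 14       # size of an idea that has never been clicked
STEP_PX = 2             # growth per click
MAX_FONT_PX = 36        # ceiling that keeps ideas from crowding others off the screen


class VoteTally:
    """Counts votes per idea while limiting how many votes each user may cast."""

    def __init__(self, max_votes: int = MAX_VOTES_PER_USER) -> None:
        self.max_votes = max_votes
        self.votes_by_user: Counter = Counter()
        self.votes_by_idea: Counter = Counter()

    def cast(self, user_id: str, idea_id: str) -> bool:
        """Record a vote; return False if the user has already hit the cap."""
        if self.votes_by_user[user_id] >= self.max_votes:
            return False
        self.votes_by_user[user_id] += 1
        self.votes_by_idea[idea_id] += 1
        return True


def font_size_px(clicks: int) -> int:
    """Grow the text with each click, but never beyond MAX_FONT_PX."""
    return min(BASE_FONT_PX + STEP_PX * clicks, MAX_FONT_PX)
```

With a cap in place, a single user clicking repeatedly can no longer dominate the vote totals or the screen, which is the disruption we observed in every group.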

The undo/delete functionality is clearly useful but nonessential. We will plan to implement it, but it will be very low priority.

To deal with the lulls during the ideation phase, we will implement a “prompting” feature. If no users have contributed ideas for a certain amount of time, we will prod them to continue with a 20-idea/20-second sprint, a call to elaborate on an existing topic, or something similar. We originally included this in the Wizard of Oz prototype, where it proved very effective, but had not yet added it to the digital version. Clearly, it will still be useful.
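One way the prompting feature might work is sketched below. The idle threshold of 30 seconds, the prompt wording, and the `IdlePrompter` class are assumptions made for illustration; we have not yet chosen the actual timeout or prompt set. Timestamps are passed in explicitly so the logic does not depend on a particular clock or UI framework.

```python
import random
from typing import Optional

# Hypothetical prompt texts based on the options named above.
PROMPTS = [
    "Try a 20-idea/20-second sprint.",
    "Pick an existing idea and elaborate on it.",
]


class IdlePrompter:
    """Suggests a prompt when no user has contributed an idea for a while."""

    def __init__(self, idle_limit_s: float = 30.0,
                 rng: Optional[random.Random] = None) -> None:
        self.idle_limit_s = idle_limit_s  # assumed threshold, to be tuned
        self.last_activity_s = 0.0
        self.rng = rng or random.Random()

    def record_idea(self, now_s: float) -> None:
        """Call whenever any user in the group submits an idea."""
        self.last_activity_s = now_s

    def check(self, now_s: float) -> Optional[str]:
        """Return a prompt if the group has been idle too long, else None."""
        if now_s - self.last_activity_s < self.idle_limit_s:
            return None
        # Restart the timer so the same lull does not trigger a prompt every tick.
        self.last_activity_s = now_s
        return self.rng.choice(PROMPTS)
```

The app would call `check` periodically during the ideation phase and display whatever prompt it returns to all group members.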

The biggest open question, as we continue, is how to adapt the click-to-discuss-a-topic feature in the face of widespread confusion. It originally had two intended functions. First, it created a record of the most-discussed ideas, which we could use as a rough measure of popularity. Second, that popularity was expressed as a change in text size and color.

The first function can be entirely replaced by the note-taking system. Clicking on a topic to take notes on it is much more intuitive than clicking to reflect the conversation, and the measure of popularity should be roughly equivalent. The future of the second function is less clear. We could simply leave it tied to the new note-based measure of popularity, but this ignores the fact that participants clearly found it useful as an independent, manually controlled feature. We will experiment with different levels of automation (e.g., only the color is dictated by note-taking popularity, while users adjust size) to find one which feels intuitive, and test this in the next round of user studies.