"There are less certain and less robust forms of empirical evidence. We'll focus on the most common, a user test, also known as a usability test."
Is it the evidence that is less robust, or is it the way evidence is collected? The way I understand it, a user test is a method of gaining empirical evidence, not a form of evidence in and of itself. This sentence makes it sound like the opposite is true.
"Once you've found the breakdowns that occur in your design, you can go back and redesign your interface to prevent breakdowns, running more user tests to see if those breakdowns still occur."
Ultimately, I understand that more user tests are to be conducted after iterating on a design with knowledge gained from previous testing, but the way it is worded, someone could get tripped up if they read it quickly. I had to do a double take on my initial skim of the article. Perhaps splitting up the sentence or adding a more specific phrase like "running more user tests after redesign..." would help. This is a really subtle issue, and for the most part, it probably isn't much of an issue at all, just something I noticed.
"For example, you might have a script like this:
Today we're interested in seeing how people use this new copier design... We're here to test this system, not you, so anything that goes wrong is our fault, not yours. I'm going to give you some tasks to perform. I won't be able to answer your questions, because the goal is to see where people have difficulty, so we can make it easier. Do you have any questions before we begin?"
I realize that this is just a very brief example of what a dialogue could look like, but it seems unclear to say "I won't be able to answer your questions" just to end by asking "Do you have any questions?" I think more clarification of what type of questions the participant can ask would be beneficial. The topic is touched on briefly after the example, but maybe a concrete example would be helpful.
This reading provided us with a well-rounded understanding of empirical evaluation. However, I wish Andy could talk more about why real-life contexts make the evaluation more valid and useful, as well as some drawbacks of that. I also think it would be more helpful if he included an example of a written task description. I like how the reading emphasizes the importance of debriefing with the users because that is actually very important to establish potential future users. Lastly, there is a grammar mistake in the first sentence; it should be spelled “judgment” instead of “judgement”.
—————————————————————————
- I think the way A/B tests and usability tests are presented is a bit confusing as it stands now. They are introduced one after another in a way that made me feel like they were exclusive, but that isn’t really the case. You might conduct usability testing on a variety of different options with a variety of different subjects to determine what design to go with. I think a sentence or two describing how they are related and can be used together would clear up any further confusion.
- You did talk about finding subjects that are representative of the group you’re designing for and why it’s important, but did not go much into how to do that. I have a feeling this will be one of the hardest parts of the design process for our final project and would have liked to have seen a little more guidance on how to choose the right people to test with.
- Perhaps this could just be another entry in the “further reading” section, but I would have liked to have seen how companies that do this type of design (Google, Facebook, etc.) conduct tests and determine which design to move forward with. Being able to read about or see how these companies get around these issues would have been helpful for me.
• Maybe include some illustrations/pictures between the first few paragraphs so that the entire block of text is segmented, which makes reading easier and less stressful.
• You could give more examples of A/B tests and their usefulness based on a designer’s definition of success. If the measure of success is profit, does this mean that the designer should choose the design that made the most profit?
• Also, this sentence in the A/B test paragraph seems a little bit confusing to me: “If your definition of success is that users are less confused, that might be harder to measure, because it can be harder to observe, classify, and count, especially automatically.” (What was automatic in this case?)
- In the User Test section about finding people who are representative of an audience, it was explained how it might be difficult to go out and recruit them, but it wasn’t explained in depth how to determine who is truly representative. I think this should be expanded on a little more because I think it’s more important to first understand who should be picked before understanding how to invite them.
- In the section where the article discusses breakdowns again (starting right above the image of the breakdowns graph), it says that we should define a set “path” that we expect them to follow so that we can see where the deviations occur. I think it is worth noting that deviations from a path might not always be an error in design, like with video games.
- The last article linked in blue text, “Not all think aloud is valid” by Mark Fox, Anders Ericsson, and Ryan Best, is not cited at the bottom of the article.
———————————————————————————
- I think this article should talk about the difference between an “average” user and a “representative” user. We’ve been taught that designing for an average user can be harmful, but testing on a user that represents the average seems logical. Some clarity would be great here.
- Perhaps, when talking about thinking aloud and how it cannot help the researcher find out what the user noticed first, mention something about eye tracking software. I assume that this is considered an analytical tool as well.
- In the second paragraph, I wish the idea of ecological validity was elaborated on more. I’d like to learn about an example of this in order to grasp the concept better.
———————————————————————————————————