If you're a Reviewed.com reader, you know that we love to test things. A lot of our tests are adapted from international testing standards that are designed and updated by august bodies of engineers and scientists. But you know what there's no testing standard for? Judging how much someone is going to like using a refrigerator once it's in their home.
Recently, we have been systematically revamping the testing and scoring methods for each of the main product categories. One piece of feedback that kept popping up was that people wanted more information on usability – things like ease of use, comfort, and intuitiveness.
Those ideas broadly fall under the category of “the human factor,” and answer questions like, “What is this product like to use?” “How comfortable is it to hold/wear/watch this product?” and “How stylish will this product look in my kitchen/living room?”
We have always scored products on their usability, but the task has historically been a difficult one – while it’s easy to translate a number from a test into a score, it’s harder to give a score to a feeling or an impression.
This problem became even more pressing as we expanded our product testing beyond tech and home appliances. We’ve started edging into the massive product categories known as “parenting,” “smart home,” and “small appliances,” and testing everything from tablets for kids, to smart security cameras, to sous vide immersion circulators, and back again.
Now that we’re testing products that do not have an established testing methodology in place, one big sticking point remains: How do we actually quantify the results of the testing? In a lot of the products we’re assessing, numbers alone cannot help us to separate the wheat from the chaff.
For example, when we were testing oven thermometers, we measured the thermometer’s ability to accurately reflect the temperature of a toaster oven, but addressing the ease of use is less straightforward. When using an oven thermometer, you want to be able to read the temperature at a distance, through an oven window. But short of conducting an eye test, there is no quantity we can measure that will allow us to instantly classify a thermometer’s visibility as “good” or “bad”. So what do we do?
Posing the question as “Can you read the numbers on the thermometer?” is too general. How close are you to the window when you read the thermometer? Can you read the small numbers and the large numbers? Are the numbers obscured by something in the oven? These are all questions unanswered by a simple “yes” or “no” to that original question.
On the other hand, framing the question as a quantity to be measured, like “Record the maximum distance at which you can still read the numbers,” is problematic in its own way. Depending on the person doing the testing, the location of the oven, and the ambient lighting conditions, the answers could vary greatly.
After conversing with some experts and a lot (A LOT) of trial and error, we’ve settled on a method that makes score number-crunching relatively straightforward, while also giving us lots of useful information. The idea is to have the testers use the products the way actual people would, answer very specific multiple-choice questions about the usage/performance, and then leave room for testers to expound on their answers.
As an example, here’s the actual question that our tester answered, related to the number visibility on an oven thermometer that had been sitting in a toaster oven at counter height while the temperature increased:
How easy was it to read the display on this thermometer in the oven?
1. I can’t read anything at all - the font is way too small
2. The numbers and tick marks are very difficult to read
3. It is easy to read both the major ticks and numbers, but the minor ticks/numbers are more difficult
4. It is easy to read both the major and minor ticks and numbers
Testers also had the option to leave additional comments or to add another possible answer to the question.
This way, we have the benefit of a number that can be turned into a score, as well as information that can help us shape the editorial content surrounding that product.
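To make the “number that can be turned into a score” part concrete, here’s a minimal sketch of how a multiple-choice answer like the one above might be converted into a score. The 1–4 scale matches the thermometer question, but the linear normalization to a 0–1 range is our own illustrative assumption, not a description of Reviewed.com’s actual scoring formula.

```python
def answer_to_score(choice: int, num_choices: int = 4) -> float:
    """Map a multiple-choice answer (1 = worst, num_choices = best)
    onto a 0.0-1.0 score via simple linear scaling.

    The linear mapping is an assumption for illustration; a real
    rubric might weight the steps between answers unevenly.
    """
    if not 1 <= choice <= num_choices:
        raise ValueError(f"choice must be between 1 and {num_choices}")
    return (choice - 1) / (num_choices - 1)

# A tester who picks option 3 ("major ticks easy, minor ticks
# difficult") lands two-thirds of the way up the scale:
visibility_score = answer_to_score(3)
```

The free-text comments don’t feed into this number at all; they travel alongside it and inform the written review.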
Gathering data with this style of questioning really helps us to put the quantitative results in context. For example, if a pair of headphones is technically perfect but extremely uncomfortable to wear for more than 20 minutes, we’re not going to recommend that product to our readers. Conversely, if a pair of headphones doesn't sound great but is super comfortable, we might recommend them for a certain type of listener, like someone who travels a lot.
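The headphone example above suggests that usability doesn’t just average into the final verdict; sometimes it acts as a veto. Here’s a hypothetical sketch of that idea. The weights and the comfort threshold are invented for illustration; the article only says that an uncomfortable product won’t be recommended no matter how well it measures.

```python
def overall_score(technical: float, comfort: float,
                  comfort_floor: float = 0.4,
                  technical_weight: float = 0.6) -> float:
    """Blend a 0-1 technical score with a 0-1 comfort score.

    If comfort falls below the floor, the product gets no
    recommendation regardless of technical performance. The
    specific floor and weights here are illustrative assumptions.
    """
    if comfort < comfort_floor:
        return 0.0  # technically perfect but painful to wear: no recommendation
    return technical_weight * technical + (1 - technical_weight) * comfort

# Perfect sound, unbearable after 20 minutes:
painful = overall_score(1.0, 0.2)
# Mediocre sound, very comfortable (maybe a pick for frequent travelers):
travel_pick = overall_score(0.6, 0.95)
```

The gating design choice matters here: a plain weighted average would let a high technical score paper over a miserable wearing experience, which is exactly the outcome the questionnaire data is meant to prevent.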
At the end of the day, it’s people using products, not robots, so we think that combining the squishier human factor with our numerical assessments gives us the best of both worlds.