How We Score Almost Everything: The S-Curve

Everyone knows how fast technology evolves. In the 1960s, Gordon Moore coined "Moore's Law," observing that processing power would double roughly every year. He thought that pace would last ten years, but it has shown no signs of letting up. While speedy evolution is great, it can make it a challenge to review products in a context that makes sense; after all, that "context" is a moving target. And though it's not always that tricky to identify the criteria for scoring or the lab tests, navigating scores in a mercurial climate presents certain challenges. Fortunately, some elegant math has given us a fantastic tool to deal with these scoring challenges: the s-curve.

We use a one-to-ten scoring system, which seems pretty simple. For roughly the middle two performance quartiles, scores follow a linear function, since it makes sense for the score to increase in direct proportion to performance. If "P" is the test result, and "d" is a constant that determines how important a jump in performance is, then:

[Image: performance graph linear.jpg]

So if you’ve got an awesome TV that has twice the resolution of another model, it’s going to score twice as well.
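The linear portion can be sketched in a few lines of code. The exact formula lives in the image above, so the function below, its name, and the choice of P / d are all assumptions made for illustration:

```python
def linear_score(P, d):
    """Toy linear scoring sketch: the score rises in direct proportion
    to the test result P. The editorial constant d sets how big a
    performance jump is worth one point. (Assumed form; the article's
    actual formula appears only in the image.)"""
    return P / d

# Double the performance, double the score:
half = linear_score(1080, 216)       # 5.0
full = linear_score(2160, 216)       # 10.0

# ...but an exceptional result overshoots the ten-point scale:
too_high = linear_score(4320, 216)   # 20.0
```

Note how nothing in a purely linear function caps the result at ten, which is exactly the problem described next.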

But linear scoring has a few problems when something scores exceptionally well or poorly. If a product performs exceptionally well in one area, way above what we think a ten-worthy performance is, the score overshoots the zero-to-ten scale, skewing the product's overall score. For instance, if a stove can get all the way down to 100°F, it gets a simmer score of ten. But say another stove comes along and gets down to 80°F; linear scoring would give it a better score. It shouldn't get a better score, though, because it's not an advantage to get another 20°F lower. Since there are so many situations like these, it's important to recognize when "improvements" aren't really improvements, but diminished or negligible returns. The same goes for poor performances: if a television's contrast ratio falls below a certain threshold, it's just not doing its job properly and should get a zero.

To account for the undesirable scoring behavior around the two extremes, the awful and the excellent, we modify the linear scoring system into an s-curve by rounding off the ends of the graph. In this setup, the scores are directly proportional to the performance for most of the scoring spectrum, but anything below a certain performance standard gets penalized more harshly, and anything above another performance standard does not get a higher score. This way, no performance result can break the scoring system and yield a score higher than a ten.

To create our s-curve (also known as a sigmoid function), we alter a simple logistic function to fit our needs. This is the same math used in population carrying capacities, economics, biology, probability, and tons of other applications.

[Image: performance graphalt.jpg]

Though it might look pretty complicated, it's actually very easy to work with, and it works in most scoring applications. For any given test, we plug in the minimum (a) and maximum (b) scores possible (always zero and ten in our case), the test result (c) that we think should score a five, the test result (P), and a certain editorially derived constant (d) that represents how severely changes in performance should be weighted. With all these variables to tweak, we can tailor this scoring tool to apply it effectively across a variety of products, or find another scoring method if it makes more sense for a given area.
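Since the formula itself appears only in the image above, here is a sketch using the standard generalized logistic form with those same variables; treat the exact shape as an assumption rather than the published formula:

```python
import math

def s_curve_score(P, a=0.0, b=10.0, c=5.0, d=1.0):
    """Generalized logistic scoring sketch (assumed form):
      a - minimum possible score (zero in our case)
      b - maximum possible score (ten in our case)
      c - the test result that should earn the midpoint score of five
      d - editorial constant: how steeply the score responds to
          changes in performance
      P - the test result being scored
    """
    return a + (b - a) / (1.0 + math.exp(-d * (P - c)))
```

At P = c the score is exactly (a + b) / 2, which is five on a zero-to-ten scale, and no result, however extreme, can push the score above b or below a: the curve just flattens out at the ends.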

The s-curve's robustness and customizability as a scoring system allow us to keep the same scoring system even when some crazy new product comes along. Since product performance often changes more quickly than consumers' requirements, this system lets us adjust the scoring when it makes sense, not whenever a fancy new product comes along, ruins the curve, and knocks everyone else's scores down. The less we change the scoring algorithms, the better advice we can give.

Contributing: Timur Senguen
