## Ordinal Malpractice

We love to put things in order. “Which college is best, second best, third best, etc., and how can I get my kid into one near the top of the list?” “I love God, Mom, America, and apple pie, in that order.” “Formal education consists of elementary school, middle school, high school, undergraduate school, and finally graduate school, if you’re lucky.” “Our best salesperson is John, second best is Mary, Sally is third, and poor Harold is at the bottom of the list.” We sometimes forget, however, that when we sequence things, even when that sequence is based on a quantitative measure (e.g., salespeople ranked by sales revenues), the order itself is merely ordinal, not quantitative. The company’s top salesperson, John, might be mediocre at best, and the second-best salesperson, Mary, might sell but a smidgen less than John or perhaps only half as much. A #2 ranking merely reveals that Mary sells less than John, not how much less.

An ordered list that appears along the axis of a graph, such as the ranked list of salespeople below, is called an ordinal scale.

An interval scale, like the one below, is quite different.

An interval scale subdivides a continuous range of quantitative values into equal intervals, in this case a range extending from 0 to 500, subdivided into intervals of 100 each. One of the most common interval scales that we use in quantitative data analysis involves ranges of time (e.g., from years 1950 through 2020) subdivided into equal periods (e.g., the 1950s, 1960s, etc.). Interval scales, by definition, are quantitative in nature; ordinal scales are not. In general, all we can say about ordinal scales is that the items have a meaningful order, nothing more. Along an interval scale, quantitative distances between adjacent items are always equal, but along an ordinal scale, distances between items typically vary.
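To make the idea of equal intervals concrete, here is a minimal Python sketch. The 0-to-500 range and the decades come from the examples above; the function names `equal_intervals` and `decade_of` are hypothetical, chosen only for illustration.

```python
def equal_intervals(start, stop, width):
    """Split the range start..stop into consecutive intervals of equal width."""
    if width <= 0 or (stop - start) % width != 0:
        raise ValueError("width must be positive and divide the range evenly")
    return [(low, low + width) for low in range(start, stop, width)]


def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1957 -> 1950."""
    return year // 10 * 10


intervals = equal_intervals(0, 500, 100)

# Every interval spans the same quantitative distance, so this set has one member.
widths = {high - low for low, high in intervals}

print(intervals)
print(widths)
print(decade_of(1957))
```

Because every interval has the same width, the distance between any two adjacent intervals is the same. An ordered list of ranks offers no such guarantee.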

A Likert scale is an example of an ordinal scale that is often used in social science research and surveys. Likert scales allow people to respond to questions, such as “How often do you drink more than a single serving of an alcoholic beverage in a day?”, by selecting from an ordered list such as the following:

1. Never
2. Seldom
3. Occasionally
4. Frequently
5. Always

Notice that the items have a meaningful order, but the scale itself is not quantitative. The difference in the frequency of occurrence between “Never” and “Seldom” is not necessarily the same as the difference between “Seldom” and “Occasionally.” Item #5 (“Always”) is not five times greater than item #1 (“Never”). Even though items in ordinal scales are often labeled with numbers (e.g., “1 – Never” and “5 – Always”), the numbers only indicate a sequence (item #1, item #2, etc.), not quantities.
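One way to respect this in analysis is to work only with the order of the items and never with arithmetic on their labels. The Python sketch below assumes the five-item list shown above; the helper names and the sample responses are hypothetical. It compares items by position and summarizes responses with a median, which depends only on order.

```python
LIKERT = ["Never", "Seldom", "Occasionally", "Frequently", "Always"]


def position(label):
    """Zero-based place of a label in the ordered list (a sequence, not a quantity)."""
    return LIKERT.index(label)


def is_more_frequent(a, b):
    """True if label a comes later in the order than label b."""
    return position(a) > position(b)


def median_label(responses):
    """Middle response by order; for an even count, the lower of the two middle ones."""
    ordered = sorted(responses, key=position)
    return ordered[(len(ordered) - 1) // 2]


# Hypothetical answers from five respondents.
responses = ["Seldom", "Always", "Never", "Occasionally", "Seldom"]

print(is_more_frequent("Always", "Never"))
print(median_label(responses))
```

The sketch can say that “Always” comes after “Never” and that the middle response is “Seldom,” but it makes no claim about how far apart any two items are.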

Ordinal scales often provide meaningful and useful ways to arrange items in a list. I often arrange the values that appear in a graph in order from low to high or high to low because it is easier to compare values when those that are close to one another in magnitude are near one another in the graph. You can see this by comparing the two graphs below: one ordered by SAT scores from high to low and the other arranged alphabetically by student names.

As I’ve already mentioned, when values are ranked, the ranking itself is not and shouldn’t be treated as quantitative. Shouldn’t, but often is.

Here are three examples of salespeople ranked by sales revenues, displayed graphically. Notice how differently the salespeople vary in sales performance among these three graphs even though the rankings are the same.

For most purposes, it is the sales revenues themselves in these examples that are important. They deserve far more attention than the rankings.
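The same point can be sketched in a few lines of Python. The names come from the example at the start of this article, but the revenue figures and scenario labels are invented purely for illustration. All three scenarios produce an identical ranking, yet the revenues behind the ranks differ greatly.

```python
# Hypothetical revenues (in thousands) for three scenarios with the same ranking.
scenarios = {
    "close race": {"John": 500, "Mary": 495, "Sally": 490, "Harold": 485},
    "one star": {"John": 500, "Mary": 250, "Sally": 240, "Harold": 230},
    "steady decline": {"John": 500, "Mary": 375, "Sally": 250, "Harold": 125},
}


def ranking(revenues):
    """Names ordered from highest to lowest revenue."""
    return sorted(revenues, key=revenues.get, reverse=True)


def share_of_top(revenues):
    """Each person's revenue as a fraction of the top seller's revenue."""
    top = max(revenues.values())
    return {name: revenues[name] / top for name in ranking(revenues)}


for label, revenues in scenarios.items():
    print(label, ranking(revenues), share_of_top(revenues))
```

In the first scenario Mary sells almost as much as John; in the second she sells half as much. Her rank of #2 is the same in both.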

Let’s get back to Likert scales for a moment. Assigning numeric values to items in a Likert scale is not appropriate, in my opinion, but it is routinely done. For example, social science research that uses a questionnaire to measure depression among a population, based on the following Likert scale, could simply add up the numbers associated with each respondent’s selected items to produce an overall depression score.

1. Never
2. Seldom
3. Occasionally
4. Frequently
5. Always

Measuring people’s depression in this manner, however, does not qualify as truly quantitative.
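A small Python sketch shows why such a sum is fragile. It assumes a hypothetical two-question survey and two invented respondents. It scores them once with the usual 1-through-5 coding and once with an alternative coding that keeps the items in exactly the same order. Since order is all an ordinal scale provides, both codings are equally faithful to it.

```python
LIKERT = ["Never", "Seldom", "Occasionally", "Frequently", "Always"]

# The usual coding, plus a hypothetical alternative that preserves the same order.
coding_a = dict(zip(LIKERT, [1, 2, 3, 4, 5]))
coding_b = dict(zip(LIKERT, [1, 2, 3, 4, 10]))


def total_score(answers, coding):
    """Add up the numbers that a coding assigns to a respondent's answers."""
    return sum(coding[answer] for answer in answers)


# Hypothetical answers from two respondents to a two-question survey.
respondent_x = ["Never", "Always"]
respondent_y = ["Frequently", "Frequently"]

for name, coding in [("1-5 coding", coding_a), ("alternative coding", coding_b)]:
    x = total_score(respondent_x, coding)
    y = total_score(respondent_y, coding)
    print(name, "X:", x, "Y:", y, "higher:", "X" if x > y else "Y")
```

Under the usual coding, respondent Y scores higher (8 versus 6). Under the alternative coding, respondent X scores higher (11 versus 8). The conclusion depends on numbers that the scale itself never supplied.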

If ordinal quantification is misleading, why is it done? Mostly, for convenience. It allows people to represent something that is difficult to measure with a simple number, but that number is inherently misleading. Sometimes this is also done for another reason: to lend Likert scales an inflated sense of accuracy, precision, significance, and objectivity when making research claims. A great deal of social science and survey-based performance reporting (e.g., customer satisfaction surveys) is based on quantified Likert scales. In my opinion, this renders any claims that are based on them suspect.

In science and data sensemaking (including data visualization), it is important to understand the difference between interval scales and ordinal scales: the former are quantitative; the latter are not. Both play a role, but their roles should not be conflated.

### 2 Comments on “Ordinal Malpractice”

By Brent. April 24th, 2020 at 10:49 am

Is this not also an issue with, for example, choropleth maps, where the color ramp is quantized into an arbitrary number of scale values where an area with a 74% value is lumped into the 50-75 color and the 76% is in the 75-100 color? At what point do you choose some other method of division or choose straight up actual values?

By Stephen Few. April 24th, 2020 at 11:01 am

Brent,

The problem with choropleth maps that you mentioned is indeed real and it concerns me, but I think it’s different from the problems with ordinal scales that I described. Typically, when a choropleth scale is divided into discrete ranges rather than displayed as a continuous scale, the intervals along the scale are equidistant (or should be). As you stated, if one interval ranges from >=50 to <75 and the next ranges from >=75 to <100, a value of 74.9 will appear significantly different than a value of 75.1. That’s a problem. I once explored this concern in an article titled “Heatmaps: to Bin or Not to Bin” that you might find useful.