Why Outrageous Variances?

Tom Crouser, November 7, 2011

Reader says range of suggested Crouser prices on recent survey is too big. Tom explains why it’s not.

Reader: (the) range of suggested (Crouser) prices (in Digital Envelope study) is massive: 1,000 #10 4/0 for $200 all the way up to $400? Then on the survey results you show such outrageous variation in responses as the … selling prices for 1,000 #10 4/0 from $26.50 to $690.

Tom: Yes, there is a big range of suggested Crouser prices on Digital Envelopes, but that’s because market prices vary greatly, which is what we are reporting. So, with regard to outrageous answers, I don’t create the answers; I just report them. But the reader brings up an interesting question: why are there such “outrageous variations” in prices among digital printers? Secondly, what’s up with not throwing out the outlying responses? (Actually we do, and I explain how.)

Large price variation is often the case when a product/service is new and pricing routines (methods) have not been adopted by an industry. What’s a pricing routine? We used to be able to buy heavy construction equipment (bulldozers, for instance) for $1 a pound. It didn’t matter what it cost the manufacturer to produce (cost-plus pricing); that’s what it sold for (market pricing).

We see this not only in the Digital Envelope study but also in our studies of QR codes, websites and email broadcasting. All are new services and all have an “outrageous” price variation. In fact, there are even large price variations where pricing conventions have been established, such as in offset printing (I usually find a spread of 50 to 200 on a scale of 100 of any given price point).

Now, our reader goes on to say: “To gain any useable data these outliers (survey results extremely high or low) should be thrown out. Then you would not have ended up with the extreme variation in your suggested high/low pricing. Also, I think you should have some kind of cap on the high/low prices such as 10% – 15% above and below the mid-price.”

Well, to answer this we need to get into some minutiae because our reader is mixing his metaphors regarding the methods of analyzing data.

Two Methods: Normal Distribution and Array

Most of us are familiar with the normal distribution, although probably not by name. It’s the old “bell shaped curve” you may have learned of in school, and it is best known by the concept of “average.”

Typical Bell Shaped Curve

This “bell shaped” curve is a graphic description of a set of numbers such as the following example. Here we have numbers (prices) which are in a “normal distribution.” For instance, let’s line the numbers up from smallest to largest:

5. Maximum $5
4. $4
3. $3
2. $2
1. Minimum $1

When we take an average of these numbers, we find the answer (average) is $3.

An average is essentially a description of the entire set of numbers, in this case our set of five.

What’s the price? About $3. That’s how we use averages to describe data. In this case we say the price point is an average of $3 (central tendency) although the prices specifically range from $1 to $5. What’s the use of that? Well, we know the typical price is not $100 or $1,000; rather, it’s about $3.

One other important thing: we can say this because the prices above are distributed normally (normal distribution).

How do I know that? Well, the number in the middle, $3, is the same as the average of all the numbers, which is what we expect to see in a normal distribution.

Relative Range and Central Tendency:

So what? A single number (such as average) is not adequate to describe most numbers like price survey results because we want to know what the range of prices is (relative range) from a typical low price to a typical high price.

If we have a normal distribution, then it’s easy to calculate because finding the relative range (low to high) is a formula (standard deviation). When applied right, it will tell us the relative range of prices in the world from a small sample because 68% of all pricing points fall within plus or minus one standard deviation from the average.

(No, I’m not going to get into the math behind this one, just believe me when I say the math god said so – for proof, take a class in statistics or read up on it if you wish).

Here is the answer: average (mean) is 3 and standard deviation is 1.58114, meaning that from the sample, we can predict that most prices (68%) will fall between a low of $1.42 and a high of $4.58 with the typical (average) being $3.00. There will be outliers, of course, but if we want to pursue a low price strategy, then we need to price at about $1.42; for an average price, at $3.00; and for a high price, at $4.58.
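For those who do want to check the math god's work, here is a minimal sketch in Python of the calculation above. It assumes the sample standard deviation (the kind that divides by one less than the number of prices), since that is the version that reproduces the 1.58114 figure; the variable names are mine and are only for illustration.

```python
import statistics

prices = [1, 2, 3, 4, 5]  # the five example prices listed above

average = statistics.mean(prices)
# Sample standard deviation (divides by n - 1); this reproduces 1.58114.
std_dev = statistics.stdev(prices)

# Plus or minus one standard deviation around the average.
low = average - std_dev
high = average + std_dev

print(f"average: ${average:.2f}")
print(f"standard deviation: {std_dev:.5f}")
print(f"one standard deviation range: ${low:.2f} to ${high:.2f}")
```

The 68% figure only applies if the prices really are normally distributed, so treat the printed range as a description of this tidy example and nothing more.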

There’s only one problem with this method: it works ONLY when you have a normal distribution and we don’t have that with most price surveys.

Here are similar numbers (results) which are not distributed normally:

5. $7
4. $4
3. $3
2. $3
1. $2

Average $3.80

They are not in a normal distribution because the average ($3.80) is not the number in the middle when we line the numbers up from low to high ($3).
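The comparison just described (average versus the number in the middle) is easy to run yourself. This is only a sketch of that check using the five prices above; the variable names are my own.

```python
import statistics

skewed_prices = [2, 3, 3, 4, 7]  # the five example prices listed above

average = statistics.mean(skewed_prices)   # pulled upward by the $7 price
middle = statistics.median(skewed_prices)  # third of the five prices once sorted

print(f"average: ${average:.2f}")
print(f"number in the middle: ${middle:.2f}")
print("average matches the middle number:", average == middle)
```

When the last line prints False, the average and the middle number disagree, which is the warning sign discussed here.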

So can we say that $3.80 is a good description of these numbers? Hmm. What we have here is a positive skew, meaning that there are outlying numbers (the $7 here) which have raised the average so that it is higher than the number in the middle.

Positively and Negatively Skewed Distributions

Most important, we cannot use the standard deviation to predict the low and high values (plus or minus one standard deviation) where most of the numbers (prices) would fall.

What to do? Get more numbers (price survey responses) is the real answer for there’s always a normal distribution eventually (math god said so).

How many more? Well, that’s a math problem in itself since the more diverse (scattered) the numbers are, the more numbers you need to get to a normal distribution.

So, how do we describe an elephant by only measuring its trunk? We use a simple array to do a very adequate job.

How We Use an Array

Let us take 100 survey price submissions for 1,000 envelopes printed in full color and then stack them up from the lowest price to the highest (showing only the first, twenty-fifth, fiftieth, seventy-fifth and one-hundredth price to keep it simple):

100th – $9
75th – $5
50th – $4
25th – $3
1st – $2

Average $4.60

Instead of doing any mathematical gymnastics, we can see that the relative range where most prices will fall is from the 25th percentile to the 75th percentile with the typical (median or one in the middle) being the 50th percentile.

Results: $3 to $5 with a typical of $4.
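Here is a rough sketch of the stacking idea in Python. The 100 prices are hypothetical, invented so that the stack lands on the same $2, $3, $4, $5 and $9 positions shown above; they are not survey data. The helper names (nearest_rank, array_summary) are mine, and the sketch assumes the simple "count up to the Nth price" reading of a percentile, which is only one of several conventions.

```python
import math

def nearest_rank(stacked_prices, percentile):
    """Return the price sitting at the given percentile of an already sorted list."""
    rank = max(1, math.ceil(percentile * len(stacked_prices) / 100))
    return stacked_prices[rank - 1]

def array_summary(prices):
    """Stack prices from lowest to highest and read off the five positions."""
    stacked = sorted(prices)
    return {
        "min": stacked[0],
        "25th": nearest_rank(stacked, 25),
        "50th": nearest_rank(stacked, 50),
        "75th": nearest_rank(stacked, 75),
        "max": stacked[-1],
    }

# Hypothetical submissions, invented for illustration only.
submissions = [2] * 10 + [3] * 30 + [4] * 25 + [5] * 25 + [9] * 10

summary = array_summary(submissions)
print(summary)
print(f"meaningful range: ${summary['25th']} to ${summary['75th']}, "
      f"typical ${summary['50th']}")
```

No formula does the work here; the list is sorted and we simply look at what sits a quarter, half and three quarters of the way up.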

Real Data

Here’s a real price point to consider for 2,500 printed envelopes.

Max – $875.00
75% – $316.13
50% – $216.50
25% – $173.75
Min – $56.50

Average – $276.28

What this means is the meaningful range of prices for 2,500 envelopes is from a low of $174 to a high of $316 with the middle value being $216.

Why not just use the average? Well, we do calculate the average and report it. In this case the average is $276 but note the middle result using the array is $216 (50th) and this illustrates the issue with using averages (normal distribution procedures) on small data groups; it misleads you.

Example: Assume we wanted to position our price in the middle of all prices. But if we took the average of all results ($276) and used this as our price, then we would find (in this example) we would be the 61st price from lowest amongst the 100 individual prices. Therefore we would be much higher than our typical competitors and not be in the middle.

This is because there are not enough data points (individual prices) in our sample to form a normal distribution curve. To reach a normal distribution we would have to accumulate many more data points (survey participants). Only then could we use the many math tools available for large groups of data, such as the average and, further, the standard deviation, which can then easily calculate the meaningful range (essentially the group of numbers from the 25th to the 75th percentiles).
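To see the positioning problem from the example concretely, here is a small sketch that counts how many prices sit at or below the average. I can't reproduce the actual survey responses here, so the 100 prices are hypothetical and invented for illustration; with them the count comes out at 65 rather than the 61 in the real data, but the point is the same.

```python
import bisect
import statistics

# Hypothetical submissions, invented for illustration only (not survey data).
prices = sorted([2] * 10 + [3] * 30 + [4] * 25 + [5] * 25 + [9] * 10)

average = statistics.mean(prices)
middle = statistics.median(prices)

# How many of the submitted prices are at or below a price set at the average.
at_or_below = bisect.bisect_right(prices, average)

print(f"average: ${average:.2f}, middle: ${middle:.2f}")
print(f"pricing at the average puts us above {at_or_below} of {len(prices)} prices")
```

If the average really were the middle, that count would be close to 50. Anything well above 50 means a price set at the average is not a middle-of-the-market price.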

Note: in a normal distribution, the average is the middle number (median) or 50th percentile.

In our data, the average is $276 and the middle number (median or 50th percentile) is $216, so it shows we do not have enough data points for a normal distribution. This array is also described as having a high skew.

If and when we have data points that show the average at $216 for instance and the 50th percentile at $216, then we would have enough data to use normal distribution calculations such as average.

This is why the array is preferable when dealing with fewer results: it gives you better information. Now, prices (data points) below the 25th percentile and above the 75th percentile are ignored (thrown out, if you will) when considering our price recommendations.

The reason I include the maximums and minimums in the survey’s report is to help readers assess where they are should they be pricing lower than the 25th percentile or higher than the 75th.

Cap on Suggested Price Variances?

Back to the reader’s comment: “Also, I think you should have some kind of cap on the high/low prices such as 10% – 15% above and below the mid-price.”

My suggested price is designed as a guide for those choosing a high, middle or low price strategy. My purpose is not to dictate prices, rather to show market price levels, suggest a low, middle and high price and then allow the reader to choose whichever range they wish.

To put a cap on the variance would not serve the reader well because I would have to arbitrarily select the central point and the other prices would miss the market.

Below illustrates prices we would calculate under different centering selections with a 15% cap:

An arbitrary cap on a variance would mean we materially miss the mark in one of the price areas (high, low or middle), which is why we draw our suggested prices from the data instead of a formula.

In the example above, if we cap our price variance at 15% and center our recommended prices at the middle price we find in the market, then our high and low will be off the mark.

Specifically, if we want to charge a high price, then the recommended price would be some $67 UNDER the typical high price ($249 vs. $316).

If we wanted to compete in the low price area, our price would be $10 MORE than the typical low price ($184 vs. $174).

Which is why we do not arbitrarily cap our variance. Instead, we let the market guide us.
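For anyone who wants to check the arithmetic, here is a minimal sketch of the 15% cap centered on the middle price, using the 2,500 envelope figures from the Real Data section. It assumes the cap is applied as a straight percentage above and below the middle price, which is my reading of the reader's suggestion; the variable names are for illustration only.

```python
CAP = 0.15  # the 15% cap suggested by the reader

# Survey figures quoted earlier for 2,500 printed envelopes.
market_low = 173.75   # 25th percentile
market_mid = 216.50   # 50th percentile
market_high = 316.13  # 75th percentile

# Capped prices, centered on the middle price (assumed interpretation).
capped_low = market_mid * (1 - CAP)
capped_high = market_mid * (1 + CAP)

print(f"capped low:  ${capped_low:.0f} vs. market low  ${market_low:.0f} "
      f"(${capped_low - market_low:.0f} over)")
print(f"capped high: ${capped_high:.0f} vs. market high ${market_high:.0f} "
      f"(${market_high - capped_high:.0f} under)")
```

Centering on a different point would move the gap around, but with a fixed percentage it cannot hit all three market levels at once.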

Hope this helps. Will be more than willing to continue the conversation but the short answer is the purpose of the price advisory is to give the reader a guide to their pricing which I believe it does.

Let me know if you’re still awake or if you need more.

Tom Crouser
