Quick and Dirty Statistics: Figuring Out Your Sample Size for Testing
In order for the results of your testing to be meaningful in terms of your product’s actual behavior, you’ll need to test the right number of replicates – that is, the number you need in order to have statistical relevance. What is the right number?
First let’s talk about the distribution of values for any test measurement. In statistics, there are a number of different distribution patterns that data follow. For our purposes, we will assume a normal distribution – it’s symmetric about the average and has the familiar bell curve shape.
The mean is at the center of the curve, and is the average value of your measurements.
The standard deviation tells us how spread out the individual measurements were – in other words, how close was each measurement to that average value?
The confidence interval is the range around your SAMPLE mean within which you expect the POPULATION mean to fall, at a confidence level you choose (for example, 90%).
Each vertical line in the figure above represents a standard deviation. As you can see, 68.2% of the measurements fall between -1 and +1 standard deviation from the mean, 95.4% fall between -2 and +2 standard deviations, and 99.7% fall between -3 and +3 standard deviations. A “Z-score” (Z) is the number of standard deviations any particular measurement is from the mean. Z=0 means that particular measurement is the same as the mean. Z=1 means that measurement is one standard deviation above the mean, and Z=-1 means it is one standard deviation below. Because of the shape of the curve, each interval around the mean has a percentage associated with it (remember that + or – 1 standard deviation covers 68.2% of the values measured), so your Z-score is a way to put boundaries on your confidence interval. For example, Z=1.645 corresponds to a 90% confidence interval, since 90% of your values will be between Z=-1.645 and Z=+1.645.
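To make the link between a Z-score and its percentage concrete, here is a minimal Python sketch. It assumes a perfectly normal distribution and uses the standard library's `statistics.NormalDist`; the function names `z_score` and `coverage_within` are made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_score(measurement, mean, std_dev):
    """Number of standard deviations a measurement lies from the mean."""
    return (measurement - mean) / std_dev


def coverage_within(z):
    """Fraction of a normal population lying between -z and +z."""
    return STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)


if __name__ == "__main__":
    # Print the share of measurements inside each +/- Z band.
    for z in (1.0, 2.0, 3.0, 1.645):
        print(f"Z = +/-{z}: {coverage_within(z):.1%} of measurements")
```

Run as a script, this should print roughly 68%, 95%, 99.7%, and 90% for the four Z values, matching the bands described above to within rounding.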
For example, if I pull 100 tensile bars and measure ultimate strength, and my average is 100 ksi with a standard deviation of 10 ksi, then about 99.7% of my measurements should fall between 70 and 130 ksi (-3 to +3 standard deviations), assuming the results are normally distributed.
Ok, that’s great – the normal distribution has a certain shape, the average is at the apex of the curve, and the standard deviation shows the spread in our data. Now what about sample size? Several factors need to be considered.
- Z-score: You’ll need to choose a Z-score that bounds the confidence interval you want to have.
- Margin of Error: No sample will be perfect, regardless of sample size, so how much error will you allow? The margin of error is the plus-or-minus amount by which you will allow your sample mean to differ from the population mean, expressed in the same units as your measurement or as a percentage. It is not the complement of the confidence interval: choosing 90% confidence leaves a 10% chance that the true value falls outside your interval, but that 10% is a separate quantity from the margin of error. You hear margin of error mentioned on the news as the plus-or-minus figure reported with poll numbers; 5% (0.05) is a commonly used level. The larger the margin of error you can tolerate, the smaller your necessary sample size will be.
- Standard Deviation (degree of variability): How much variability do you expect in your population? If you don’t know yet, a good place to start is 0.5 (50%). For a proportion, such as a pass/fail rate, this is the maximum possible variability, so it gives the most conservative (largest) sample size estimate.
Here’s a simple formula for figuring out what sample size you need:
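The formula graphic is not reproduced in this text. A common textbook form for estimating a mean, consistent with the three factors listed above, is sketched here as an assumption and may not be the exact expression the post originally showed. The symbols n, Z, σ, and E are labels chosen for this sketch.

```latex
n = \left( \frac{Z \, \sigma}{E} \right)^{2}
```

Here n is the necessary sample size (rounded up to a whole number), Z is the Z-score for your chosen confidence interval, σ is the expected standard deviation, and E is the margin of error in the same units as σ.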
As an example, let’s say I want to know the ultimate strength of a metal alloy. I think that the average tensile strength of my alloy is somewhere around 100 ksi. I choose a standard deviation of 50% (50 ksi) to be ultra-conservative since I don’t yet know the actual standard deviation. I want a confidence interval of 90%, and this corresponds to a Z-score of 1.645. To be within 10%, I choose my margin of error to be 10 ksi. My sample size would be:
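The worked result is not reproduced in this text, so here is a hedged Python sketch of the arithmetic. It assumes the common formula for estimating a mean, n = (Z × standard deviation / margin of error)², which may differ from the exact expression the post used. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence_level, std_dev, margin_of_error):
    """Replicates needed, assuming n = (Z * std_dev / margin_of_error)**2.

    std_dev and margin_of_error must be in the same units (here, ksi).
    """
    # Two-sided Z-score: 90% confidence leaves 5% in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    # Round up, since you cannot test a fraction of a tensile bar.
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Values from the example: 90% confidence, 50 ksi spread, 10 ksi margin.
n = required_sample_size(confidence_level=0.90, std_dev=50.0, margin_of_error=10.0)
print(f"Necessary sample size: {n}")
```

Under that assumption, the example works out to (1.645 × 50 / 10)² ≈ 67.7, which rounds up to 68 tensile bars.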
The sample size you need decreases with a smaller standard deviation and increases with a smaller margin of error.
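A quick way to see both trends is to change one input at a time. This is a rough sketch that assumes the common formula n = (Z × standard deviation / margin of error)²; the helper name `sample_size` and the alternative inputs (25 ksi and 5 ksi) are made up for illustration.

```python
import math

Z_90 = 1.645  # Z-score for a 90% confidence interval


def sample_size(z, std_dev, margin_of_error):
    """Assumed formula: n = (z * std_dev / margin_of_error)**2, rounded up."""
    return math.ceil((z * std_dev / margin_of_error) ** 2)


baseline = sample_size(Z_90, std_dev=50.0, margin_of_error=10.0)
smaller_spread = sample_size(Z_90, std_dev=25.0, margin_of_error=10.0)
tighter_margin = sample_size(Z_90, std_dev=50.0, margin_of_error=5.0)

print(f"Baseline (50 ksi spread, 10 ksi margin): {baseline}")
print(f"Halve the standard deviation:            {smaller_spread}")
print(f"Halve the margin of error:               {tighter_margin}")
```

Because both inputs are squared, halving the standard deviation cuts the sample size to about a quarter, while halving the margin of error roughly quadruples it.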
Here’s a table of common Z-score values and their corresponding confidence intervals:
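The table itself is not reproduced in this text. As a stand-in, the values can be computed with Python's standard library. This sketch assumes two-sided intervals on a normal distribution; the confidence levels listed are simply common choices, not necessarily the ones in the original table, and the function names are hypothetical.

```python
from statistics import NormalDist

# Commonly used confidence levels (an assumed selection).
CONFIDENCE_LEVELS = (0.80, 0.85, 0.90, 0.95, 0.99)


def z_for_confidence(confidence_level):
    """Two-sided Z-score that bounds the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2.0)


def z_table(levels=CONFIDENCE_LEVELS):
    """Pairs of (confidence level, Z-score rounded to 3 decimal places)."""
    return [(level, round(z_for_confidence(level), 3)) for level in levels]


print("Confidence  Z-score")
for level, z in z_table():
    print(f"{level:>10.0%}  {z:.3f}")
```

For the levels above, the Z-scores come out to about 1.282, 1.440, 1.645, 1.960, and 2.576.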
This post is meant to be a simple introduction. For more information about sample size, distributions, and statistical relevance, check out these links:
https://en.wikipedia.org/wiki/Sample_size_determination
http://www.itl.nist.gov/div898/handbook/ppc/section3/ppc333.htm