Standard Deviation: Complete Guide
What is Standard Deviation?
Standard deviation is a statistical measure that quantifies the amount of dispersion or variation in a set of data values. A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.
Why Standard Deviation Matters
Standard deviation is important because:
- It provides a standardized measure of dispersion
- It uses the same units as the original data
- It's sensitive to outliers (which can be both an advantage and disadvantage)
- It's widely used in statistics for inference, hypothesis testing, and constructing confidence intervals
Visual Representation
In a normal distribution, approximately:
- 68% of the data falls within one standard deviation of the mean
- 95% falls within two standard deviations
- 99.7% falls within three standard deviations
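These percentages can be reproduced from the normal distribution itself. The sketch below assumes the standard identity P(|Z| < k) = erf(k / √2) for a normally distributed variable; the function name `fraction_within` is illustrative only.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    # For a normal variable, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The rule only applies to data that is at least roughly normal; skewed or heavy-tailed data can deviate noticeably from these figures.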
Types of Standard Deviation
1. Population Standard Deviation (σ)
Used when you have data for the entire population (all possible observations).
σ = √[ Σ(X - μ)² / N ]
Where:
- X = each value in the population
- μ = the population mean
- N = the number of values in the population
- Σ = sum of
2. Sample Standard Deviation (s)
Used when you have data from a sample (subset) of the population.
s = √[ Σ(x - x̄)² / (n-1) ]
Where:
- x = each value in the sample
- x̄ = the sample mean
- n = the number of values in the sample
- Σ = sum of
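The population and sample formulas differ only in the divisor, which a single function can make explicit. This is a minimal sketch: the parameter name `ddof` (0 for N, 1 for n-1) is borrowed from NumPy's convention, and the data values are illustrative, not taken from this guide.

```python
import math

def std_dev(values, ddof=0):
    """Standard deviation: ddof=0 divides by N (population), ddof=1 by n-1 (sample)."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more data points than ddof")
    mean = sum(values) / n
    # Sum of squared deviations from the mean, shared by both formulas.
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values: mean 5, squared deviations sum to 32
print(std_dev(data, ddof=0))  # population: sqrt(32 / 8) = 2.0
print(std_dev(data, ddof=1))  # sample: sqrt(32 / 7), slightly larger
```

The sample version is always a little larger than the population version for the same data, because it divides by a smaller number.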
3. Corrected Sample Standard Deviation
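When sampling without replacement and the sample is a sizable fraction of the population (a common rule of thumb is more than about 5%), a finite population correction (FPC) is applied. Strictly speaking, the FPC adjusts the standard error of an estimate such as the sample mean; it is shown here as a scaling factor on s.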
s_corrected = s * √[(N-n)/(N-1)]
Where:
- s = sample standard deviation
- N = population size
- n = sample size
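As a rough illustration of the correction factor, the sketch below computes √[(N-n)/(N-1)] and applies it to a sample standard deviation. The function name `fpc_factor` and the values of s, N, and n are hypothetical, chosen only to show the arithmetic.

```python
import math

def fpc_factor(population_size: int, sample_size: int) -> float:
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    if population_size < 2 or not 1 <= sample_size <= population_size:
        raise ValueError("require N >= 2 and 1 <= n <= N")
    return math.sqrt((population_size - sample_size) / (population_size - 1))

s = 2.07        # hypothetical sample standard deviation
N, n = 50, 20   # hypothetical population and sample sizes (sample is 40% of the population)
s_corrected = s * fpc_factor(N, n)
print(round(fpc_factor(N, n), 4), round(s_corrected, 4))
```

The factor approaches 1 when the sample is a tiny share of the population and reaches 0 when the sample is the whole population, since no sampling uncertainty remains.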
When to Use Each Type
| Type | Use When | Symbol |
|---|---|---|
| Population | You have data for all possible observations | σ (sigma) |
| Sample | You have data from only a subset of the population | s |
| Corrected Sample | Your sample is a large proportion of the population | s_corrected |
How to Calculate Standard Deviation
Let's explore several methods for calculating standard deviation, from basic step-by-step approaches to shortcut formulas.
Method 1: Step-by-Step Calculation (Definition Formula)
- Calculate the mean (average) of the data set
- Subtract the mean from each data point to find deviations
- Square each deviation
- Sum all squared deviations
- Divide by N (population) or n-1 (sample)
- Take the square root of the result
Example: find the sample standard deviation of the data set {4, 8, 6, 5, 3, 8}.
Step 1: Calculate the mean: (4+8+6+5+3+8)/6 = 34/6 ≈ 5.67
Step 2: Find deviations from the mean:
4 - 5.67 = -1.67
8 - 5.67 = 2.33
6 - 5.67 = 0.33
5 - 5.67 = -0.67
3 - 5.67 = -2.67
8 - 5.67 = 2.33
Step 3: Square each deviation:
(-1.67)² = 2.79
(2.33)² = 5.43
(0.33)² = 0.11
(-0.67)² = 0.45
(-2.67)² = 7.13
(2.33)² = 5.43
Step 4: Sum the squared deviations:
2.79 + 5.43 + 0.11 + 0.45 + 7.13 + 5.43 = 21.34
Step 5: Divide by (n-1) for a sample:
21.34 / 5 = 4.27
Step 6: Take the square root:
√4.27 ≈ 2.07
Therefore, the sample standard deviation is approximately 2.07.
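The six steps above map directly onto a few lines of code. This is a minimal sketch using the same six values; the variable names are illustrative.

```python
import math

data = [4, 8, 6, 5, 3, 8]  # the values used in the steps above

mean = sum(data) / len(data)              # Step 1: mean
deviations = [x - mean for x in data]     # Step 2: deviations from the mean
squared = [d ** 2 for d in deviations]    # Step 3: squared deviations
total = sum(squared)                      # Step 4: sum of squared deviations
variance = total / (len(data) - 1)        # Step 5: divide by n-1 (sample)
std_dev = math.sqrt(variance)             # Step 6: square root

print(f"mean={mean:.2f} sum_sq={total:.2f} variance={variance:.2f} sd={std_dev:.2f}")
# mean=5.67 sum_sq=21.33 variance=4.27 sd=2.07
```

The unrounded sum of squared deviations is 21.33 rather than the 21.34 obtained by hand, because the hand calculation rounds each intermediate value to two decimals. The final result is the same to two decimal places.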
Method 2: Variance Formula (Shortcut Method)
Standard deviation is the square root of variance. This method uses algebraic simplification.
Population Variance:
σ² = [ Σ(X²) / N ] - μ²
Sample Variance:
s² = [ Σ(x²) / (n-1) ] - [ (Σx)² / (n(n-1)) ]
Then take the square root to get the standard deviation.
Step 1: Calculate Σx (sum of all values) = 4+8+6+5+3+8 = 34
Step 2: Calculate Σx² (sum of squared values) = 4²+8²+6²+5²+3²+8² = 16+64+36+25+9+64 = 214
Step 3: Calculate (Σx)² = 34² = 1156
Step 4: Apply the formula for sample variance:
s² = [ 214 / 5 ] - [ 1156 / (6×5) ]
s² = 42.8 - 38.53
s² = 4.27
Step 5: Calculate standard deviation:
s = √4.27 ≈ 2.07
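The shortcut formula needs only two running totals, Σx and Σx², which the sketch below computes in a single function. The function name `sample_variance_shortcut` is illustrative.

```python
import math

def sample_variance_shortcut(values):
    """Sample variance via s^2 = sum(x^2)/(n-1) - (sum(x))^2 / (n(n-1))."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    sum_x = sum(values)                   # Σx
    sum_x2 = sum(x * x for x in values)   # Σx²
    return sum_x2 / (n - 1) - sum_x ** 2 / (n * (n - 1))

data = [4, 8, 6, 5, 3, 8]
s2 = sample_variance_shortcut(data)
print(round(s2, 2), round(math.sqrt(s2), 2))
# 4.27 2.07
```

One caveat: with floating-point data whose mean is large compared with its spread, subtracting two nearly equal totals can lose precision, so the definition formula from Method 1 is usually the safer choice in software.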
Method 3: Using Technology
a) Using Spreadsheets
- Excel: =STDEV.S(range) for sample; =STDEV.P(range) for population
- Google Sheets: =STDEV(range) for sample; =STDEVP(range) for population
b) Using Scientific Calculators
- Most scientific calculators have built-in functions for calculating standard deviation
- Typically labeled as "σn" or "σn-1" (or similar notation)
c) Using Statistical Software
- R: sd(data) (returns the sample standard deviation, using the n-1 denominator)
- Python: numpy.std(data, ddof=0) for population; numpy.std(data, ddof=1) for sample
- SPSS: Descriptive Statistics procedure
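If NumPy is not available, Python's standard library covers both cases through the `statistics` module. The short sketch below reuses the six values from the Method 1 example.

```python
import statistics

data = [4, 8, 6, 5, 3, 8]

sample_sd = statistics.stdev(data)        # sample standard deviation (n-1 denominator)
population_sd = statistics.pstdev(data)   # population standard deviation (N denominator)

print(round(sample_sd, 2), round(population_sd, 2))
# 2.07 1.89
```

It is worth checking which divisor a tool uses by default: `numpy.std` defaults to the population formula (ddof=0), while R's `sd` and `statistics.stdev` use the sample formula.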
Standard Deviation Examples
Example 1: Simple Dataset
Data set: {85, 90, 72, 95, 83}
Step 1: Calculate the mean: (85+90+72+95+83)/5 = 425/5 = 85
Step 2: Find deviations: (85-85)=0, (90-85)=5, (72-85)=-13, (95-85)=10, (83-85)=-2
Step 3: Square deviations: 0², 5², (-13)², 10², (-2)² = 0, 25, 169, 100, 4
Step 4: Sum squared deviations: 0+25+169+100+4 = 298
Step 5: Divide by (n-1): 298/4 = 74.5
Step 6: Take the square root: √74.5 ≈ 8.63
The sample standard deviation is approximately 8.63.
Example 2: Comparing Datasets
Dataset B: {2, 18, 5, 20, 14, 1}
The two datasets have similar means (11.67 for A and 10 for B), but their spread is very different.
Dataset A Standard Deviation:
Mean = 11.67
Sum of squared deviations: (10-11.67)² + (12-11.67)² + ... = 28.84
Variance = 28.84/5 = 5.77
Standard deviation = √5.77 ≈ 2.40
Dataset B Standard Deviation:
Mean = (2+18+5+20+14+1)/6 = 60/6 = 10
Sum of squared deviations: (2-10)² + (18-10)² + (5-10)² + (20-10)² + (14-10)² + (1-10)² = 64+64+25+100+16+81 = 350
Variance = 350/5 = 70
Standard deviation = √70 ≈ 8.37
Interpretation: Dataset B has a much higher standard deviation (8.37 vs. 2.40), indicating that its values are spread out more widely from the mean.
Example 3: Continuous Data
Heights (in cm): {168, 172, 165, 175, 178, 169, 173}
Step 1: Calculate the mean: (168+172+165+175+178+169+173)/7 = 1200/7 ≈ 171.43
Step 2: Find deviations: (168-171.43), (172-171.43), etc.
Step 3: Square deviations: (-3.43)², (0.57)², etc.
Step 4: Sum squared deviations ≈ 117.71
Step 5: Divide by (n-1): 117.71/6 ≈ 19.62
Step 6: Take the square root: √19.62 ≈ 4.43
Therefore, the sample standard deviation of the heights is approximately 4.43 cm.
Applications of Standard Deviation
1. Quality Control
2. Finance and Investment
3. Weather Forecasting
4. Educational Assessment
5. Biological Research
6. Z-Scores and Standardization
Z-scores represent how many standard deviations a data point is from the mean.
Z = (X - μ) / σ
Where:
- X = the data point
- μ = the mean
- σ = the standard deviation
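The z-score formula translates to a one-line function. In this sketch the function name `z_score` and the numbers (a value of 130 from a distribution with mean 100 and standard deviation 15) are illustrative assumptions, not data from this guide.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

print(z_score(130, 100, 15))  # 2.0: two standard deviations above the mean
print(z_score(85, 100, 15))   # -1.0: one standard deviation below the mean
```

Because z-scores are unitless, they allow values measured on different scales to be compared directly.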
Standard Deviation Quiz
1. What does a standard deviation of 0 indicate?
2. In a standard normal distribution, approximately what percentage of data falls within 1 standard deviation of the mean?
3. The standard deviation of a data set is 5. If every value in the data set is multiplied by 2, what happens to the standard deviation?
4. Which formula is used to calculate the sample standard deviation?
5. Calculate the standard deviation of the data set {2, 4, 6, 8, 10}.
6. If a data point has a z-score of 2, this means it is:
7. When should you use population standard deviation instead of sample standard deviation?
8. Which of the following statements is true about standard deviation?
9. Which of these data sets has the smallest standard deviation?
10. The main difference between population and sample standard deviation formulas is: