One-Variable Statistics - Grade 8

1. Mean, Median, Mode, and Range

Mean, median, and mode are measures of central tendency; range is a measure of spread. Together they help us understand and describe data.

Mean (Average):

\( \text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}} = \frac{\sum x}{n} \)

Add all values and divide by how many there are

Median (Middle Value):

  • Step 1: Arrange data in order from least to greatest
  • Step 2: Find the middle value:
    • Odd number of values: Middle value is the median
    • Even number of values: Average the two middle values

Mode (Most Frequent):

The value that appears most often in the data set

  • Can have one mode (unimodal)
  • Can have multiple modes (bimodal, multimodal)
  • Can have no mode if all values appear equally often

Range (Spread):

\( \text{Range} = \text{Maximum value} - \text{Minimum value} \)

Example:

Data set: 12, 15, 18, 15, 22, 30, 15

Mean: \( \frac{12 + 15 + 18 + 15 + 22 + 30 + 15}{7} = \frac{127}{7} \approx 18.14 \)

Median: Order: 12, 15, 15, 15, 18, 22, 30 → Middle value = 15

Mode: 15 (appears 3 times)

Range: 30 - 12 = 18
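The worked example above can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names are illustrative choices, not part of the original notes.

```python
from statistics import mean, median, multimode

data = [12, 15, 18, 15, 22, 30, 15]

data_mean = mean(data)              # sum of values divided by count
data_median = median(data)          # middle value after sorting
data_modes = multimode(data)        # list of all most-frequent values
data_range = max(data) - min(data)  # maximum minus minimum

print(round(data_mean, 2))  # 18.14
print(data_median)          # 15
print(data_modes)           # [15]
print(data_range)           # 18
```

Note that `multimode` returns every value tied for the highest frequency, so a data set where all values appear equally often yields the whole list rather than "no mode".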

2. Interpret Charts and Graphs

Common Chart Types:

Frequency Tables: Show how often each value appears

  • To find the mean, multiply each value by its frequency, then sum the products
  • Divide that sum by the total frequency

Dot Plots: Each dot represents one data point

  • Count dots at each value
  • Most dots = mode

Bar Graphs: Height of bars shows frequency

  • Read frequency from y-axis
  • Tallest bar = mode

Example: Frequency Table

| Score | Frequency |
| --- | --- |
| 5 | 3 |
| 7 | 5 |
| 9 | 2 |

Mean: \( \frac{(5 \times 3) + (7 \times 5) + (9 \times 2)}{3+5+2} = \frac{15+35+18}{10} = \frac{68}{10} = 6.8 \)

Mode: 7 (highest frequency of 5)
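As a rough sketch, the same frequency-table calculation can be written in Python. Storing the table as a dictionary named `freq` is an assumption made for illustration.

```python
# Frequency table from the example: score -> frequency
freq = {5: 3, 7: 5, 9: 2}

total_count = sum(freq.values())  # 3 + 5 + 2 = 10
weighted_sum = sum(score * count for score, count in freq.items())  # 15 + 35 + 18 = 68
table_mean = weighted_sum / total_count

# Mode: the score with the highest frequency
table_mode = max(freq, key=freq.get)

print(table_mean)  # 6.8
print(table_mode)  # 7
```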

3. Find the Missing Number

Strategy for Finding Missing Value:

When given the mean:

  1. Multiply mean by number of values to get total sum
  2. Subtract known values from total sum
  3. Result is the missing value

When given the median:

  1. Arrange known values in order
  2. Determine where missing value must be placed
  3. Use median position to find missing value

Example 1: Finding Missing Value (Mean)

Problem: The mean of 15, 20, x, 25 is 22. Find x.

Step 1: Total sum = Mean × Number of values = 22 × 4 = 88

Step 2: Sum of known values = 15 + 20 + 25 = 60

Step 3: x = 88 - 60 = 28

Answer: x = 28
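The three steps can be sketched as a small Python function. The name `missing_from_mean` is hypothetical, and the sketch assumes exactly one value is missing.

```python
def missing_from_mean(known_values, target_mean):
    """Return the one missing value so that all values average to target_mean."""
    n = len(known_values) + 1         # known values plus the one missing value
    total = target_mean * n           # Step 1: total sum required
    return total - sum(known_values)  # Steps 2-3: subtract the known sum

x = missing_from_mean([15, 20, 25], 22)
print(x)  # 28
```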

Example 2: Finding Missing Value (Median)

Problem: The median of 8, 12, x, 20, 24 is 15. Find x.

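Step 1: For 5 values, the median is the 3rd value when the data is in order

Step 2: Known values in order: 8, 12, 20, 24. If x ≤ 12, the 3rd value would be 12; if x ≥ 20, it would be 20. Neither equals 15, so x must lie between 12 and 20: 8, 12, x, 20, 24

Step 3: x is the middle (3rd) value and the median is 15, so x = 15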

Answer: x = 15
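One way to double-check this answer is to try candidate values and keep those that produce the required median. This is only a brute-force sketch: restricting the search to whole numbers from 0 to 30 is an assumption made for illustration.

```python
from statistics import median

known = [8, 12, 20, 24]
target = 15

# Try whole-number candidates and keep those that give the target median.
solutions = [x for x in range(0, 31) if median(known + [x]) == target]
print(solutions)  # [15]
```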

4. Changes in Mean, Median, Mode, and Range

Effects of Adding/Removing a Value:

| Change | Effect on Mean | Effect on Median | Effect on Mode | Effect on Range |
| --- | --- | --- | --- | --- |
| Add large value | Increases | May increase | May change | Increases |
| Add small value | Decreases | May decrease | May change | Increases |
| Remove outlier | Changes significantly | Little change | May change | Decreases |

Example:

Original data: 10, 12, 15, 15, 18

Mean = 14, Median = 15, Mode = 15, Range = 8

Add 30 to the data: 10, 12, 15, 15, 18, 30

Mean ≈ 16.67 (increased), Median = 15 (same), Mode = 15 (same), Range = 20 (increased)

Key Observations:

  • Mean is most affected by outliers
  • Median is resistant to outliers
  • Mode only changes if frequencies change
  • Range changes when min or max changes
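The before-and-after comparison in the example above can be reproduced with a short Python sketch. The helper `summarize` is a hypothetical name, and the mean is rounded to two places to match the notes.

```python
from statistics import mean, median, multimode

def summarize(values):
    """Return mean (rounded to 2 places), median, modes, and range."""
    return {
        "mean": round(mean(values), 2),
        "median": median(values),
        "modes": multimode(values),
        "range": max(values) - min(values),
    }

original = [10, 12, 15, 15, 18]
with_30 = original + [30]

before = summarize(original)  # mean 14, median 15, modes [15], range 8
after = summarize(with_30)    # mean 16.67, median 15, modes [15], range 20
print(before)
print(after)
```

Only the mean and the range move here; the median and mode stay at 15 because the new value lands at the far end of the ordered data.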

5. Mean Absolute Deviation (MAD)

Definition: The average distance of each data point from the mean. It measures how spread out the data is.

Formula:

\( \text{MAD} = \frac{\sum |x_i - \bar{x}|}{n} \)

where \( x_i \) = each data value, \( \bar{x} \) = mean, \( n \) = number of values

Steps to Calculate MAD:

  1. Find the mean of the data set
  2. Find the absolute deviation of each value from the mean: \( |x_i - \bar{x}| \)
  3. Add all the absolute deviations
  4. Divide by the number of values

Example:

Data: 4, 8, 6, 10, 12

Step 1: Mean = \( \frac{4+8+6+10+12}{5} = \frac{40}{5} = 8 \)

Step 2: Find absolute deviations:

| Value | Deviation from Mean | Absolute Deviation |
| --- | --- | --- |
| 4 | 4 - 8 = -4 | 4 |
| 8 | 8 - 8 = 0 | 0 |
| 6 | 6 - 8 = -2 | 2 |
| 10 | 10 - 8 = 2 | 2 |
| 12 | 12 - 8 = 4 | 4 |

Step 3: Sum of absolute deviations = 4 + 0 + 2 + 2 + 4 = 12

Step 4: MAD = \( \frac{12}{5} = 2.4 \)

Answer: MAD = 2.4 (on average, values are 2.4 units from the mean)
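The four steps translate fairly directly into Python. This is a minimal sketch; the function name `mean_absolute_deviation` is an illustrative choice.

```python
def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    avg = sum(values) / len(values)              # Step 1: mean
    deviations = [abs(v - avg) for v in values]  # Step 2: absolute deviations
    return sum(deviations) / len(values)         # Steps 3-4: sum, then divide

mad = mean_absolute_deviation([4, 8, 6, 10, 12])
print(mad)  # 2.4
```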

6. Quartiles and Interquartile Range (IQR)

Quartiles:

Quartiles divide ordered data into four equal parts.

  • Q₁ (First Quartile): Median of lower half (25th percentile)
  • Q₂ (Second Quartile): Median of entire data (50th percentile)
  • Q₃ (Third Quartile): Median of upper half (75th percentile)

Interquartile Range (IQR):

\( \text{IQR} = Q_3 - Q_1 \)

IQR measures the spread of the middle 50% of the data

Steps to Find Quartiles:

  1. Arrange data in order from least to greatest
  2. Find Q₂ (median of all data)
  3. Find Q₁ (median of the lower half; leave out Q₂ itself when the number of values is odd)
  4. Find Q₃ (median of the upper half; leave out Q₂ itself when the number of values is odd)
  5. Calculate IQR = Q₃ - Q₁

Example:

Data: 3, 7, 8, 12, 13, 14, 18, 21, 22

Step 1: Already in order (9 values)

Step 2: Q₂ (median) = 13 (5th value)

Step 3: Lower half: 3, 7, 8, 12 → Q₁ = \( \frac{7+8}{2} = 7.5 \)

Step 4: Upper half: 14, 18, 21, 22 → Q₃ = \( \frac{18+21}{2} = 19.5 \)

Step 5: IQR = 19.5 - 7.5 = 12

Five-Number Summary: Min = 3, Q₁ = 7.5, Q₂ = 13, Q₃ = 19.5, Max = 22
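The median-of-halves steps above can be sketched in Python as follows. The function name `quartiles` is hypothetical, and the sketch assumes at least two values. Other quartile conventions exist (software packages often interpolate), so results from other tools may differ slightly.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    When the count is odd, the overall median is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values below the median position
    upper = ordered[half + n % 2:]  # skips the middle value when n is odd
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([3, 7, 8, 12, 13, 14, 18, 21, 22])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 7.5 13 19.5 12.0
```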

7. Box Plots (Box-and-Whisker Plots)

Definition: A visual display of the five-number summary showing the distribution and spread of data.

Five-Number Summary:

  1. Minimum: Smallest value (left whisker)
  2. Q₁: First quartile (left edge of box)
  3. Median (Q₂): Middle value (line inside box)
  4. Q₃: Third quartile (right edge of box)
  5. Maximum: Largest value (right whisker)
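The five numbers a box plot displays can be computed with a short Python sketch. It assumes the median-of-halves quartile method used in these notes and at least two data values. The name `five_number_summary` is illustrative, and the data is reused from the quartile example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # lower half
    upper = ordered[n // 2 + n % 2 :]  # upper half; skips the middle value when n is odd
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

summary = five_number_summary([3, 7, 8, 12, 13, 14, 18, 21, 22])
print(summary)  # (3, 7.5, 13, 19.5, 22)
```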

Parts of a Box Plot:

  • Box: Contains middle 50% of data (from Q₁ to Q₃)
  • Whiskers: Lines extending from the box to the minimum and maximum (or to the most extreme non-outlier values when outliers are plotted separately)
  • Median line: Vertical line inside the box

Reading a Box Plot:

  • Width of box = IQR: Shows spread of middle 50%
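  • Length of whiskers: Shows the spread of the lowest 25% and highest 25% of the data; the distance between the whisker ends is the overall range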
  • Median position: Shows if data is skewed
  • Outliers: Marked as individual points beyond whiskers

Interpreting Box Plots:

| Feature | Meaning |
| --- | --- |
| Long box | Data is spread out in the middle |
| Short box | Data is clustered in the middle |
| Median near Q₁ | Right-skewed data |
| Median near Q₃ | Left-skewed data |
| Median in center | Symmetric data |

8. Identify an Outlier

Definition: An outlier is a data value that is much higher or much lower than most of the other values in a data set.

Methods to Identify Outliers:

Method 1: Visual Inspection

Look for values that are far from the rest of the data

Method 2: 1.5 × IQR Rule

Lower boundary: \( Q_1 - 1.5 \times \text{IQR} \)

Upper boundary: \( Q_3 + 1.5 \times \text{IQR} \)

Any value below lower boundary or above upper boundary is an outlier

Example:

Data: 12, 15, 18, 20, 22, 25, 28, 75

Find quartiles:

Q₁ = 16.5, Q₂ = 21, Q₃ = 26.5

Calculate IQR: IQR = 26.5 - 16.5 = 10

Calculate boundaries:

Lower: 16.5 - 1.5(10) = 16.5 - 15 = 1.5

Upper: 26.5 + 1.5(10) = 26.5 + 15 = 41.5

75 > 41.5, so 75 is an outlier
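The 1.5 × IQR rule can be sketched in Python as below. The function name `find_outliers` is hypothetical, and the quartiles are computed with the median-of-halves method from these notes, assuming at least two data values.

```python
from statistics import median

def find_outliers(values):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 x IQR rule."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])          # median of the lower half
    q3 = median(ordered[n // 2 + n % 2 :])  # median of the upper half
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < lower or v > upper]
    return lower, upper, outliers

lower, upper, outliers = find_outliers([12, 15, 18, 20, 22, 25, 28, 75])
print(lower, upper, outliers)  # 1.5 41.5 [75]
```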

9. Effect of Removing an Outlier

Impact on Statistics:

| Statistic | Effect of Removing Outlier | Sensitivity |
| --- | --- | --- |
| Mean | Changes significantly (moves toward center) | Very sensitive |
| Median | Little to no change | Not sensitive (resistant) |
| Mode | Usually no change (unless outlier is the mode) | Not sensitive |
| Range | Decreases significantly | Very sensitive |
| MAD | Decreases (data more clustered) | Sensitive |
| IQR | Little to no change | Not sensitive (resistant) |

Example:

With outlier: 5, 8, 10, 12, 15, 50

Mean ≈ 16.67, Median = 11, Mode = none, Range = 45

Without outlier: 5, 8, 10, 12, 15

Mean = 10, Median = 10, Mode = none, Range = 10

Observations:

  • Mean decreased by 6.67 (40% change)
  • Median decreased by only 1 (9% change)
  • Range decreased by 35 (78% change)

Conclusion: Median and IQR are better measures when outliers are present.
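The percentage changes in the observations above can be verified with a brief Python sketch. The helper `percent_change` and the dictionary `changes` are illustrative names, and results are rounded to whole percentages as in the notes.

```python
from statistics import mean, median

def percent_change(before, after):
    """Size of the change as a percentage of the original value."""
    return abs(after - before) / before * 100

with_outlier = [5, 8, 10, 12, 15, 50]
without_outlier = [5, 8, 10, 12, 15]

stats = [("mean", mean), ("median", median),
         ("range", lambda v: max(v) - min(v))]

changes = {}
for name, stat in stats:
    # Compare each statistic with and without the outlier
    changes[name] = round(percent_change(stat(with_outlier), stat(without_outlier)))

print(changes)  # {'mean': 40, 'median': 9, 'range': 78}
```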

Quick Reference: One-Variable Statistics

Key Formulas:

Mean: \( \bar{x} = \frac{\sum x}{n} \)

Range: \( \text{Max} - \text{Min} \)

MAD: \( \frac{\sum |x_i - \bar{x}|}{n} \)

IQR: \( Q_3 - Q_1 \)

Outlier boundaries: \( Q_1 - 1.5 \times \text{IQR} \) and \( Q_3 + 1.5 \times \text{IQR} \)

Five-Number Summary:

Minimum, Q₁, Median (Q₂), Q₃, Maximum

Resistant vs Sensitive Measures:

  • Resistant (little affected by outliers): Median, IQR
  • Sensitive (affected by outliers): Mean, Range, MAD

💡 Key Tips for One-Variable Statistics

  • Mean = average (add all, divide by count)
  • Median = middle value (order data first!)
  • Mode = most frequent value
  • Range = max - min (spread of all data)
  • Always order data before finding median and quartiles
  • MAD shows average distance from mean
  • IQR shows spread of middle 50% of data
  • Box plot shows five-number summary visually
  • Outlier = value more than 1.5 × IQR below Q₁ or above Q₃
  • Mean is sensitive to outliers; median is not
  • Use median for skewed data or data with outliers
  • Removing outliers: mean and range change most