Single-Variable Statistics - Ninth Grade Math
Introduction to Statistics
Statistics: The science of collecting, organizing, analyzing, and interpreting data
Single-Variable Data: Data involving one characteristic or measurement
Population: The entire group being studied
Sample: A subset of the population used to make inferences
Parameter: A numerical description of a population
Statistic: A numerical description of a sample
1. Identify Biased Samples
Biased Sample: A sample that does not fairly represent the population
Random Sample: Each member of the population has an equal chance of being selected
Representative Sample: Reflects the characteristics of the population
Sampling Bias: Systematic error in how a sample is collected
Types of Biased Samples
Common Types of Sampling Bias:
1. Convenience Sampling:
• Choosing samples that are easy to reach
• Example: Surveying only your friends about school lunch
• Problem: Not representative of all students
2. Voluntary Response Bias:
• Only people who choose to respond are included
• Example: Online polls where people opt in
• Problem: People with strong opinions more likely to respond
3. Undercoverage:
• Some groups in population are excluded
• Example: Phone survey excludes people without phones
• Problem: Missing perspectives from excluded groups
4. Nonresponse Bias:
• Selected individuals don't participate
• Example: Mail survey with low response rate
• Problem: Responders may differ from non-responders
5. Question Wording Bias:
• Questions are leading or confusing
• Example: "Don't you agree that...?"
• Problem: Influences responses
Example 1: Identify if sample is biased
Scenario: A principal wants to know students' opinions on school uniforms. She surveys students in the chess club.
Analysis:
• This is convenience sampling
• Chess club members may not represent all students
• Different clubs/groups may have different opinions
Conclusion: This is a BIASED sample
Better method: Randomly select students from all grades and activities
Example 2: Identify bias
Scenario: A store wants to know customer satisfaction. They ask every 10th customer who makes a purchase.
Analysis: Systematic sampling of customers who made a purchase
Potential bias: Only includes people who made purchases (who are more likely to be satisfied)
Missing: People who left without buying (possibly dissatisfied)
Conclusion: BIASED - undercoverage, because non-purchasers are excluded
Example 3: Unbiased sample
Scenario: A researcher assigns a number to each student in the school and uses a random number generator to select 50 students for a survey.
Analysis:
• Each student has equal chance of selection
• Random selection process
• No systematic exclusion
Conclusion: This is an UNBIASED sample
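The selection in this scenario can be sketched in Python. This is a minimal illustration, not the researcher's actual procedure: the school size of 400, the function name simple_random_sample, and the seed are all assumptions made for the demo.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Pick k distinct members so that each one has an equal chance."""
    rng = random.Random(seed)  # the seed only makes the demo repeatable
    return rng.sample(population, k)

# Hypothetical school: 400 students numbered 1 to 400 (assumed size).
student_ids = list(range(1, 401))
chosen = simple_random_sample(student_ids, 50, seed=1)

print(len(chosen))       # 50 students selected
print(len(set(chosen)))  # 50, so no student was picked twice
```

Because random.sample draws without replacement, no student can appear twice in the survey list.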
2. Mean, Median, Mode, and Range
Measures of Center: Values that describe the center or typical value of data
Measures of Spread: Values that describe how data is distributed
Central Tendency: The tendency of data to cluster around a central value
Mean (Average)
Mean Formula:
$$\text{Mean} = \bar{x} = \frac{\sum x}{n}$$
where:
• $\sum x$ = sum of all data values
• $n$ = number of data values
• $\bar{x}$ (x-bar) = mean
In words: Add all values, divide by how many values
Example 1: Find the mean of 5, 8, 12, 15, 20
$$\text{Mean} = \frac{5 + 8 + 12 + 15 + 20}{5} = \frac{60}{5} = 12$$
Answer: Mean = 12
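The same calculation can be checked with a few lines of Python. The function name mean is just an illustrative choice.

```python
def mean(values):
    """Add all values, then divide by how many values there are."""
    return sum(values) / len(values)

print(mean([5, 8, 12, 15, 20]))  # 12.0
```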
Median (Middle Value)
Median Steps:
Step 1: Order data from least to greatest
Step 2: Find middle value
If odd number of values:
Median is the middle value
If even number of values:
$$\text{Median} = \frac{\text{Sum of the two middle values}}{2}$$
Example 2: Find median of 3, 7, 9, 15, 20
Already ordered: 3, 7, 9, 15, 20
Middle value: 9
Answer: Median = 9
Example 3: Find median of 4, 8, 10, 12, 16, 20
Already ordered: 4, 8, 10, 12, 16, 20
Two middle values: 10 and 12
$$\text{Median} = \frac{10 + 12}{2} = \frac{22}{2} = 11$$
Answer: Median = 11
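The odd and even cases from the median steps can be sketched in Python as follows. The function name median is an illustrative choice, and the input is assumed to be a non-empty list of numbers.

```python
def median(values):
    """Middle value of the ordered data (average of two middles if even)."""
    ordered = sorted(values)          # Step 1: order least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 7, 9, 15, 20]))       # 9
print(median([4, 8, 10, 12, 16, 20]))  # 11.0
```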
Mode (Most Frequent)
Mode Definition:
The value that appears most frequently in the dataset
Special Cases:
• No mode: All values appear equally often (for example, each value appears once)
• Bimodal: Two values appear most frequently
• Multimodal: More than two values tied for most frequent
Example 4: Find mode of 2, 3, 3, 5, 7, 7, 7, 9
Frequency:
2: once, 3: twice, 5: once, 7: three times, 9: once
Answer: Mode = 7
Example 5: Find mode of 1, 2, 3, 4, 5
All values appear once
Answer: No mode
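Counting frequencies by hand gets tedious for longer lists, so here is a small Python sketch that does the counting. The function name modes is an illustrative choice. It returns a list so that bimodal data can report both values, and an empty list stands for "no mode" in the simple case where every value appears once. The third data set is made up to show a bimodal result.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value; an empty list means no mode."""
    counts = Counter(values)          # value -> how many times it appears
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value appears once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([2, 3, 3, 5, 7, 7, 7, 9]))  # [7]
print(modes([1, 2, 3, 4, 5]))           # []
print(modes([1, 1, 2, 2, 3]))           # [1, 2]
```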
Range (Spread)
Range Formula:
$$\text{Range} = \text{Maximum} - \text{Minimum}$$
Interpretation: Shows how spread out the data is
Example 6: Find range of 12, 18, 25, 30, 42
$$\text{Range} = 42 - 12 = 30$$
Answer: Range = 30
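In Python the range formula is a one-line function. The name data_range is an illustrative choice, used because range is already a Python built-in.

```python
def data_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)

print(data_range([12, 18, 25, 30, 42]))  # 30
```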
3. Calculate Quartiles and Interquartile Range
Quartiles: Values that divide ordered data into four equal parts
Q1 (First Quartile): 25th percentile - median of lower half
Q2 (Second Quartile): 50th percentile - median of entire dataset
Q3 (Third Quartile): 75th percentile - median of upper half
IQR: Interquartile Range - range of middle 50% of data
Quartile Formulas:
Step 1: Order data from least to greatest
Step 2: Find median (Q2)
Step 3: Find median of lower half (Q1)
Step 4: Find median of upper half (Q3)
Note: If the number of values is odd, leave the median itself out of both halves
Interquartile Range:
$$\text{IQR} = Q3 - Q1$$
Five-Number Summary:
Minimum, Q1, Median (Q2), Q3, Maximum
Example 1: Find quartiles for: 2, 5, 7, 9, 11, 13, 15, 18, 20
Step 1: Already ordered
n = 9 values
Step 2: Find Q2 (median)
2, 5, 7, 9, 11, 13, 15, 18, 20
Q2 = 11
Step 3: Find Q1 (median of lower half)
Lower half: 2, 5, 7, 9
$Q1 = \frac{5+7}{2} = 6$
Step 4: Find Q3 (median of upper half)
Upper half: 13, 15, 18, 20
$Q3 = \frac{15+18}{2} = 16.5$
Step 5: Calculate IQR
$$\text{IQR} = 16.5 - 6 = 10.5$$
Answer: Q1 = 6, Q2 = 11, Q3 = 16.5, IQR = 10.5
Example 2: Find five-number summary for: 3, 6, 8, 10, 12, 15, 18, 22
Minimum: 3
Q1: Median of (3, 6, 8, 10) = $\frac{6+8}{2} = 7$
Q2 (Median): $\frac{10+12}{2} = 11$
Q3: Median of (12, 15, 18, 22) = $\frac{15+18}{2} = 16.5$
Maximum: 22
IQR: $16.5 - 7 = 9.5$
Answer: Min = 3, Q1 = 7, Med = 11, Q3 = 16.5, Max = 22, IQR = 9.5
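The five-number summary can be sketched in Python using the "median of each half" method from the steps above. The function names are illustrative. The sketch assumes that when the number of values is odd, the middle value is left out of both halves. Software that interpolates between values may report slightly different quartiles for the same data.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]       # values before the middle position
    upper = ordered[n - half:]   # values after the middle position
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }

summary = five_number_summary([3, 6, 8, 10, 12, 15, 18, 22])
iqr = summary["q3"] - summary["q1"]
print(summary["q1"], summary["median"], summary["q3"])  # 7.0 11.0 16.5
print(iqr)                                              # 9.5
```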
4-5. Identify Outliers and Their Effects
Outlier: A data value significantly different from other values
Effect: Can greatly affect mean, but usually changes the median only slightly or not at all
Why identify: May indicate errors, special cases, or important information
Method 1: Using IQR (Most Common)
IQR Method for Outliers:
Step 1: Calculate Q1, Q3, and IQR
Step 2: Calculate boundaries
$$\text{Lower Boundary} = Q1 - 1.5 \times \text{IQR}$$
$$\text{Upper Boundary} = Q3 + 1.5 \times \text{IQR}$$
Step 3: Any value outside boundaries is an outlier
• Value < Lower Boundary → Low outlier
• Value > Upper Boundary → High outlier
Example 1: Identify outliers in: 5, 8, 10, 12, 15, 18, 20, 45
Find Q1 and Q3:
Q1 = 9 (median of 5, 8, 10, 12)
Q3 = 19 (median of 15, 18, 20, 45)
Calculate IQR:
$\text{IQR} = 19 - 9 = 10$
Calculate boundaries:
Lower: $9 - 1.5(10) = 9 - 15 = -6$
Upper: $19 + 1.5(10) = 19 + 15 = 34$
Check data:
All values except 45 are between -6 and 34
45 > 34
Answer: 45 is an outlier
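The IQR method can be sketched in Python like this. The function names are illustrative, and the quartiles are found with the same median-of-halves approach used in this example (the middle value is skipped when the count is odd).

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def iqr_outliers(values):
    """Return (lower boundary, upper boundary, list of outliers)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])        # median of lower half
    q3 = median(ordered[n - half:])    # median of upper half
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower or x > upper]
    return lower, upper, outliers

print(iqr_outliers([5, 8, 10, 12, 15, 18, 20, 45]))  # (-6.0, 34.0, [45])
```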
Effects of Removing Outliers
How Outliers Affect Statistics:
Mean: GREATLY affected
• High outlier increases mean
• Low outlier decreases mean
Median: SLIGHTLY or NOT affected
• Position of middle value usually stays similar
Mode: Usually NOT affected
• Outliers typically appear only once
Range: GREATLY affected
• Outliers are often min or max values
Standard Deviation: GREATLY affected
• Measures spread from mean
Example 2: Describe effect of removing outlier
Original data: 10, 12, 13, 14, 15, 15, 16, 50
With outlier (50):
Mean: $\frac{10+12+13+14+15+15+16+50}{8} = \frac{145}{8} = 18.125$
Median: $\frac{14+15}{2} = 14.5$
Range: $50 - 10 = 40$
Without outlier:
Mean: $\frac{10+12+13+14+15+15+16}{7} = \frac{95}{7} \approx 13.57$
Median: $14$ (middle value)
Range: $16 - 10 = 6$
Analysis:
• Mean decreased from 18.125 to 13.57 (significant change)
• Median changed slightly from 14.5 to 14
• Range decreased dramatically from 40 to 6
Conclusion: Removing the outlier makes the mean and range more representative of the typical values, while the median barely changes
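A quick Python comparison shows the same before-and-after effect. The helper name summarize is an illustrative choice, and the mean is rounded to three decimal places only to keep the output short.

```python
from statistics import median

def summarize(values):
    """Mean, median, and range of a list of numbers."""
    return {
        "mean": round(sum(values) / len(values), 3),
        "median": median(values),
        "range": max(values) - min(values),
    }

data = [10, 12, 13, 14, 15, 15, 16, 50]
without_outlier = [x for x in data if x != 50]

print(summarize(data))             # {'mean': 18.125, 'median': 14.5, 'range': 40}
print(summarize(without_outlier))  # {'mean': 13.571, 'median': 14, 'range': 6}
```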
6. Variance and Standard Deviation
Variance: Average of squared deviations from the mean
Standard Deviation: Square root of variance - measures typical distance from mean
Symbol for variance: $\sigma^2$ (population) or $s^2$ (sample)
Symbol for standard deviation: $\sigma$ (population) or $s$ (sample)
Population vs Sample
Key Difference:
Population: Entire group
• Divide by $n$
• Use $\sigma$ (sigma)
Sample: Part of group
• Divide by $n - 1$ (Bessel's correction)
• Use $s$
In this course, we typically use population formulas
Variance
Population Variance Formula:
$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$
where:
• $x$ = each data value
• $\bar{x}$ = mean
• $n$ = number of values
• $(x - \bar{x})$ = deviation from mean
Steps:
1. Find the mean
2. Find each deviation: $(x - \bar{x})$
3. Square each deviation: $(x - \bar{x})^2$
4. Find average of squared deviations
Example 1: Find variance of 2, 4, 6, 8, 10
Step 1: Find mean
$\bar{x} = \frac{2+4+6+8+10}{5} = \frac{30}{5} = 6$
Step 2-3: Find deviations and square them
x | $(x - \bar{x})$ | $(x - \bar{x})^2$ |
---|---|---|
2 | 2 - 6 = -4 | 16 |
4 | 4 - 6 = -2 | 4 |
6 | 6 - 6 = 0 | 0 |
8 | 8 - 6 = 2 | 4 |
10 | 10 - 6 = 4 | 16 |
Step 4: Calculate variance
$$\sigma^2 = \frac{16+4+0+4+16}{5} = \frac{40}{5} = 8$$
Answer: Variance = 8
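The four variance steps map directly onto a short Python function. This is a sketch of the population formula (divide by n). The function name population_variance is an illustrative choice.

```python
def population_variance(values):
    n = len(values)
    mean = sum(values) / n                    # Step 1: find the mean
    deviations = [x - mean for x in values]   # Step 2: each deviation
    squared = [d ** 2 for d in deviations]    # Step 3: square each one
    return sum(squared) / n                   # Step 4: average the squares

print(population_variance([2, 4, 6, 8, 10]))  # 8.0
```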
Standard Deviation
Standard Deviation Formula:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
In words: Square root of variance
Interpretation:
• Small standard deviation: data clustered near mean
• Large standard deviation: data spread out from mean
• Units are same as original data (unlike variance)
Example 2: Find standard deviation using variance from Example 1
Variance: $\sigma^2 = 8$
Standard deviation:
$$\sigma = \sqrt{8} = 2\sqrt{2} \approx 2.83$$
Answer: Standard deviation ≈ 2.83
Interpretation: Values typically vary about 2.83 units from mean of 6
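Python's built-in statistics module has both the population and sample versions, which makes it a handy way to check hand calculations. The sketch below reuses the data from the variance example. The sample value is shown only for comparison, since these notes use the population formulas.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]

# Population standard deviation: square root of the variance (divide by n)
sigma = math.sqrt(statistics.pvariance(data))

# Sample standard deviation: divides by n - 1 instead of n
s = statistics.stdev(data)

print(round(sigma, 2))  # 2.83
print(round(s, 2))      # 3.16
```

The sample value is slightly larger because dividing by n - 1 gives a bigger result than dividing by n.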
Using Standard Deviation to Find Outliers
Standard Deviation Method:
An outlier is any value more than 3 standard deviations from the mean
$$\text{Lower Boundary} = \bar{x} - 3\sigma$$
$$\text{Upper Boundary} = \bar{x} + 3\sigma$$
Values outside this range are outliers
Example 3: A dataset has mean = 50 and standard deviation = 5. Is 72 an outlier?
Calculate boundaries:
Lower: $50 - 3(5) = 50 - 15 = 35$
Upper: $50 + 3(5) = 50 + 15 = 65$
Check 72:
72 > 65 (upper boundary)
Answer: Yes, 72 is an outlier
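The boundary check can be written as a small Python function. The name is_outlier_3sigma is an illustrative choice, and the second call uses a made-up value (60) to show a non-outlier.

```python
def is_outlier_3sigma(value, mean, sd):
    """True if value is more than 3 standard deviations from the mean."""
    lower = mean - 3 * sd
    upper = mean + 3 * sd
    return value < lower or value > upper

print(is_outlier_3sigma(72, 50, 5))  # True  (72 > 65)
print(is_outlier_3sigma(60, 50, 5))  # False (60 is between 35 and 65)
```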
7. Choose Appropriate Measures of Center and Variation
Choosing Wisely: Different situations call for different measures
Key Question: Are there outliers or is data skewed?
Decision Guide:
Use MEAN and STANDARD DEVIATION when:
• Data is symmetric (no outliers)
• Normal distribution (bell-shaped)
• Want to use all data values
• Doing further calculations
Use MEDIAN and IQR when:
• Data has outliers
• Data is skewed (not symmetric)
• Want measure resistant to extreme values
• Dealing with ordinal data (rankings)
Use MODE when:
• Data is categorical
• Want most common value
• Multiple values tied for highest frequency
Example 1: Choose appropriate measures
Scenario: Home prices in a neighborhood: $150K, $160K, $170K, $180K, $190K, $2M
Analysis:
• $2M is an outlier (much higher than others)
• Mean would be heavily influenced by $2M
• Median better represents typical home
Mean: $\frac{2,850,000}{6} = \$475,000$ (misleading!)
Median: $\frac{170,000 + 180,000}{2} = \$175,000$ (more typical)
Best choice: Median and IQR
Reason: Outlier present, better represents typical home
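The gap between the two measures is easy to confirm in Python. The variable names are illustrative, and the prices are written out in full dollars.

```python
from statistics import median

prices = [150_000, 160_000, 170_000, 180_000, 190_000, 2_000_000]

mean_price = sum(prices) / len(prices)
median_price = median(prices)

print(mean_price)    # 475000.0, pulled up by the 2,000,000 home
print(median_price)  # 175000.0, close to what most homes cost
```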
Example 2: Choose measures
Scenario: Test scores: 72, 75, 78, 80, 82, 85, 88, 90
Analysis:
• No outliers
• Fairly symmetric distribution
• All values close together
Best choice: Mean and Standard Deviation
Reason: Symmetric data, no outliers, uses all information
Example 3: Favorite colors survey
Data: Red (5), Blue (12), Green (3), Yellow (2)
Analysis:
• Categorical data (not numerical)
• Can't calculate mean or median
Best choice: Mode
Answer: Blue is most popular (mode)
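For categorical data like this, the mode can be found in Python by counting labels. This sketch uses collections.Counter with the survey counts from this example. The variable names are illustrative.

```python
from collections import Counter

# Survey results: color -> number of votes
votes = Counter({"Red": 5, "Blue": 12, "Green": 3, "Yellow": 2})

# most_common(1) returns a list holding the single top (label, count) pair
color, count = votes.most_common(1)[0]
print(color, count)  # Blue 12
```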
Measures of Center Comparison
Measure | Formula/Method | Best Used When | Affected by Outliers? |
---|---|---|---|
Mean | $\bar{x} = \frac{\sum x}{n}$ | Symmetric data, no outliers | YES - heavily affected |
Median | Middle value when ordered | Skewed data, outliers present | NO - resistant to outliers |
Mode | Most frequent value | Categorical data | NO - not affected |
Measures of Spread Comparison
Measure | Formula | What It Shows | Affected by Outliers? |
---|---|---|---|
Range | Max - Min | Total spread | YES - very sensitive |
IQR | Q3 - Q1 | Spread of middle 50% | NO - resistant |
Variance | $\sigma^2 = \frac{\sum (x-\bar{x})^2}{n}$ | Average squared deviation | YES - very sensitive |
Standard Deviation | $\sigma = \sqrt{\sigma^2}$ | Typical distance from mean | YES - very sensitive |
Outlier Detection Methods
Method | Formula | When to Use |
---|---|---|
IQR Method (Most Common) | Lower: $Q1 - 1.5 \times IQR$ Upper: $Q3 + 1.5 \times IQR$ | General purpose, box plots |
Standard Deviation Method | Lower: $\bar{x} - 3\sigma$ Upper: $\bar{x} + 3\sigma$ | Normal distributions |
Types of Sampling Bias
Type | Description | Example | Problem |
---|---|---|---|
Convenience | Easy to reach samples | Survey friends only | Not representative |
Voluntary Response | Self-selected participants | Online poll | Strong opinions overrepresented |
Undercoverage | Excludes part of population | Phone survey only | Missing perspectives |
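Nonresponse | Selected don't respond | Low response rate | Responders may differ |
Question Wording | Leading or confusing questions | "Don't you agree that...?" | Influences responses |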
Success Tips for Single-Variable Statistics:
✓ Mean uses all values; median uses position
✓ Always order data before finding median or quartiles
✓ IQR measures spread of middle 50% - resistant to outliers
✓ Use IQR method (1.5 × IQR) to identify outliers
✓ Outliers greatly affect mean, range, and standard deviation
✓ Outliers barely affect median and IQR
✓ Variance is in squared units; standard deviation is in original units
✓ Choose median & IQR when outliers present
✓ Choose mean & standard deviation for symmetric data
✓ Random sampling reduces sampling bias - every member has an equal chance of being selected!