Two-Variable Statistics - Grade 8

1. Line Graphs

Definition: A line graph displays data as points connected by lines, showing how one variable changes in relation to another (usually over time).

Key Components:

  • X-axis (horizontal): Independent variable (usually time)
  • Y-axis (vertical): Dependent variable (what's being measured)
  • Data points: Individual values plotted as dots
  • Connecting lines: Show trend or change over time

Interpreting Line Graphs:

  • Rising line: Value is increasing
  • Falling line: Value is decreasing
  • Horizontal line: Value stays constant
  • Steep slope: Rapid change
  • Gentle slope: Gradual change
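
As a rough illustration of the interpretation rules above, the sketch below labels each segment of a line graph by the sign of its slope. The function name `describe_segments` and the temperature data are made up for this example.

```python
def describe_segments(points):
    """Label each segment between consecutive (x, y) points of a line graph."""
    labels = []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        slope = (y2 - y1) / (x2 - x1)  # change in y per unit of x
        if slope > 0:
            labels.append(("rising", slope))
        elif slope < 0:
            labels.append(("falling", slope))
        else:
            labels.append(("constant", slope))
    return labels


# Hypothetical data: (day, temperature in °F)
temps = [(1, 60), (2, 70), (3, 70), (4, 65)]
segments = describe_segments(temps)
for label, slope in segments:
    print(label, slope)
```

A larger slope in absolute value corresponds to a steeper segment, so the first segment here (10°F per day) changes faster than the last one (5°F per day).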

Creating Line Graphs:

  1. Draw and label axes with appropriate scales
  2. Plot each data point at the correct coordinates
  3. Connect the points with straight lines
  4. Add a title describing what the graph shows

2. Scatter Plots

Definition: A scatter plot shows the relationship between two numerical variables using dots plotted on a coordinate plane. Each dot represents one data pair (x, y).

Purpose:

  • Show relationships between two variables
  • Identify patterns and trends
  • Detect outliers
  • Make predictions

Interpreting Scatter Plots:

Look at the pattern of points to determine:

  • Direction: Positive, negative, or no correlation
  • Form: Linear or nonlinear pattern
  • Strength: How closely points cluster around a line
  • Outliers: Points far from the general pattern

Creating Scatter Plots:

  1. Set up coordinate axes with appropriate scales
  2. Label axes with variable names
  3. Plot each data pair as a point (x, y)
  4. Do NOT connect the points (unlike line graphs)
  5. Add a descriptive title

3. Identify Trends with Scatter Plots (Correlation)

Types of Correlation:

Positive Correlation (Positive Association)

  • As x increases, y increases
  • Points trend upward from left to right
  • Example: Hours studied vs. test score

Negative Correlation (Negative Association)

  • As x increases, y decreases
  • Points trend downward from left to right
  • Example: Age of car vs. value

No Correlation (No Association)

  • No clear pattern
  • Points are randomly scattered
  • Variables show no apparent relationship in the data
  • Example: Shoe size vs. test score

Strength of Correlation:

| Strength | Description |
| --- | --- |
| Strong | Points cluster tightly around a line |
| Moderate | Points show a pattern but with some scatter |
| Weak | Points are loosely scattered with little pattern |
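
One way to put a number on direction and strength is the correlation coefficient r, which these notes do not cover; the sketch below is offered only as an optional illustration. The cutoffs 0.8 and 0.3 are assumptions chosen for this example, not standard definitions, and both data sets are invented.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation coefficient r for paired data (between -1 and 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.8, weak=0.3):
    """Turn r into (direction, strength). The cutoffs are assumed, not standard."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= weak:
        strength = "moderate"
    else:
        strength = "weak"
    return direction, strength


# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [62, 68, 71, 79, 85]
r_study = correlation(hours, scores)

# Hypothetical data: age of car (years) vs. value (thousands of dollars)
ages = [1, 2, 3, 4, 5]
values = [20, 18, 15, 13, 10]
r_car = correlation(ages, values)

print(describe(r_study))
print(describe(r_car))
```

The closer r is to 1 or -1, the more tightly the points cluster around a line; values near 0 match the "weak" row of the table.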

4. Make Predictions with Scatter Plots

Using Trends to Predict: If a clear pattern exists, you can predict unknown values by following the trend.

Types of Predictions:

Interpolation: Predicting a value WITHIN the range of data

  • More reliable because it's within observed data
  • Example: If data ranges from x=0 to x=10, predict at x=5

Extrapolation: Predicting a value OUTSIDE the range of data

  • Less reliable because you assume the trend continues
  • Example: If data ranges from x=0 to x=10, predict at x=15
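
The distinction can be sketched as a simple range check. The function name `prediction_type` is hypothetical; the data range matches the examples above (x = 0 to x = 10).

```python
def prediction_type(x, observed_xs):
    """Classify a prediction at x as interpolation or extrapolation."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if lowest <= x <= highest:
        return "interpolation"  # inside the observed range
    return "extrapolation"      # outside the observed range


observed = list(range(0, 11))  # x-values from 0 to 10

print(prediction_type(5, observed))   # interpolation
print(prediction_type(15, observed))  # extrapolation
```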

Steps to Make Predictions:

  1. Identify the pattern/trend in the scatter plot
  2. Draw or imagine a line of best fit
  3. Locate the given x-value on the axis
  4. Follow up or down to the line of best fit
  5. Read the corresponding y-value

Example:

A scatter plot shows hours studied (x) vs. test score (y) with a positive correlation. If a student studies 4 hours and the trend suggests scores increase by 5 points per hour of study starting from a base of 60, predict the score:

Prediction: 60 + (4 × 5) = 80 points
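
The same calculation can be written as a small function. This is a minimal sketch that assumes the trend in the example (a base of 60 points plus 5 points per hour); `predict_score` is a name invented here.

```python
def predict_score(hours, base=60, points_per_hour=5):
    """Predicted test score from the trend described in the example."""
    return base + points_per_hour * hours


print(predict_score(4))  # 80
```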

5. Outliers in Scatter Plots

Definition: An outlier is a data point that does not fit the general pattern of the scatter plot. It lies far away from most other points.

Identifying Outliers:

  • Look for points that are far from the cluster of other points
  • Points that don't follow the trend/pattern
  • Usually isolated from the main group

Causes of Outliers:

  • Measurement error: Data recorded incorrectly
  • Data entry error: Typo or mistake in recording
  • Genuine unusual case: Real exception to the pattern
  • Different subgroup: Belongs to a different category

Effect of Outliers:

| Impact | Description |
| --- | --- |
| Correlation strength | Can weaken (or, less often, exaggerate) the apparent correlation |
| Line of best fit | Can pull the line away from most data |
| Predictions | Can make predictions less accurate |

Example:

In a scatter plot of age vs. salary, most points show increasing salary with age. One point shows a 25-year-old earning $500,000 while others at that age earn $30,000-$50,000. This is an outlier (possibly a professional athlete or CEO).
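
As a rough sketch of how such a point might be flagged, the code below compares each salary in one age group to the group's median. The rule used here (more than 3 times the median, or less than one third of it) is an assumption made for illustration, not a standard test, and the salary list is hypothetical apart from the values taken from the example.

```python
from statistics import median


def flag_outliers(values, factor=3):
    """Return values far from the median, using an assumed cutoff factor."""
    middle = median(values)
    return [v for v in values if v > factor * middle or v < middle / factor]


# Hypothetical salaries of 25-year-olds, in dollars
salaries_at_25 = [30_000, 35_000, 42_000, 50_000, 500_000]

print(flag_outliers(salaries_at_25))  # [500000]
```

A flagged point still needs a human check: it may be an error, or a genuine unusual case like the athlete or CEO mentioned above.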

6. Line of Best Fit (Trend Line)

Definition: A straight line drawn through the center of a scatter plot that best represents the relationship between the variables. It keeps the overall vertical distance between the line and the data points as small as possible.

Identifying a Good Line of Best Fit:

  • Passes through or near most of the data points
  • About equal number of points above and below the line
  • Points above and below are spread evenly along the whole length of the line
  • Follows the general direction/trend of the data
  • Minimizes the vertical distances from points to the line

Drawing a Line of Best Fit:

  1. Look at the overall pattern of points
  2. Ignore outliers when drawing the line
  3. Use a ruler to draw a straight line
  4. Balance points above and below the line
  5. Extend the line across the graph

Properties:

  • Also called "trend line" or "regression line"
  • Can be used for predictions (interpolation and extrapolation)
  • Represents the general trend, not every individual point
  • Should only be used when there's a linear pattern
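
Software usually finds this line with the least-squares method, which is one common way to make the vertical distances small; it is not necessarily the line you would draw by eye. The sketch below assumes that method, and the data points are invented.

```python
def best_fit(xs, ys):
    """Least-squares slope m and y-intercept b for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, b


# Hypothetical data that roughly follows a line
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 6, 9, 11]

m, b = best_fit(xs, ys)
print(f"y = {m:.1f}x + {b:.1f}")

# Vertical distances from each point to the line (positive = point is above)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print([round(r, 1) for r in residuals])
```

The residuals add up to roughly zero, which matches the idea of balancing points above and below the line.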

7. Write Equations for Lines of Best Fit

Goal: Find the equation of the line in slope-intercept form: \( y = mx + b \)

Formula:

\( y = mx + b \)

where \( m \) = slope (rate of change), \( b \) = y-intercept (starting value)

Steps to Write the Equation:

  1. Draw the line of best fit through the scatter plot
  2. Find the slope (m):
    • Choose two points ON the line (preferably at grid intersections)
    • Use the formula: \( m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\text{rise}}{\text{run}} \)
  3. Find the y-intercept (b):
    • Identify where the line crosses the y-axis
    • OR substitute a point and slope into \( y = mx + b \) and solve for b
  4. Write the equation: \( y = mx + b \)

Example:

A line of best fit passes through points (2, 5) and (6, 13). Find the equation.

Step 1: Find slope:

\( m = \frac{13 - 5}{6 - 2} = \frac{8}{4} = 2 \)

Step 2: Find y-intercept using point (2, 5):

\( 5 = 2(2) + b \) → \( 5 = 4 + b \) → \( b = 1 \)

Step 3: Write equation:

\( y = 2x + 1 \)
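
The steps in this example can be checked with a short function. `line_through` is a hypothetical helper name; it assumes the two points have different x-values.

```python
def line_through(p1, p2):
    """Slope and y-intercept of the line through two points (x1 != x2)."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)  # rise over run
    b = y1 - m * x1            # solve y1 = m*x1 + b for b
    return m, b


m, b = line_through((2, 5), (6, 13))
print(m, b)  # 2.0 1.0
```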

8. Interpret Lines of Best Fit: Word Problems

Interpreting the Equation \( y = mx + b \):

Slope (m): Rate of change; how much y changes per unit of x

  • Units: (y units) per (x units)
  • Positive slope: y increases as x increases
  • Negative slope: y decreases as x increases

Y-intercept (b): Starting value; value of y when x = 0

  • Initial amount or base value
  • Where the trend begins

Example 1:

Equation: \( C = 3h + 20 \) where C is total cost ($) and h is hours of work

Slope = 3: The cost increases by $3 per hour

Y-intercept = 20: There's a $20 initial fee (starting cost)

Predict cost for 10 hours: C = 3(10) + 20 = $50

Example 2:

Equation: \( T = -5d + 70 \) where T is temperature (°F) and d is depth (feet) underground

Slope = -5: Temperature decreases by 5°F per foot of depth

Y-intercept = 70: Surface temperature is 70°F

Predict temperature at 8 feet: T = -5(8) + 70 = 30°F
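
Both examples can be evaluated in code. This sketch simply encodes the two equations given above; the function names are chosen for this illustration.

```python
def cost(hours):
    """Total cost in dollars: $3 per hour plus a $20 initial fee."""
    return 3 * hours + 20


def temperature(depth):
    """Temperature in °F: 70°F at the surface, dropping 5°F per foot."""
    return -5 * depth + 70


print(cost(10))        # 50
print(temperature(8))  # 30

# The y-intercept is the value at x = 0
print(cost(0), temperature(0))  # 20 70
```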

9. Identify Representative, Random, and Biased Samples

Sampling: Selecting a subset of a population to make inferences about the entire population.

Population vs. Sample:

  • Population: The entire group you want information about
  • Sample: A subset of the population used to represent the whole

Types of Samples:

1. Random Sample

  • Definition: Every member of the population has an equal chance of being selected
  • Goal: Reduce bias and get fair representation
  • Example: Drawing names from a hat containing all students' names
  • Example: Using a random number generator to select participants
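
A random number generator can do the same job as names in a hat. The sketch below uses Python's standard `random` module; the roster of 500 students and the seed value are assumptions made so the example can be rerun with the same result.

```python
import random

# Hypothetical roster: every student in the school
roster = [f"Student {i}" for i in range(1, 501)]

rng = random.Random(42)          # fixed seed so the example is repeatable
sample = rng.sample(roster, 50)  # 50 different students, each equally likely

print(len(sample))
print(sample[:3])
```

`sample` picks without replacement, so no student is chosen twice.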

2. Representative Sample

  • Definition: Accurately reflects the characteristics of the entire population
  • Goal: Match population proportions
  • Example: If school is 60% girls and 40% boys, sample should be similar
  • Note: Random samples are usually representative, especially when the sample is large enough
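
To check whether a sample is "similar" to the population, you can compare proportions. In this sketch the 10-percentage-point tolerance is an assumption chosen for illustration, and both samples are invented; only the 60% / 40% split comes from the example above.

```python
def proportions(labels):
    """Fraction of the list belonging to each group."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return {group: count / len(labels) for group, count in counts.items()}


def is_representative(sample, population_props, tolerance=0.10):
    """True if every group's share is within the (assumed) tolerance."""
    sample_props = proportions(sample)
    return all(
        abs(sample_props.get(group, 0) - share) <= tolerance
        for group, share in population_props.items()
    )


school = {"girl": 0.60, "boy": 0.40}

balanced_sample = ["girl"] * 29 + ["boy"] * 21  # 58% girls, 42% boys
lopsided_sample = ["girl"] * 10 + ["boy"] * 40  # 20% girls, 80% boys

print(is_representative(balanced_sample, school))  # True
print(is_representative(lopsided_sample, school))  # False
```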

3. Biased Sample

  • Definition: Does NOT fairly represent the population; certain groups are over/under-represented
  • Problem: Leads to inaccurate conclusions
  • Example: Surveying only students in the library about study habits (excludes non-library users)
  • Example: Calling only landlines (excludes cell phone-only users)

Common Sources of Bias:

| Type of Bias | Description | Example |
| --- | --- | --- |
| Convenience sampling | Choosing people who are easy to reach | Surveying only your friends |
| Voluntary response | People choose to participate | Online polls (only motivated people respond) |
| Undercoverage | Some groups not included | Phone survey (misses people without phones) |

Evaluation Questions:

To determine if a sample is good, ask:

  • Does everyone in the population have an equal chance of being selected?
  • Does the sample match the characteristics of the population?
  • Is any group excluded or overrepresented?
  • Is the sample size large enough?

Examples:

Example 1: To find the favorite lunch at a school, survey every 10th student entering the cafeteria

✓ Likely representative (systematic selection; assumes all students use the cafeteria)

Example 2: To find average income in a city, survey people at a luxury mall

✗ Biased (overrepresents wealthy people, excludes lower-income residents)

Example 3: To find students' opinions on homework, randomly select 50 students from entire school roster

✓ Random and likely representative (all students have equal chance)
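
The "every 10th student" selection from Example 1 can be sketched with list slicing. The list of 200 arrivals is hypothetical, and `every_nth` is a name invented for this illustration.

```python
def every_nth(items, n):
    """Systematic selection: take the nth, 2nth, 3nth, ... item."""
    return items[n - 1::n]


# Hypothetical order in which 200 students enter the cafeteria
arrivals = [f"Student {i}" for i in range(1, 201)]

surveyed = every_nth(arrivals, 10)
print(len(surveyed))  # 20
print(surveyed[:3])   # ['Student 10', 'Student 20', 'Student 30']
```

This works well only if the arrival order has no hidden pattern, for example if friends or classes do not always arrive in groups of exactly ten.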

Quick Reference: Two-Variable Statistics

Line Graphs vs. Scatter Plots:

| Feature | Line Graph | Scatter Plot |
| --- | --- | --- |
| Points connected? | Yes | No |
| Purpose | Show change over time | Show relationship between variables |
| Use when | Data is continuous | Looking for correlation |

Correlation Types:

  • Positive: Both variables increase together (↗)
  • Negative: One increases, other decreases (↘)
  • None: No pattern or relationship

Line of Best Fit Equation:

\( y = mx + b \)

  • m (slope): Rate of change
  • b (y-intercept): Starting value

Good Samples:

  • Random: Everyone has equal chance
  • Representative: Matches population characteristics
  • Unbiased: No systematic errors or exclusions

💡 Key Tips for Two-Variable Statistics

  • Line graphs: connect points; Scatter plots: don't connect
  • Positive correlation: both increase together (upward trend)
  • Negative correlation: one up, one down (downward trend)
  • No correlation: random scatter, no pattern
  • Outliers: points far from the pattern
  • Line of best fit: balance points above and below
  • Slope tells you rate of change; y-intercept tells starting value
  • Interpolation (within data) more reliable than extrapolation (outside data)
  • Random sample: everyone has equal chance of selection
  • Biased sample: some groups over/underrepresented
  • Larger samples usually more reliable
  • Use line of best fit equation to make predictions