Statistics - Formulas & Concepts

IB Mathematics Analysis & Approaches (SL & HL)

📊 Measures of Central Tendency

Mean (Average):

\[\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}\]

where \(n\) = number of data values

Mean (from Frequency Table):

\[\bar{x} = \frac{\sum f_ix_i}{\sum f_i}\]

where \(f_i\) = frequency of value \(x_i\)

Median:

• Middle value when the data are ordered from smallest to largest
• If \(n\) is odd: the median is the \(\left(\frac{n+1}{2}\right)\)th value
• If \(n\) is even: the median is the average of the \(\left(\frac{n}{2}\right)\)th and \(\left(\frac{n}{2}+1\right)\)th values

Mode:

The value that occurs most frequently in the data set
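
As a quick check of the definitions above, here is a minimal Python sketch using the standard `statistics` module. The data set and the frequency table are made-up illustrations, not values from these notes.

```python
from statistics import mean, median, multimode

# Illustrative data (assumed), already ordered, n = 6 (even)
data = [2, 3, 3, 5, 7, 10]

data_mean = mean(data)          # (2 + 3 + 3 + 5 + 7 + 10) / 6
data_median = median(data)      # average of the 3rd and 4th values: (3 + 5) / 2
data_modes = multimode(data)    # list of the most frequent value(s)

# Mean from a frequency table, stored as {value x_i: frequency f_i} (assumed example)
freq = {1: 2, 2: 5, 3: 3}
freq_mean = sum(f * x for x, f in freq.items()) / sum(freq.values())
```

`multimode` returns a list because a data set can have more than one mode.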

📏 Measures of Spread

Range:

\[\text{Range} = \text{Maximum} - \text{Minimum}\]

Interquartile Range (IQR):

\[\text{IQR} = Q_3 - Q_1\]

where \(Q_1\) = first quartile (25th percentile), \(Q_3\) = third quartile (75th percentile)

Variance (Population):

\[\sigma^2 = \frac{\sum(x_i - \mu)^2}{n}\]

where \(\mu\) = mean of the data (the same quantity written \(\bar{x}\) above)
Given in formula booklet

Variance (Alternative Form):

\[\sigma^2 = \frac{\sum x_i^2}{n} - \mu^2\]

Given in formula booklet

Standard Deviation:

\[\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum(x_i - \mu)^2}{n}}\]
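
The two variance forms should always agree. The sketch below checks this on an assumed data set (the numbers are illustrative only) and compares the result against `statistics.pstdev`, which computes the population standard deviation.

```python
from math import isclose, sqrt
from statistics import pstdev

# Illustrative data (assumed); its mean is 5
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mu = sum(data) / n

# Definition: mean of the squared deviations from mu
var_definition = sum((x - mu) ** 2 for x in data) / n

# Alternative form: mean of the squares minus the square of the mean
var_alternative = sum(x * x for x in data) / n - mu ** 2

sigma = sqrt(var_definition)
data_range = max(data) - min(data)
```

The alternative form is usually quicker by hand because it avoids subtracting the mean from every value.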

🔄 Effect of Transformations on Statistics

Adding a Constant \(k\):

• New mean: \(\bar{x}_{\text{new}} = \bar{x} + k\)
• Standard deviation unchanged: \(\sigma_{\text{new}} = \sigma\)
• Variance unchanged: \(\sigma^2_{\text{new}} = \sigma^2\)

Multiplying by a Constant \(k\):

• New mean: \(\bar{x}_{\text{new}} = k\bar{x}\)
• New standard deviation: \(\sigma_{\text{new}} = |k|\sigma\)
• New variance: \(\sigma^2_{\text{new}} = k^2\sigma^2\)
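
These rules can be checked numerically. The sketch below uses an assumed data set and a negative constant \(k\), chosen to show why the standard deviation scales by \(|k|\) rather than \(k\).

```python
from math import isclose
from statistics import mean, pstdev, pvariance

# Illustrative data (assumed): mean 5, standard deviation 2
data = [2, 4, 4, 4, 5, 5, 7, 9]
k = -3  # negative on purpose, to exercise the |k| in the rule

shifted = [x + k for x in data]   # adding a constant
scaled = [k * x for x in data]    # multiplying by a constant

shift_mean, shift_sd = mean(shifted), pstdev(shifted)
scale_mean, scale_sd = mean(scaled), pstdev(scaled)
scale_var = pvariance(scaled)
```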

🎲 Basic Probability Rules

Probability Range:

\[0 \leq P(A) \leq 1\]

Complementary Events:

\[P(A') = 1 - P(A)\]

Addition Rule (OR):

\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]

Given in formula booklet

Mutually Exclusive Events:

\[P(A \cup B) = P(A) + P(B)\]

When \(P(A \cap B) = 0\)

Multiplication Rule (AND):

\[P(A \cap B) = P(A) \times P(B|A)\]

Given in formula booklet
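
A small sketch of the complement, addition and multiplication rules, assuming a fair six-sided die as the sample space. The events `A` and `B` are illustrative choices, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # fair die (assumed example)
A = {2, 4, 6}                       # "roll is even"
B = {4, 5, 6}                       # "roll is greater than 3"

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

p_complement = 1 - prob(A)                        # P(A')
p_union = prob(A) + prob(B) - prob(A & B)         # addition rule
p_b_given_a = Fraction(len(A & B), len(A))        # P(B|A) by counting within A
p_intersection = prob(A) * p_b_given_a            # multiplication rule
```

Subtracting \(P(A \cap B)\) in the addition rule removes the outcomes 4 and 6, which would otherwise be counted twice.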

🔀 Conditional Probability

Conditional Probability:

\[P(A|B) = \frac{P(A \cap B)}{P(B)}\]

Probability of \(A\) given that \(B\) has occurred (requires \(P(B) > 0\))
Given in formula booklet

Independent Events:

\[P(A \cap B) = P(A) \times P(B)\]

Also: \(P(A|B) = P(A)\) and \(P(B|A) = P(B)\)
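
The independence test can be sketched as follows, again assuming a fair die. The helper names `prob`, `conditional` and `independent` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))   # fair six-sided die (assumed example)

def prob(event):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """True when P(a and b) equals P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_least_4 = {4, 5, 6}

p_even_given_at_most_4 = conditional(even, at_most_4)    # equals P(even)
p_even_given_at_least_4 = conditional(even, at_least_4)  # differs from P(even)
```

Here "even" is independent of "at most 4" but not of "at least 4", because knowing the roll is at least 4 changes the chance that it is even.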

📈 Expected Value & Variance (Random Variables)

Expected Value (Mean):

\[E(X) = \sum x_i P(X = x_i)\]

Given in formula booklet

Variance:

\[\text{Var}(X) = E(X^2) - [E(X)]^2\]

\[\text{Var}(X) = \sum(x_i - \mu)^2 P(X = x_i)\]

Both forms given in formula booklet

Standard Deviation:

\[\sigma = \sqrt{\text{Var}(X)}\]

Linear Transformations:

For \(Y = aX + b\):
• \(E(Y) = aE(X) + b\)
• \(\text{Var}(Y) = a^2\text{Var}(X)\)
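
The formulas in this section can be verified on a small discrete distribution. The probability table below is hypothetical, as are the constants `a` and `b`.

```python
from fractions import Fraction

# Hypothetical distribution: value x_i -> P(X = x_i)
pmf = {0: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}

def expectation(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = expectation(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

e_x = expectation(pmf)
e_x2 = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)
var_x = variance(pmf)

# Linear transformation Y = aX + b: same probabilities, transformed values
a, b = 2, 3
pmf_y = {a * x + b: p for x, p in pmf.items()}
```

The constant \(b\) shifts every value equally, so it changes \(E(Y)\) but not \(\text{Var}(Y)\).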

🎯 Binomial Distribution

Notation:

\[X \sim B(n, p)\]

where \(n\) = number of trials, \(p\) = probability of success on each trial

Probability Formula:

\[P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}\]

Given in formula booklet

Mean:

\[E(X) = np\]

Given in formula booklet

Variance:

\[\text{Var}(X) = np(1-p)\]

Given in formula booklet
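
A minimal sketch of the binomial formulas, assuming \(X \sim B(4, \tfrac{1}{4})\) as an example. The function name `binomial_pmf` is hypothetical. It builds the full probability table and recovers the mean and variance from it.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 4, Fraction(1, 4)   # assumed example parameters

# Full distribution: r -> P(X = r) for r = 0, 1, ..., n
pmf = {r: binomial_pmf(r, n, p) for r in range(n + 1)}

mean = sum(r * q for r, q in pmf.items())
variance = sum(r ** 2 * q for r, q in pmf.items()) - mean ** 2
```

On a GDC the same probabilities come from the binomial pdf/cdf functions. This sketch only shows where those numbers come from.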

📊 Normal Distribution

Notation:

\[X \sim N(\mu, \sigma^2)\]

where \(\mu\) = mean, \(\sigma^2\) = variance

Standard Normal Distribution:

\[Z \sim N(0, 1)\]

Standardization (Z-score):

\[Z = \frac{X - \mu}{\sigma}\]

Given in formula booklet

Inverse Standardization:

\[X = \mu + Z\sigma\]
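
Standardization can be sketched with Python's `statistics.NormalDist`. The parameters \(\mu = 100\), \(\sigma = 15\) are assumed for illustration, and the helper names `z_score` and `from_z` are hypothetical. Note that `NormalDist` takes the standard deviation \(\sigma\), whereas the notation \(N(\mu, \sigma^2)\) lists the variance.

```python
from statistics import NormalDist

mu, sigma = 100, 15            # assumed example: X ~ N(100, 15^2)
X = NormalDist(mu, sigma)      # second argument is sigma, not sigma^2
Z = NormalDist()               # standard normal, N(0, 1)

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Inverse standardization: recover x from a z-score."""
    return mu + z * sigma

z = z_score(130, mu, sigma)
p_from_x = X.cdf(130)          # P(X < 130) computed directly
p_from_z = Z.cdf(z)            # same probability via the standard normal
```

Both routes give the same probability, which is the point of standardizing.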

📉 Correlation & Regression

Pearson's Correlation Coefficient:

\[r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}\]

Range: \(-1 \leq r \leq 1\)
Given in formula booklet

Regression Line (y on x):

\[y = ax + b\]

The least-squares regression line always passes through the mean point \((\bar{x}, \bar{y})\)

Coefficient of Determination:

\[r^2\]

Proportion of the variance in \(y\) explained by the linear relationship with \(x\)
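
The sketch below computes \(r\), the regression line and \(r^2\) for a small made-up data set. The least-squares slope \(a = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}\) and intercept \(b = \bar{y} - a\bar{x}\) are the standard results but are not stated in these notes. In the exam these values come from the GDC.

```python
from math import sqrt

# Illustrative paired data (assumed)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of products and squares of deviations from the means
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

r = s_xy / sqrt(s_xx * s_yy)   # Pearson's correlation coefficient

a = s_xy / s_xx                # least-squares slope (standard result, not from the notes)
b = y_bar - a * x_bar          # intercept, so the line passes through (x_bar, y_bar)

r_squared = r ** 2             # coefficient of determination
```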

🔢 Combinatorics & Counting

Permutations (Order Matters):

\[^nP_r = \frac{n!}{(n-r)!}\]

Number of ways to arrange \(r\) objects chosen from \(n\) distinct objects

Combinations (Order Doesn't Matter):

\[\binom{n}{r} = \frac{n!}{r!(n-r)!}\]

Number of ways to choose \(r\) objects from \(n\) distinct objects
Given in formula booklet

Factorial:

\[n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1\]

Note: \(0! = 1\)
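
Python's `math` module (3.8+) provides these counts directly. The values \(n = 5\), \(r = 2\) are an assumed example.

```python
from math import comb, factorial, perm

n, r = 5, 2   # assumed example values

n_perm = perm(n, r)    # ordered arrangements: n! / (n - r)!
n_comb = comb(n, r)    # unordered selections: n! / (r! (n - r)!)

# The same counts built from the factorial definitions
perm_from_factorials = factorial(n) // factorial(n - r)
comb_from_factorials = factorial(n) // (factorial(r) * factorial(n - r))
```

Each combination of \(r\) objects can be ordered in \(r!\) ways, so \(^nP_r = r! \times \binom{n}{r}\).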

💡 Exam Tip: Most probability and statistics formulas are given in the IB formula booklet, including variance, binomial distribution, and correlation. Use your GDC for calculations: it can compute the mean, standard deviation, variance, probabilities, and regression lines. Remember that variance = (standard deviation)². A binomial model requires a fixed number of trials, two possible outcomes per trial, a constant probability of success, and independent trials. Check your formula booklet during the exam!