Statistics - Formulas & Concepts
IB Mathematics Analysis & Approaches (SL & HL)
📊 Measures of Central Tendency
Mean (Average):
\[\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}\]
where \(n\) = number of data values
Mean (from Frequency Table):
\[\bar{x} = \frac{\sum f_ix_i}{\sum f_i}\]
where \(f_i\) = frequency of value \(x_i\)
Median:
• Middle value when data is ordered
• If \(n\) is odd: median is the \(\frac{n+1}{2}\)th value
• If \(n\) is even: median is the average of the \(\frac{n}{2}\)th and \(\left(\frac{n}{2}+1\right)\)th values
Mode:
The value that occurs most frequently in the data set
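As a quick check of these definitions, here is a minimal Python sketch using the standard-library `statistics` module. The data set and frequency table are made up for illustration; on the exam you would normally get these values from your GDC.

```python
from statistics import mean, median, mode

# Hypothetical data set, already in ascending order
data = [2, 3, 3, 5, 7, 10]

x_bar = mean(data)        # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 30 / 6 = 5
med = median(data)        # n = 6 is even: average of 3rd and 4th values, (3 + 5) / 2 = 4
most_common = mode(data)  # 3 is the only value that occurs twice

# Mean from a frequency table, stored as {value x_i: frequency f_i}.
# This table describes the same data set as above.
freq_table = {2: 1, 3: 2, 5: 1, 7: 1, 10: 1}
x_bar_freq = sum(f * x for x, f in freq_table.items()) / sum(freq_table.values())

print("mean:", x_bar)
print("median:", med)
print("mode:", most_common)
print("mean from frequency table:", x_bar_freq)
```

Both ways of computing the mean agree, because the frequency table is just a compact description of the raw list.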
📏 Measures of Spread
Range:
\[\text{Range} = \text{Maximum} - \text{Minimum}\]
Interquartile Range (IQR):
\[\text{IQR} = Q_3 - Q_1\]
where \(Q_1\) = first quartile (25th percentile), \(Q_3\) = third quartile (75th percentile)
Variance (Population):
\[\sigma^2 = \frac{\sum(x_i - \mu)^2}{n}\]
Given in formula booklet
Variance (Alternative Form):
\[\sigma^2 = \frac{\sum x_i^2}{n} - \mu^2\]
Given in formula booklet
Standard Deviation:
\[\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum(x_i - \mu)^2}{n}}\]
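The measures of spread can be checked numerically with a short Python sketch. The data set is hypothetical, and the quartile rule used here (median of the lower and upper halves) is an assumption: calculators and software packages do not all use the same quartile convention, so small differences in \(Q_1\) and \(Q_3\) are possible.

```python
from statistics import median, pstdev, pvariance

# Hypothetical data set
data = [2, 4, 4, 5, 7, 8]
n = len(data)
mu = sum(data) / n                      # 30 / 6 = 5

data_range = max(data) - min(data)      # 8 - 2 = 6

# Quartiles by the "median of each half" rule (an assumed convention).
# When n is odd, the middle value is left out of both halves.
ordered = sorted(data)
lower_half = ordered[: n // 2]          # [2, 4, 4]
upper_half = ordered[(n + 1) // 2 :]    # [5, 7, 8]
q1, q3 = median(lower_half), median(upper_half)
iqr = q3 - q1                           # 7 - 4 = 3

# Population variance, computed with both forms of the formula
var_def = sum((x - mu) ** 2 for x in data) / n      # 24 / 6 = 4
var_alt = sum(x ** 2 for x in data) / n - mu ** 2   # 174 / 6 - 25 = 4
sigma = var_def ** 0.5                              # sqrt(4) = 2

print("range:", data_range, "IQR:", iqr)
print("variance (definition):", var_def, "variance (alternative):", var_alt)
print("standard deviation:", sigma)
# The library's population functions should agree with the hand calculation
print("pvariance:", pvariance(data), "pstdev:", pstdev(data))
```

Note that `pvariance` and `pstdev` divide by \(n\), matching the population formulas above; `variance` and `stdev` in the same module divide by \(n - 1\) instead.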
🔄 Effect of Transformations on Statistics
Adding a Constant \(k\):
• New mean: \(\bar{x}_{\text{new}} = \bar{x} + k\)
• Standard deviation unchanged: \(\sigma_{\text{new}} = \sigma\)
• Variance unchanged: \(\sigma^2_{\text{new}} = \sigma^2\)
Multiplying by a Constant \(k\):
• New mean: \(\bar{x}_{\text{new}} = k\bar{x}\)
• New standard deviation: \(\sigma_{\text{new}} = |k|\sigma\)
• New variance: \(\sigma^2_{\text{new}} = k^2\sigma^2\)
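These rules can be confirmed on a small example. The sketch below uses a hypothetical data set and a negative constant \(k = -3\), chosen to show why the standard deviation scales by \(|k|\) rather than \(k\).

```python
from statistics import mean, pstdev, pvariance

# Hypothetical data set with mean 5 and population standard deviation 2
data = [2, 4, 4, 5, 7, 8]
k = -3

shifted = [x + k for x in data]   # add k to every value
scaled = [k * x for x in data]    # multiply every value by k

print("original:", mean(data), pstdev(data), pvariance(data))
# Adding k: mean becomes 5 + (-3) = 2, spread is unchanged
print("shifted: ", mean(shifted), pstdev(shifted), pvariance(shifted))
# Multiplying by k: mean becomes -15, sd becomes |-3| * 2 = 6, variance becomes 9 * 4 = 36
print("scaled:  ", mean(scaled), pstdev(scaled), pvariance(scaled))
```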
🎲 Basic Probability Rules
Probability Range:
\[0 \leq P(A) \leq 1\]
Complementary Events:
\[P(A') = 1 - P(A)\]
Addition Rule (OR):
\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]
Given in formula booklet
Mutually Exclusive Events:
\[P(A \cup B) = P(A) + P(B)\]
When \(P(A \cap B) = 0\)
Multiplication Rule (AND):
\[P(A \cap B) = P(A) \times P(B|A)\]
Given in formula booklet
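To see the rules in action, the following sketch works through one roll of a fair six-sided die with exact fractions. The events \(A\) (even number), \(B\) (greater than 4) and \(C\) (rolling a 1) and the helper `prob` are illustrative choices, assuming equally likely outcomes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}                       # even number
B = {5, 6}                          # greater than 4
C = {1}                             # rolling a 1 (mutually exclusive with A)


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Complementary events: P(A') = 1 - P(A)
p_not_A = 1 - prob(A)                          # 1 - 1/2 = 1/2

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A) + prob(B) - prob(A & B)      # 1/2 + 1/3 - 1/6 = 2/3

# Mutually exclusive events: the intersection is empty, so nothing is subtracted
p_union_exclusive = prob(A) + prob(C)          # 1/2 + 1/6 = 2/3

# Multiplication rule: P(A and B) = P(A) * P(B|A)
p_B_given_A = Fraction(len(A & B), len(A))     # 1 outcome of A's 3 is also in B
p_intersection = prob(A) * p_B_given_A         # 1/2 * 1/3 = 1/6

print(p_not_A, p_union, p_union_exclusive, p_intersection)
```

Each result matches a direct count of outcomes, for example \(A \cup B = \{2, 4, 5, 6\}\) gives \(\frac{4}{6} = \frac{2}{3}\).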
🔀 Conditional Probability
Conditional Probability:
\[P(A|B) = \frac{P(A \cap B)}{P(B)}\]
Probability of \(A\) given that \(B\) has occurred, defined when \(P(B) > 0\)
Given in formula booklet
Independent Events:
\[P(A \cap B) = P(A) \times P(B)\]
Equivalently, when \(P(A)\) and \(P(B)\) are non-zero: \(P(A|B) = P(A)\) and \(P(B|A) = P(B)\)
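A short sketch of conditional probability and the independence test, again for one roll of a fair die. The events and the helper names `prob`, `cond_prob` and `independent` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}                       # even number
B = {5, 6}                          # greater than 4
C = {1, 2, 3}                       # at most 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); assumes P(b) > 0."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """Events are independent exactly when P(a and b) = P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)


# Knowing B occurred does not change the probability of A
print(cond_prob(A, B), prob(A), independent(A, B))   # (1/6)/(1/3) = 1/2, same as P(A)

# Knowing C occurred does change the probability of A
print(cond_prob(A, C), prob(A), independent(A, C))   # (1/6)/(1/2) = 1/3, not 1/2
```

Independence is a statement about probabilities, not about whether the events overlap: \(A\) and \(B\) share the outcome 6 yet are independent.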
📈 Expected Value & Variance (Random Variables)
Expected Value (Mean):
\[E(X) = \sum x_i P(X = x_i)\]
Given in formula booklet
Variance:
\[\text{Var}(X) = E(X^2) - [E(X)]^2\]
\[\text{Var}(X) = \sum(x_i - \mu)^2 P(X = x_i)\]
Both forms given in formula booklet
Standard Deviation:
\[\sigma = \sqrt{\text{Var}(X)}\]
Linear Transformations:
For \(Y = aX + b\):
• \(E(Y) = aE(X) + b\)
• \(\text{Var}(Y) = a^2\text{Var}(X)\)
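The formulas for \(E(X)\), \(\text{Var}(X)\) and linear transformations can be verified on a small discrete distribution. The probability table below is made up for illustration, and `expectation` is a hypothetical helper; exact fractions are used so the two variance forms can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical distribution: P(X = x) for x = 0, 1, 2, 3
pmf = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(3, 10),
    3: Fraction(4, 10),
}
assert sum(pmf.values()) == 1   # probabilities must sum to 1


def expectation(dist, g=lambda x: x):
    """E(g(X)) = sum of g(x) * P(X = x) over all values x."""
    return sum(g(x) * p for x, p in dist.items())


mu = expectation(pmf)                                # 0.2 + 0.6 + 1.2 = 2
var_short = expectation(pmf, lambda x: x ** 2) - mu ** 2   # E(X^2) - [E(X)]^2 = 5 - 4 = 1
var_def = expectation(pmf, lambda x: (x - mu) ** 2)        # sum of (x - mu)^2 P(X = x) = 1
sigma = float(var_def) ** 0.5

# Linear transformation Y = aX + b, computed directly from Y's own distribution
a, b = 3, -1
pmf_y = {a * x + b: p for x, p in pmf.items()}
mu_y = expectation(pmf_y)                                  # a * E(X) + b = 3 * 2 - 1 = 5
var_y = expectation(pmf_y, lambda y: (y - mu_y) ** 2)      # a^2 * Var(X) = 9 * 1 = 9

print("E(X) =", mu, " Var(X) =", var_short, "=", var_def, " sd =", sigma)
print("E(Y) =", mu_y, " Var(Y) =", var_y)
```

The constant \(b\) shifts the mean but has no effect on the variance, which is why it does not appear in \(\text{Var}(Y)\).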
🎯 Binomial Distribution
Notation:
\[X \sim B(n, p)\]
where \(n\) = number of trials, \(p\) = probability of success on each trial
Probability Formula:
\[P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}\]
Given in formula booklet
Mean:
\[E(X) = np\]
Given in formula booklet
Variance:
\[\text{Var}(X) = np(1-p)\]
Given in formula booklet
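The probability formula can be evaluated directly with `math.comb`. This is a minimal sketch for a hypothetical \(X \sim B(5, \frac{1}{4})\), using exact fractions; `binomial_pmf` is an illustrative helper, not a library function. It also checks that the mean and variance computed from the distribution match \(np\) and \(np(1-p)\).

```python
from fractions import Fraction
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): C(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p = 5, Fraction(1, 4)   # hypothetical parameters

p_exactly_2 = binomial_pmf(2, n, p)                          # 10 * (1/16) * (27/64) = 135/512
p_at_most_1 = sum(binomial_pmf(r, n, p) for r in range(2))   # P(X = 0) + P(X = 1) = 81/128

# Mean and variance computed from the full distribution
mean_from_pmf = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var_from_pmf = (
    sum(r ** 2 * binomial_pmf(r, n, p) for r in range(n + 1)) - mean_from_pmf ** 2
)

print("P(X = 2) =", p_exactly_2)
print("P(X <= 1) =", p_at_most_1)
print("mean:", mean_from_pmf, "vs np =", n * p)
print("variance:", var_from_pmf, "vs np(1-p) =", n * p * (1 - p))
```

Cumulative probabilities such as \(P(X \leq 1)\) are sums of individual terms, which is what a GDC's binomial cdf function computes.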
📊 Normal Distribution
Notation:
\[X \sim N(\mu, \sigma^2)\]
where \(\mu\) = mean, \(\sigma^2\) = variance
Standard Normal Distribution:
\[Z \sim N(0, 1)\]
Standardization (Z-score):
\[Z = \frac{X - \mu}{\sigma}\]
Given in formula booklet
Inverse Standardization:
\[X = \mu + Z\sigma\]
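Standardization and its inverse can be tried out with `statistics.NormalDist` from the Python standard library. The values \(\mu = 100\) and \(\sigma = 15\) are hypothetical. Note that `NormalDist` takes the standard deviation as its second argument, whereas the notation \(N(\mu, \sigma^2)\) lists the variance.

```python
from statistics import NormalDist

mu, sigma = 100, 15          # hypothetical: X ~ N(100, 15^2)
X = NormalDist(mu, sigma)    # second argument is sigma, not sigma^2
Z = NormalDist()             # standard normal, N(0, 1)

# Standardization: how many standard deviations is x above the mean?
x = 130
z = (x - mu) / sigma         # (130 - 100) / 15 = 2.0

# Inverse standardization recovers the original value
x_back = mu + z * sigma      # 100 + 2 * 15 = 130.0

# P(X < 130) equals P(Z < 2): standardizing does not change the probability
p_below = X.cdf(x)
p_below_z = Z.cdf(z)

# Inverse normal: find x such that P(X < x) = 0.975
z_975 = Z.inv_cdf(0.975)     # approximately 1.96
x_975 = mu + z_975 * sigma

print("z =", z, " x recovered =", x_back)
print("P(X < 130) =", round(p_below, 4), " P(Z < 2) =", round(p_below_z, 4))
print("z for 0.975 =", round(z_975, 3), " x for 0.975 =", round(x_975, 1))
```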
📉 Correlation & Regression
Pearson's Correlation Coefficient:
\[r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}\]
Range: \(-1 \leq r \leq 1\)
Given in formula booklet
Regression Line (y on x):
\[y = ax + b\]
where \(a\) = gradient and \(b\) = \(y\)-intercept; the line passes through the mean point \((\bar{x}, \bar{y})\)
Coefficient of Determination:
\[r^2\]
Proportion of the variance in \(y\) explained by the linear regression on \(x\)
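The sketch below computes \(r\), the regression line and \(r^2\) by hand for a small hypothetical data set. The gradient formula \(a = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}\) is the standard least-squares result and is not stated in this sheet; in the exam these values come from the GDC.

```python
from math import sqrt

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n   # 3
y_bar = sum(ys) / n   # 4

# Sums of products of deviations from the means
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))   # 6
s_xx = sum((x - x_bar) ** 2 for x in xs)                        # 10
s_yy = sum((y - y_bar) ** 2 for y in ys)                        # 6

# Pearson's correlation coefficient
r = s_xy / sqrt(s_xx * s_yy)   # 6 / sqrt(60), about 0.775

# Least-squares regression line of y on x: y = a x + b
a = s_xy / s_xx                # gradient, 0.6
b = y_bar - a * x_bar          # intercept chosen so the line passes through (x_bar, y_bar)

print("r =", round(r, 3), " r^2 =", round(r ** 2, 3))
print("regression line: y =", round(a, 3), "x +", round(b, 3))
print("value on line at x_bar:", round(a * x_bar + b, 3), " y_bar:", y_bar)
```

Here \(r^2 = 0.6\), so about 60% of the variation in \(y\) is accounted for by the linear model.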
🔢 Combinatorics & Counting
Permutations (Order Matters):
\[^nP_r = \frac{n!}{(n-r)!}\]
Number of ways to arrange \(r\) objects chosen from \(n\) distinct objects
Combinations (Order Doesn't Matter):
\[\binom{n}{r} = \frac{n!}{r!(n-r)!}\]
Number of ways to choose \(r\) objects from \(n\) distinct objects
Given in formula booklet
Factorial:
\[n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1\]
Note: \(0! = 1\)
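Python's `math` module (version 3.8 or later) provides `perm`, `comb` and `factorial`, which makes it easy to check the counting formulas. The values \(n = 5\) and \(r = 2\) are arbitrary illustrative choices.

```python
from math import comb, factorial, perm

n, r = 5, 2   # arbitrary example values

# Permutations: n! / (n - r)!
n_p_r = perm(n, r)                                  # 5 * 4 = 20
n_p_r_formula = factorial(n) // factorial(n - r)    # 120 / 6 = 20

# Combinations: n! / (r! (n - r)!)
n_c_r = comb(n, r)                                                   # 10
n_c_r_formula = factorial(n) // (factorial(r) * factorial(n - r))    # 120 / (2 * 6) = 10

# Each selection of r objects can be ordered in r! ways, so nPr = nCr * r!
print(n_p_r, n_p_r_formula, n_c_r, n_c_r_formula)
print(n_p_r == n_c_r * factorial(r))

# 0! = 1, which makes the formulas work when r = n or r = 0
print(factorial(0), perm(n, n) == factorial(n), comb(n, 0))
```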
💡 Exam Tip: Most probability and statistics formulas, including those for variance, the binomial distribution, and correlation, are given in the IB formula booklet, so check it during the exam. Use your GDC for calculations: it can compute the mean, standard deviation, variance, probabilities, and regression lines. Remember that variance = (standard deviation)². A binomial model requires a fixed number of trials, exactly two outcomes per trial, a constant probability of success, and independent trials.
