Complete Statistics & Probability Formulas Guide

Master essential statistical concepts with comprehensive formulas, step-by-step explanations, and practical examples for academic success in IB, AP, GCSE, and university-level mathematics.

1 Measures of Central Tendency & Variability

Mean (Average)

Population Mean:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
Sample Mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Where: μ = population mean, x̄ = sample mean, N = population size, n = sample size

Variance

Population Variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
Sample Variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \]

Alternative Formula: \( s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} \)
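The two sample variance formulas above should give the same value. The sketch below checks this on a small hypothetical data set; the function names and the data are illustrative assumptions, not part of the formula sheet.

```python
def sample_variance_definition(xs):
    """Sample variance from squared deviations about the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Sample variance from the sums of x and x^2 (the alternative formula)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: n = 8, sum = 40, sum of squares = 232

v_definition = sample_variance_definition(data)  # 32 / 7
v_shortcut = sample_variance_shortcut(data)      # (232 - 1600 / 8) / 7 = 32 / 7
```

The shortcut form needs only running totals of x and x², which is convenient on a calculator, but it can lose precision in floating point when the mean is large relative to the spread.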

Standard Deviation

Population Standard Deviation:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]
Sample Standard Deviation:
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \]

Key Point: Standard deviation is the square root of variance and measures spread in original units.
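As a quick check of the population and sample formulas, Python's standard `statistics` module provides both versions. This is a minimal sketch on hypothetical data; the variable names are assumptions chosen for illustration.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set with mean 5

sigma = statistics.pstdev(data)  # population standard deviation (divides by N)
s = statistics.stdev(data)       # sample standard deviation (divides by n - 1)

# Standard deviation is the square root of the matching variance.
sigma_from_variance = math.sqrt(statistics.pvariance(data))
s_from_variance = math.sqrt(statistics.variance(data))
```

For this data the population value is 2.0, and the sample value is slightly larger because its divisor is n - 1 rather than N.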

Median & Mode

Median (n is odd, data sorted in ascending order):
\[ M = \left(\frac{n+1}{2}\right)^{th} \text{ term} \]
Median (n is even, data sorted in ascending order):
\[ M = \frac{\left(\frac{n}{2}\right)^{th} \text{ term} + \left(\frac{n}{2}+1\right)^{th} \text{ term}}{2} \]
Mode:

The value that appears most frequently in the dataset
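The position rules for the median can be sketched as follows. The helper `median_by_position` is a hypothetical name; `statistics.multimode` is a standard-library function (Python 3.8+) that returns every value tied for most frequent.

```python
import statistics


def median_by_position(xs):
    """Median via the (n+1)/2 rule for odd n and the two middle terms for even n."""
    ordered = sorted(xs)          # the position formulas assume sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # 1-based position (n + 1) / 2
    # 1-based positions n / 2 and n / 2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median_by_position([7, 1, 3])       # sorted: 1, 3, 7
even_median = median_by_position([4, 1, 3, 2])   # sorted: 1, 2, 3, 4
modes = statistics.multimode([1, 2, 2, 3, 3])    # two values tie for most frequent
```

Note that a data set can have more than one mode, which is why the sketch uses `multimode` rather than a single-value function.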

2 Probability Formulas

Basic Probability

\[ P(A) = \frac{n(A)}{n(S)} \]

Where: P(A) = probability of event A, n(A) = number of favorable outcomes, n(S) = total number of possible outcomes (all assumed equally likely)

Probability Range:
\[ 0 \leq P(A) \leq 1 \]

Conditional Probability

\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \]

Probability of A given that B has occurred (defined only when P(B) > 0)

Bayes' Theorem:
\[ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \]
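A worked example may help with Bayes' Theorem. The sketch below assumes P(B) is obtained from the law of total probability, P(B) = P(B|A)P(A) + P(B|A')P(A'), which is not listed on this sheet. The function name and all three input probabilities are hypothetical.

```python
from fractions import Fraction


def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) using Bayes' Theorem with P(B) from total probability."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b


# Hypothetical numbers: P(A) = 0.01, P(B|A) = 0.95, P(B|A') = 0.05
posterior = bayes_posterior(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))
# P(B) = 0.0095 + 0.0495 = 0.059, so P(A|B) = 0.0095 / 0.059 = 19/118
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared with a hand calculation.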

Addition & Multiplication Rules

Addition Rule:
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
Multiplication Rule:
\[ P(A \cap B) = P(A) \cdot P(B|A) \]
Independent Events:
\[ P(A \cap B) = P(A) \cdot P(B) \]

Complement & Other Rules

Complement Rule:
\[ P(A') = 1 - P(A) \]
Mutually Exclusive Events:
\[ P(A \cap B) = 0 \]
\[ P(A \cup B) = P(A) + P(B) \]
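The addition, complement, and independence rules can be checked by counting outcomes. This sketch assumes a single roll of a fair six-sided die; the events A and B are chosen purely for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))                    # fair die: outcomes 1..6
A = {x for x in sample_space if x % 2 == 0}        # even roll: {2, 4, 6}
B = {x for x in sample_space if x > 4}             # roll above 4: {5, 6}


def prob(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


p_union_counted = prob(A | B)                      # count the union directly
p_union_rule = prob(A) + prob(B) - prob(A & B)     # addition rule
p_complement = prob(sample_space - A)              # P(A') counted directly
is_independent = prob(A & B) == prob(A) * prob(B)  # independence check
```

Here A and B are not mutually exclusive (both contain 6), so the intersection term in the addition rule cannot be dropped.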

3 Linear Regression Analysis

Simple Linear Regression

Regression Equation:
\[ y = a + bx \]
Slope (b):
\[ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \]
Y-Intercept (a):
\[ a = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2} \]

Where: y = dependent variable, x = independent variable, a = y-intercept, b = slope
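The slope and intercept formulas translate directly into sums over the data. Below is a minimal sketch; `fit_line` is a hypothetical helper name and the five data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for y = a + bx using the summation formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2   # zero if every x value is the same
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return a, b


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
# sums: n = 5, sum x = 15, sum y = 20, sum xy = 66, sum x^2 = 55
# b = (330 - 300) / 50 = 0.6 and a = (1100 - 990) / 50 = 2.2
```

As a sanity check, the fitted line passes through the point of means: 2.2 + 0.6 × 3 = 4, which is the mean of y.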

Correlation Coefficient

Pearson's r:
\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \]
Coefficient of Determination:
\[ r^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} \]

Range: -1 ≤ r ≤ 1, where |r| closer to 1 indicates a stronger linear relationship and the sign of r gives its direction
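Pearson's r uses the same sums as the regression formulas plus the sum of y². The sketch below is one way to compute it; `pearson_r` is a hypothetical name and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from the summation formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # 30 / sqrt(50 * 30)
r_squared = r ** 2                                # 900 / 1500 = 0.6
r_perfect = pearson_r([1, 2, 3], [2, 4, 6])       # points lie exactly on a line
```

For this hypothetical data r is about 0.775, so roughly 60% of the variation in y is explained by the linear relationship with x.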

4 Hypothesis Testing & Test Statistics

Standard Scores

Z-Score:
\[ z = \frac{x - \mu}{\sigma} \]
One-Sample t-Test:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
Degrees of Freedom:
\[ df = n - 1 \]
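The one-sample t statistic can be computed from raw data in a few lines. This is a sketch under assumed inputs: the data and the hypothesized mean `mu_0` are hypothetical, and looking up the critical value or p-value (which needs a t table or a statistics library) is left out.

```python
import math
import statistics


def one_sample_t(data, mu_0):
    """Return (t, df) for a one-sample t-test against the mean mu_0."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)              # sample SD, divisor n - 1
    t = (x_bar - mu_0) / (s / math.sqrt(n))
    return t, n - 1


data = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical sample, mean 5
t, df = one_sample_t(data, mu_0=4)
# s^2 = 32/7, so s / sqrt(n) = sqrt(4/7) and t = 1 / sqrt(4/7) = sqrt(1.75)
```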

Two-Sample t-Test

Test Statistic:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p} \]
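Standard Error (variances not assumed equal):
\[ s_p = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
Pooled Standard Deviation (variances assumed equal):
\[ s_{pooled} = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \]
Standard Error (variances assumed equal):
\[ s_p = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]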
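The sketch below computes the two-sample t statistic from summary statistics in two ways: with the standard error sqrt(s1²/n1 + s2²/n2), and with the pooled standard deviation. For the pooled version it assumes the usual textbook standard error s_pooled · sqrt(1/n1 + 1/n2). All function names and summary values are hypothetical.

```python
import math


def unpooled_standard_error(s1_sq, n1, s2_sq, n2):
    """Standard error sqrt(s1^2/n1 + s2^2/n2); variances not assumed equal."""
    return math.sqrt(s1_sq / n1 + s2_sq / n2)


def pooled_standard_deviation(s1_sq, n1, s2_sq, n2):
    """Pooled SD, weighting each sample variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))


# Hypothetical summary statistics: mean, sample variance, sample size
mean1, s1_sq, n1 = 20.0, 4.0, 10
mean2, s2_sq, n2 = 18.0, 9.0, 15

se_unpooled = unpooled_standard_error(s1_sq, n1, s2_sq, n2)   # sqrt(0.4 + 0.6)
t_unpooled = (mean1 - mean2) / se_unpooled

s_pooled = pooled_standard_deviation(s1_sq, n1, s2_sq, n2)    # sqrt(162 / 23)
se_pooled = s_pooled * math.sqrt(1 / n1 + 1 / n2)             # assumed pooled SE
t_pooled = (mean1 - mean2) / se_pooled
```

The two statistics differ here because the sample sizes and variances are unequal; when n1 = n2 the two standard errors coincide.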

Confidence Intervals

Mean (σ known):
\[ \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
Mean (σ unknown):
\[ \bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \]
Proportion:
\[ \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
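The z-based intervals can be sketched with the standard library, since `statistics.NormalDist` supplies the critical value. The function names and inputs are assumptions for illustration. The t-based interval is omitted because the standard library has no t quantile function.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_ci_known_sigma(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval for a proportion (normal approximation)."""
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical inputs
mean_low, mean_high = mean_ci_known_sigma(x_bar=50, sigma=10, n=100)
prop_low, prop_high = proportion_ci(p_hat=0.4, n=100)
```

With these inputs the 95% intervals are roughly 48.04 to 51.96 for the mean and 0.304 to 0.496 for the proportion.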

5 Sample Size Determination

Cochran's Formula

Sample Size for Proportions:
\[ n = \frac{z^2 \cdot p \cdot q}{e^2} \]
Finite Population Correction:
\[ n = \frac{N \cdot z^2 \cdot p \cdot q}{e^2(N-1) + z^2 \cdot p \cdot q} \]

Where: z = z-score for the chosen confidence level, p = expected proportion, q = 1-p, e = margin of error, N = population size
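Both forms of Cochran's formula are shown in the sketch below. Rounding the result up to the next whole number is an assumed convention, not something stated on this sheet, and the inputs (z = 1.96, p = 0.5, e = 0.05, N = 1000) are hypothetical.

```python
import math


def cochran_sample_size(z, p, e):
    """Cochran's formula for a large (effectively infinite) population."""
    q = 1 - p
    return z ** 2 * p * q / e ** 2


def cochran_finite(z, p, e, N):
    """Cochran's formula with the finite population correction."""
    q = 1 - p
    numerator = N * z ** 2 * p * q
    denominator = e ** 2 * (N - 1) + z ** 2 * p * q
    return numerator / denominator


# Assumed convention: round up so the margin of error is not exceeded.
n_large = math.ceil(cochran_sample_size(z=1.96, p=0.5, e=0.05))    # 384.16 -> 385
n_finite = math.ceil(cochran_finite(z=1.96, p=0.5, e=0.05, N=1000))  # 277.74 -> 278
```

Using p = 0.5 maximizes p · q, so it gives the most conservative (largest) sample size when the true proportion is unknown.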

Sample Size for Means

Known Standard Deviation:
\[ n = \left(\frac{z \cdot \sigma}{E}\right)^2 \]
Two-Group Comparison:
\[ n = \frac{2(z_{\alpha/2} + z_{\beta})^2 \sigma^2}{(\mu_1 - \mu_2)^2} \]

Where: E = margin of error, σ = standard deviation, z_β = z-score corresponding to the desired power (1 - β); the two-group formula gives the sample size n per group
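A short sketch of both sample-size formulas for means follows. The function names and inputs are hypothetical. The values 1.96 and 0.84 are the commonly used rounded z-scores for 95% confidence and 80% power, and rounding up is an assumed convention.

```python
import math


def sample_size_mean(z, sigma, E):
    """Sample size to estimate a mean within margin of error E."""
    return (z * sigma / E) ** 2


def sample_size_two_groups(z_alpha_2, z_beta, sigma, mu1, mu2):
    """Sample size per group to detect a difference mu1 - mu2."""
    return 2 * (z_alpha_2 + z_beta) ** 2 * sigma ** 2 / (mu1 - mu2) ** 2


# Hypothetical inputs; results rounded up (assumed convention)
n_single = math.ceil(sample_size_mean(z=1.96, sigma=15, E=3))       # 96.04 -> 97
n_per_group = math.ceil(
    sample_size_two_groups(z_alpha_2=1.96, z_beta=0.84, sigma=10, mu1=55, mu2=50)
)                                                                   # 62.72 -> 63
```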

6 Additional Statistical Measures

Relative Frequency

\[ \text{Relative Frequency} = \frac{\text{Frequency of Event}}{\text{Total Number of Observations}} \]

Used to estimate the probability of an event from observed data

Chi-Square Test

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

Where: O = observed frequency, E = expected frequency
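The chi-square statistic is a single sum over categories. The sketch below assumes a die rolled 60 times with a fair-die expectation of 10 per face; the counts and the function name are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 9, 10, 11, 12, 10]   # hypothetical counts from 60 rolls
expected = [10] * 6                 # fair die: 60 / 6 per face
chi2 = chi_square_statistic(observed, expected)
# (4 + 1 + 0 + 1 + 4 + 0) / 10 = 1.0
```

A value this small relative to the number of categories suggests the observed counts are close to what a fair die would produce; the formal decision still requires a chi-square table or library.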

Standardized Test Statistic

\[ \text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard Error}} \]

General formula for calculating test statistics in hypothesis testing

Effect Size (Cohen's d)

\[ d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}} \]

Measures the magnitude of difference between two groups
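Cohen's d combines the difference in means with the pooled standard deviation from the two-sample section. Here is a minimal sketch on two small hypothetical groups; `cohens_d` is an illustrative name.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)   # sample variances (divisor n - 1)
    var2 = statistics.variance(group2)
    s_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / s_pooled


d = cohens_d([5, 7, 9], [2, 4, 6])
# means 7 and 4, both sample variances 4, so s_pooled = 2 and d = 3 / 2
```

Unlike the t statistic, d does not grow with sample size, which is why it is used to describe the size of a difference rather than its statistical significance.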

💡 Key Study Tips & Important Notes

📊 Data Analysis Steps

  • Identify the type of data (qualitative/quantitative)
  • Choose appropriate measures of central tendency
  • Calculate variability measures
  • Interpret results in context

🔍 Hypothesis Testing

  • State null and alternative hypotheses
  • Choose significance level (α)
  • Calculate test statistic
  • Make a decision by comparing the p-value with α
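The four steps above can be sketched for the simplest case, a two-sided one-sample z-test with known σ. The function name, the hypotheses, and every input value below are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu_0 against H1: mu != mu_0."""
    # Step 3: test statistic = (statistic - parameter) / standard error
    z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: reject H0 when the p-value is below the significance level
    return z, p_value, p_value < alpha


# Steps 1 and 2: H0: mu = 50, H1: mu != 50, alpha = 0.05 (hypothetical)
z, p_value, reject_h0 = one_sample_z_test(sample_mean=52, mu_0=50, sigma=6, n=36)
# standard error = 6 / 6 = 1, so z = 2.0 and the p-value is about 0.0455
```

With these inputs the p-value falls just below 0.05, so the null hypothesis is rejected at the 5% level but would not be at the 1% level.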

📈 Regression Analysis

  • Check for linear relationship
  • Calculate correlation coefficient
  • Find regression equation
  • Interpret slope and intercept

About the Author

Adam Kumar

Co-Founder @RevisionTown
Mathematics Expert specializing in various curricula including IB, AP, GCSE, IGCSE, and more. Dedicated to creating comprehensive educational resources for students worldwide.

RevisionTown provides comprehensive study materials and interactive tools for mathematics and statistics across multiple international curricula.
