Probability Distributions - Formulas

IB Mathematics Analysis & Approaches (SL & HL)

📊 Discrete Probability Distributions

Definition:

A discrete random variable takes specific, separate values, each with an associated probability

Requirements:

• All probabilities must be non-negative: \(P(X = x_i) \geq 0\)
• Sum of all probabilities equals 1: \(\sum P(X = x_i) = 1\)

Expected Value (Mean):

\[E(X) = \mu = \sum x_i P(X = x_i)\]

Given in formula booklet

Variance (Form 1):

\[\text{Var}(X) = E(X^2) - [E(X)]^2\]

Given in formula booklet

Variance (Form 2):

\[\text{Var}(X) = \sum(x_i - \mu)^2 P(X = x_i)\]

Given in formula booklet

Standard Deviation:

\[\sigma = \sqrt{\text{Var}(X)}\]
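As a quick check of the formulas above, the following minimal Python sketch computes the mean, both forms of the variance, and the standard deviation. The distribution (values 0 to 3 with the probabilities shown) is a made-up example, not taken from any exam question.

```python
from math import sqrt, isclose

# Hypothetical distribution: values of X and their probabilities.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

# Check the two requirements for a valid discrete distribution.
assert all(p >= 0 for p in probs)
assert isclose(sum(probs), 1.0)

mean = sum(x * p for x, p in zip(values, probs))        # E(X)
mean_sq = sum(x**2 * p for x, p in zip(values, probs))  # E(X^2)

var_form1 = mean_sq - mean**2                                       # Form 1
var_form2 = sum((x - mean)**2 * p for x, p in zip(values, probs))   # Form 2
sd = sqrt(var_form1)

print(f"E(X)   = {mean:.2f}")       # 1.70
print(f"Var(X) = {var_form1:.2f}")  # 0.81
print(f"sigma  = {sd:.2f}")         # 0.90
```

Both variance forms agree (up to floating-point rounding), which is a useful self-check when working by hand.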

🔄 Linear Transformations of Random Variables

For \(Y = aX + b\):

\[E(Y) = aE(X) + b\]

\[\text{Var}(Y) = a^2\text{Var}(X)\]

Note: Adding a constant \(b\) shifts the mean but does not change the variance; multiplying by \(a\) scales the variance by \(a^2\)

For Sum/Difference of Independent Variables:

\[E(X \pm Y) = E(X) \pm E(Y)\]

\[\text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y)\]

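Note: For independent variables, variances add even when the variables are subtracted. The expectation rule holds whether or not \(X\) and \(Y\) are independent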
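The rules in this section can be verified by brute force on small distributions. The sketch below is illustrative only: the two distributions, the constants \(a = 3\), \(b = -2\), and the helper names `mean` and `variance` are all assumptions made for the example.

```python
from itertools import product

def mean(dist):
    """E(X) for a dict mapping value -> probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2 for a dict mapping value -> probability."""
    return sum(x**2 * p for x, p in dist.items()) - mean(dist)**2

# Hypothetical distributions, assumed to be independent of each other.
X = {0: 0.2, 1: 0.5, 2: 0.3}
Y = {1: 0.6, 4: 0.4}

# Linear transformation T = aX + b: same probabilities, transformed values.
a, b = 3, -2
T = {a * x + b: p for x, p in X.items()}

# Difference D = X - Y: independence lets us multiply the probabilities.
D = {}
for (x, px), (y, py) in product(X.items(), Y.items()):
    D[x - y] = D.get(x - y, 0.0) + px * py

print(mean(T), a * mean(X) + b)                # both close to 1.3
print(variance(T), a**2 * variance(X))         # both close to 4.41
print(variance(D), variance(X) + variance(Y))  # both close to 2.65
```

Note that the last line adds the variances even though the variables are subtracted.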

🎯 Binomial Distribution

Conditions (4 Requirements):

1. Fixed number of trials (\(n\))
2. Independent trials (one doesn't affect another)
3. Two outcomes only (success or failure)
4. Constant probability of success (\(p\))

Notation:

\[X \sim B(n, p)\]

where \(n\) = number of trials, \(p\) = probability of success

Probability Mass Function:

\[P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}\]

where \(r\) = number of successes (\(0 \leq r \leq n\))
Given in formula booklet

Mean (Expected Value):

\[E(X) = np\]

Given in formula booklet

Variance:

\[\text{Var}(X) = np(1-p) = npq\]

where \(q = 1-p\) (probability of failure)
Given in formula booklet

Standard Deviation:

\[\sigma = \sqrt{np(1-p)}\]
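To see the binomial formulas in action, here is a short Python sketch that builds the full probability table from the mass function and confirms that it gives \(np\) and \(np(1-p)\). The parameters \(n = 10\), \(p = 0.3\) and the function name `binomial_pmf` are chosen purely for illustration.

```python
from math import comb, sqrt

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3  # hypothetical parameters

# Probability of each possible number of successes, r = 0, 1, ..., n.
pmf = [binomial_pmf(n, p, r) for r in range(n + 1)]

total = sum(pmf)                                            # should be 1
mean = sum(r * pr for r, pr in enumerate(pmf))              # should match n*p
var = sum(r**2 * pr for r, pr in enumerate(pmf)) - mean**2  # should match n*p*(1-p)

print(f"P(X = 3) = {binomial_pmf(n, p, 3):.4f}")  # 0.2668
print(f"E(X)     = {mean:.2f}")                   # 3.00
print(f"Var(X)   = {var:.2f}")                    # 2.10
print(f"sigma    = {sqrt(var):.3f}")
```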

📈 Normal Distribution

Notation:

\[X \sim N(\mu, \sigma^2)\]

where \(\mu\) = mean, \(\sigma^2\) = variance, \(\sigma\) = standard deviation

Properties:

• Continuous distribution (not discrete)
• Bell-shaped, symmetrical curve
• Mean = Median = Mode (all at center)
• Total area under curve = 1
• Curve extends infinitely in both directions
• Defined by two parameters: \(\mu\) and \(\sigma\)

Standard Normal Distribution:

\[Z \sim N(0, 1)\]

Mean = 0, Standard deviation = 1

Standardization (Z-score):

\[Z = \frac{X - \mu}{\sigma}\]

Converts a value from any normal distribution to the standard normal scale (the number of standard deviations from the mean)
Given in formula booklet

Inverse Standardization:

\[X = \mu + Z\sigma\]

Converts from standard normal back to original distribution
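A small Python sketch of standardization and its inverse, using `NormalDist` from the standard library's `statistics` module. The distribution \(X \sim N(100, 15^2)\) and the value \(x = 130\) are assumptions picked for the example.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical: X ~ N(100, 15^2)
x = 130

z = (x - mu) / sigma     # standardization: Z = (X - mu) / sigma
x_back = mu + z * sigma  # inverse standardization: X = mu + Z*sigma

# The probability is the same whether computed on X or on the standardized Z.
p_x = NormalDist(mu, sigma).cdf(x)  # P(X <= 130)
p_z = NormalDist(0, 1).cdf(z)       # P(Z <= 2)

print(z, x_back)      # 2.0 130.0
print(f"{p_x:.4f}")   # 0.9772
print(f"{p_z:.4f}")   # 0.9772
```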

📊 Empirical Rule (68-95-99.7 Rule)

For Normal Distribution:

• Approximately 68% of data within \(\mu \pm \sigma\)
• Approximately 95% of data within \(\mu \pm 2\sigma\)
• Approximately 99.7% of data within \(\mu \pm 3\sigma\)

Useful for Quick Estimates:

Helps identify unusual values and understand spread without calculations
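The three percentages come from the standard normal distribution and can be reproduced in a few lines. This is a minimal sketch using Python's standard-library `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal

# P(-k <= Z <= k) for k = 1, 2, 3 standard deviations.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} sd: {prob:.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```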

📉 Probability Density Function (HL)

Definition:

For a continuous random variable \(X\), probabilities are found using a probability density function \(f(x)\)

Requirements for Valid PDF:

• \(f(x) \geq 0\) for all values of \(x\)
• \(\int_{-\infty}^{\infty} f(x)\,dx = 1\) (total area = 1)

Probability:

\[P(a \leq X \leq b) = \int_a^b f(x)\,dx\]

Probability = area under the curve from \(a\) to \(b\)
Given in formula booklet

Important Note:

\[P(X = a) = 0\]

For continuous distributions, probability at a single point is zero
Therefore: \(P(a \leq X \leq b) = P(a < X < b)\)

Expected Value (Mean):

\[E(X) = \mu = \int_{-\infty}^{\infty} x \cdot f(x)\,dx\]

Given in formula booklet

Variance:

\[\text{Var}(X) = E(X^2) - [E(X)]^2\]

\[\text{Var}(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx\]

Both forms given in formula booklet
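As an illustration, take the hypothetical density \(f(x) = 2x\) on \([0, 1]\) (zero elsewhere). The sketch below approximates the integrals numerically with a simple midpoint rule; the density, the `integrate` helper, and the interval \([0.2, 0.5]\) are all assumptions made for this example. The exact answers are area \(= 1\), \(P(0.2 \leq X \leq 0.5) = 0.21\), \(E(X) = \tfrac{2}{3}\), and \(\text{Var}(X) = \tfrac{1}{18}\).

```python
def f(x):
    """Hypothetical pdf: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# f is zero outside [0, 1], so integrating over [0, 1] covers the whole density.
total_area = integrate(f, 0, 1)                         # validity check
prob = integrate(f, 0.2, 0.5)                           # P(0.2 <= X <= 0.5)
mean = integrate(lambda x: x * f(x), 0, 1)              # E(X)
var = integrate(lambda x: x**2 * f(x), 0, 1) - mean**2  # E(X^2) - [E(X)]^2

print(f"area = {total_area:.4f}")  # 1.0000
print(f"P    = {prob:.4f}")        # 0.2100
print(f"E(X) = {mean:.4f}")        # 0.6667
print(f"Var  = {var:.4f}")         # 0.0556
```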

📊 Cumulative Distribution Function (HL)

Definition:

\[F(x) = P(X \leq x)\]

Cumulative probability up to value \(x\)

For Continuous Variables:

\[F(x) = \int_{-\infty}^x f(t)\,dt\]

Relationship with PDF:

\[f(x) = \frac{dF(x)}{dx}\]

PDF is the derivative of CDF

Properties:

• \(F(-\infty) = 0\)
• \(F(\infty) = 1\)
• \(F(x)\) is non-decreasing
• \(P(a < X \leq b) = F(b) - F(a)\)
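For a concrete case, suppose (hypothetically) that \(f(x) = 2x\) on \([0, 1]\), so that \(F(x) = x^2\) on that interval. The sketch below checks the properties listed above, including that the slope of \(F\) matches \(f\); the functions and the step size `h` are illustrative choices.

```python
def f(x):
    """Hypothetical pdf: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def F(x):
    """Matching CDF: 0 below the interval, x^2 on [0, 1], 1 above it."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x**2

a, b = 0.2, 0.5
prob = F(b) - F(a)  # P(a < X <= b)

# Central-difference estimate of dF/dx at x = 0.5; should be close to f(0.5) = 1.
h = 1e-6
slope = (F(0.5 + h) - F(0.5 - h)) / (2 * h)

print(f"{prob:.2f}")   # 0.21
print(f"{slope:.4f}")  # 1.0000
```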

🔢 Using Technology (GDC)

For Binomial Distribution:

• Use binompdf(n, p, r) for \(P(X = r)\)
• Use binomcdf(n, p, r) for \(P(X \leq r)\)
• For \(P(X \geq r)\), compute 1 − binomcdf(n, p, r − 1)

For Normal Distribution:

• Use normalcdf(lower, upper, μ, σ) for probabilities
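• Use invNorm(probability, μ, σ) to find the value \(x\) for which \(P(X \leq x)\) equals the given probability (area to the left)
• Works with any normal distribution directly, so standardizing first is not required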
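Outside the exam, calculator results can be cross-checked in Python. The functions below are hypothetical stand-ins that mirror the calculator commands listed above; they are not part of any library, and the name `inv_norm` and the default arguments are assumptions for this sketch.

```python
from math import comb
from statistics import NormalDist

def binompdf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomcdf(n, p, r):
    """P(X <= r) for X ~ B(n, p): sum the pmf from 0 to r."""
    return sum(binompdf(n, p, k) for k in range(r + 1))

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower <= X <= upper) for X ~ N(mu, sigma^2)."""
    d = NormalDist(mu, sigma)
    return d.cdf(upper) - d.cdf(lower)

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value x such that P(X <= x) = area (area to the left)."""
    return NormalDist(mu, sigma).inv_cdf(area)

print(f"{binomcdf(10, 0.5, 4):.4f}")       # P(X <= 4), X ~ B(10, 0.5): 0.3770
print(f"{1 - binomcdf(10, 0.5, 4):.4f}")   # P(X >= 5): 0.6230
print(f"{normalcdf(-1.96, 1.96):.4f}")     # 0.9500
print(f"{inv_norm(0.975):.4f}")            # 1.9600
```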

Important:

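Always use your GDC for these calculations; working them out by hand is time-consuming and error-prone. The function names above follow TI-style syntax, and other calculator models use different names and argument orders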

💡 Exam Tip: Most distribution formulas are given in the IB formula booklet including binomial probability, mean, variance, normal standardization, and continuous distribution formulas. Always use your GDC for binomial and normal distribution calculations - it's faster and more accurate! Remember: Binomial needs 4 conditions (fixed n, independent, two outcomes, constant p). For continuous distributions (HL), probability at a single point is zero. Normal distribution is continuous and bell-shaped; binomial is discrete.