# MIT Probability Reference

### Set Equations

P(A|B) = (P(A nn B)) / (P(B)) provided P(B)!=0

 = (|A nn B|)/(|B|) " when all outcomes are equally likely"

"Bayes' Theorem" = P(B|A) = (P(A|B)*P(B))/(P(A))

(A uu B)^c = A^c nn B^c

(A nn B)^c = A^c uu B^c
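
The identities above can be checked by brute-force counting on a small sample space. The sketch below assumes two fair six-sided dice; the events `A` and `B` are illustrative choices, not part of the reference itself.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) for equally likely outcomes: |event| / |omega|."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] + w[1] == 8}   # the sum is 8
B = {w for w in omega if w[0] % 2 == 0}      # the first die is even

# Conditional probability: P(A|B) = P(A nn B) / P(B)
p_A_given_B = prob(A & B) / prob(B)
p_B_given_A = prob(A & B) / prob(A)

# Bayes' theorem: P(B|A) = P(A|B) P(B) / P(A)
bayes_rhs = p_A_given_B * prob(B) / prob(A)

# De Morgan's laws, with complements taken inside omega
de_morgan_union = (omega - (A | B)) == ((omega - A) & (omega - B))
de_morgan_inter = (omega - (A & B)) == ((omega - A) | (omega - B))

print(p_A_given_B, p_B_given_A, bayes_rhs, de_morgan_union, de_morgan_inter)
```

Using `Fraction` keeps every probability exact, so the two sides of Bayes' theorem can be compared with `==` rather than a floating-point tolerance.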

If A is independent of B:

P(A nn B) = P(A)*P(B)

P(A|B) = P(A)

P(B|A) = P(B)

"Law of total probability" = P(A) = P(A|B) * P(B) + P(A|B^c)*P(B^c)

### Bernoulli Distribution

Range X : 0, 1

### Binomial Distribution

This is the sum of n independent Bernoulli(p) variables.

Range X : 0, 1, …, n

Var(X)=np(1-p)
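
As a quick numerical check of the variance formula, the sketch below builds the binomial pmf and computes the mean and variance directly from it. The pmf formula C(n,k) p^k (1-p)^(n-k) and the mean np are standard facts not listed in this section, and the values of `n` and `p` are arbitrary.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3   # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

# Compare with 1, n*p, and n*p*(1-p)
print(sum(pmf), mean, var)
print(isclose(var, n * p * (1 - p)))
```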

### Geometric Distribution

X counts the failures before the first success in a sequence of independent Bernoulli(p) trials.

Range X : 0, 1, 2, …

Var(X) = (1-p) / p^2
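
The variance formula can be checked numerically. This sketch assumes the convention matching the range above (X counts failures before the first success, so p(k) = (1-p)^k p) and truncates the infinite sum; the value of `p` is arbitrary.

```python
from math import isclose

p = 0.25  # illustrative success probability

# pmf on the range 0, 1, 2, ...: k failures, then one success.
# The infinite sum is truncated at 2000 terms; the dropped tail is negligible.
pmf = [(1 - p) ** k * p for k in range(2000)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

# Compare with (1-p)/p and (1-p)/p^2
print(mean, var)
```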

### Uniform Distribution

Where all outcomes are equally likely.

Range X : 1, 2, ..., n

p(k) = 1/n

E(X) = (n+1) / 2

Var(X) = (n^2-1)/12
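
For a concrete instance, the sketch below takes n = 6 (a fair die, chosen only for illustration) and confirms the mean and variance formulas with exact fractions.

```python
from fractions import Fraction

n = 6                      # illustrative: a fair six-sided die
outcomes = range(1, n + 1)
p = Fraction(1, n)         # p(k) = 1/n for every outcome k

mean = sum(k * p for k in outcomes)
var = sum((k - mean) ** 2 * p for k in outcomes)

# Compare with (n+1)/2 and (n^2-1)/12
print(mean, var)
```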

### Exponential Distribution

Models: Waiting times

X ~ exponential(lambda) or exp(lambda)

Parameter: lambda (called the rate parameter)

• Range: [0,oo)
• Density: f(x) = lambda e^(-lambda x) for 0 <= x
• Distribution: F(x) = 1 - e^(-lambda x) for 0 <= x
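
The closed-form cdf is the integral of the density from 0 to x. The sketch below checks this numerically with a simple midpoint rule; the rate `lam`, the evaluation point, and the helper `integrate` are illustrative assumptions.

```python
from math import exp, isclose

lam = 2.0  # illustrative rate parameter

def density(x):
    """f(x) = lam * e^(-lam x) for x >= 0, and 0 otherwise."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

def cdf(x):
    """F(x) = 1 - e^(-lam x) for x >= 0, and 0 otherwise."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

x = 1.5
approx = integrate(density, 0.0, x)
print(approx, cdf(x))
```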

### Normal Distribution

• Parameters: mu, sigma
• Range: (-oo,oo)
• Notation: "normal"(mu,sigma^2) or N(mu,sigma^2)
• Density: f(x)=1/(sigma sqrt(2pi)) e^(-(x-mu)^2/(2sigma^2))
• Distribution: F(x) has no formula, so use tables or software such as pnorm in R to compute F(x).
• pnorm(.6,0,1) returns Phi(.6) = P(Z <= .6) ~~ .726 for the standard normal distribution; qnorm(.6,0,1) returns its .6 quantile.
• Models: Measurement error, intelligence/ability, height, averages of lots of data.
• Standard Normal Distribution: Z ~ N(0,1) has mean 0 and variance 1; its cdf is denoted Phi(z).
• Standard Normal Density: phi(z) = 1/sqrt(2pi) e^(-z^2/2)
• N(mu,sigma^2) has mean mu, variance sigma^2, and standard deviation sigma.
• P(-1 <= Z <= 1) ~~ .68, P(-2 <= Z <= 2) ~~ .95, P(-3 <= Z <= 3) ~~ .997
• P(Z <= 1) ~~ .84, P(Z <= 2) ~~ .977, P(Z <= 3) ~~ .999
• Phi(x) = P(Z <= x)
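
The reference points to R for computing F(x). As a sketch in Python, the standard library's `statistics.NormalDist` offers the same two operations: `cdf` corresponds to R's `pnorm` and `inv_cdf` to `qnorm`. Note that `NormalDist` takes the standard deviation sigma, not the variance sigma^2. The distribution `X` below is a hypothetical example.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # standard normal N(0,1)

phi_06 = Z.cdf(0.6)       # like pnorm(.6, 0, 1): Phi(0.6) = P(Z <= 0.6)
q_06 = Z.inv_cdf(0.6)     # like qnorm(.6, 0, 1): the 0.6 quantile
print(phi_06, q_06)

# Rule-of-thumb probabilities within 1, 2, and 3 standard deviations
for k in (1, 2, 3):
    within = Z.cdf(k) - Z.cdf(-k)
    print(f"P(-{k} <= Z <= {k}) = {within:.4f}, P(Z <= {k}) = {Z.cdf(k):.4f}")

# Standardizing a general normal: P(X <= x) = Phi((x - mu) / sigma)
X = NormalDist(mu=100.0, sigma=15.0)   # hypothetical N(100, 15^2)
x = 130.0
z = (x - X.mean) / X.stdev
print(X.cdf(x), Z.cdf(z))
```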

### Discrete Random Variables

Random variable X assigns a number to each outcome: X : Omega -> R

X = a " denotes event " {omega | X(omega) = a}

"probability mass function (pmf) of X is given by: " p(a) = P(X=a)

"Cumulative distribution function (cdf) of X is given by: " F(a) = P(X<=a)

### Continuous Random Variables

"Probability density function (pdf) " f(x) " satisfies " P(c<=X<=d) = int_c^d f(x) dx " with " f(x)>=0

"Cumulative distribution function (cdf)" = F(x) = P(X<=x) = int_(-oo)^x f(t) dt

Properties of the cdf (Same as for discrete distributions)

• (Definition) F(x) = P(X<=x)
• 0 <= F(x) <= 1
• non-decreasing
• lim_(x->-oo) F(x) = 0
• lim_(x->oo) F(x) = 1
• P(c < X <= d) = F(d) - F(c)
• F'(x) = f(x) (continuous case only)

### Expected Value (mean or average)

• weighted average = E(X) = sum_(i=1)^n x_i * p(x_i)
• E(X+Y) = E(X) + E(Y)
• E(aX+b) = a*E(X) + b
• E(h(X)) = sum_i h(x_i) * p(x_i)
• For continuous X with density f(x) on [a,b]: E(X) = int_a^b x * f(x) dx (f(x) has units of probability per unit of x).
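
The discrete formulas above translate directly into a few lines of code. In this sketch the pmf is a made-up example and `expect` is a hypothetical helper computing E(h(X)).

```python
from fractions import Fraction

# Hypothetical pmf: values x_i mapped to probabilities p(x_i)
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 5: Fraction(1, 6)}

def expect(h, pmf):
    """E(h(X)) = sum_i h(x_i) * p(x_i)."""
    return sum(h(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)               # E(X)
a, b = 3, 4
lhs = expect(lambda x: a * x + b, pmf)        # E(aX + b), computed directly
second_moment = expect(lambda x: x * x, pmf)  # E(X^2), with h(x) = x^2

print(mean, lhs, a * mean + b, second_moment)
```

Computing E(aX+b) directly from the pmf and comparing it with a*E(X)+b is a small check of linearity; it also shows that E(X^2) is generally not E(X)^2.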

### Variance

• "mean" = E(X) = mu
• variance of X = Var(X) = E((X-mu)^2) = sigma^2 = sum_(i=1)^n p(x_i)(x_i-mu)^2
• standard deviation = sigma = sqrt(Var(X))
• Var(aX+b) = a^2 Var(X)
• Var(X) = E(X^2) - E(X)^2
• If X and Y are independent then: Var(X+Y) = Var(X) + Var(Y)
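
The variance rules above can be verified exactly on a fair die. This is a sketch with hypothetical helpers (`pmf_of`, `pmf_of_sum`); the independence of the two dice is built in by multiplying their probabilities.

```python
from fractions import Fraction
from itertools import product

die = {k: Fraction(1, 6) for k in range(1, 7)}   # pmf of a fair die

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    """Var(X) = sum_i p(x_i) (x_i - mu)^2."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def pmf_of(h, pmf):
    """pmf of h(X), merging outcomes that map to the same value."""
    out = {}
    for x, p in pmf.items():
        out[h(x)] = out.get(h(x), 0) + p
    return out

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y for independent X and Y: joint probability is p(x) * p(y)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out

a, b = 2, 3
scaled = var(pmf_of(lambda x: a * x + b, die))                   # Var(aX + b)
shortcut = mean(pmf_of(lambda x: x * x, die)) - mean(die) ** 2   # E(X^2) - E(X)^2
two_dice = var(pmf_of_sum(die, die))                             # Var(X + Y)

print(var(die), scaled, shortcut, two_dice)
```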

### Quantiles

• median is x for which P(X<=x) = P(X>=x)
• median is when cdf F(x) = P(X<=x) = .5
• The pth quantile of X is the value q_p such that F(q_p)=P(X<=q_p)=p. In this notation q_.5 is the median.
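
When the cdf can be inverted, quantiles have a closed form. As a sketch, take the exponential cdf F(x) = 1 - e^(-lambda x): solving F(q_p) = p gives q_p = -ln(1-p)/lambda. The rate `lam` is an arbitrary illustrative value.

```python
from math import exp, log, isclose

lam = 2.0  # illustrative rate parameter

def cdf(x):
    """Exponential cdf: F(x) = 1 - e^(-lam x) for x >= 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def quantile(p):
    """Solve F(q_p) = p for the exponential cdf: q_p = -ln(1 - p) / lam."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -log(1 - p) / lam

median = quantile(0.5)   # q_.5, which equals ln(2) / lam
print(median, cdf(median))
```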

### Central Limit Theorem & Law of Large Numbers

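• Setup: X_1, ..., X_n are independent and identically distributed with mean mu and variance sigma^2; the sample mean is bar X_n = (X_1 + ... + X_n)/n.
• LoLN: As n grows, the probability that bar X_n is close to mu goes to 1.
• LoLN: lim_(n->oo) P(|bar X_n - mu| < alpha) = 1 for every alpha > 0
• CLT: As n grows, the distribution of bar X_n is approximately the normal distribution N(mu,sigma^2/n).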
• Z = standardization of X = (X - mu)/sigma
• Z has mean 0, standard deviation 1
• If X has a normal distribution, Z has the standard normal distribution N(0,1)
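
Both theorems can be illustrated by simulation. The sketch below averages rolls of a fair die (mean 3.5, variance 35/12); the seed, sample sizes, and number of trials are arbitrary choices, and the results are approximate by nature.

```python
import random

rng = random.Random(0)        # fixed seed so the run is repeatable
faces = range(1, 7)
mu, sigma2 = 3.5, 35 / 12     # mean and variance of one fair die roll

def sample_mean(n):
    """Average of n independent die rolls."""
    return sum(rng.choices(faces, k=n)) / n

# LoLN: with many rolls the sample mean lands close to mu.
big = sample_mean(10_000)

# CLT: sample means of n rolls are approximately N(mu, sigma2 / n).
n, trials = 50, 2000
means = [sample_mean(n) for _ in range(trials)]
var_of_means = sum((m - mu) ** 2 for m in means) / trials

# Standardize each sample mean and count how many fall within one
# standard deviation; for a normal distribution this is roughly .68.
z = [(m - mu) / (sigma2 / n) ** 0.5 for m in means]
within_one = sum(abs(v) <= 1 for v in z) / trials

print(big, var_of_means, sigma2 / n, within_one)
```

Because a die is discrete, the fraction within one standard deviation will not match .68 exactly at n = 50; it should move closer as n grows.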