Random Variables and Distributions
random variable, distribution, pmf, pdf, cdf
1 Role
This page turns probability from event language into quantity language.
Instead of asking only whether an event happened, we now attach a numerical value to each outcome and study how that value behaves.
2 First-Pass Promise
Read this page after Sample Spaces, Events, and Conditioning.
If you stop here, you should still understand:
- what a random variable really is
- how a distribution differs from the random variable itself
- when to use a PMF, PDF, or CDF
- how to build a simple distribution from an experiment
3 Why It Matters
Most probability in statistics, ML, and engineering is not written directly in terms of raw outcomes in a sample space.
It is written in terms of random quantities:
- the number of errors in a transmission
- the runtime of an algorithm
- the noise in a sensor
- the loss on a fresh data point
- the label of a sampled example
So random variables are the bridge from a probability model to a mathematical object we can average, transform, estimate, or optimize over.
4 Prerequisite Recall
- a sample space \(\Omega\) is the set of possible outcomes
- an event is a subset of \(\Omega\)
- conditional probability changes the relevant world by restricting to observed information
5 Intuition
A random variable does not create randomness. The randomness is already present in the experiment.
A random variable is a rule that reads an outcome and reports a number.
So if \(\omega\) is the outcome, the value \(X(\omega)\) is the quantity we care about. Different outcomes can produce the same value of \(X\), which is why the probability of a value is found by grouping together all outcomes that map to it.
That leads to the distribution of \(X\): a description of how the values of \(X\) are spread out probabilistically.
The clean picture is:
experiment -> outcomes -> random variable -> distribution of values
6 Formal Core
Definition 1 (Random Variable) A random variable \(X\) is a function from the sample space \(\Omega\) to the real numbers:
\[ X : \Omega \to \mathbb{R}. \]
For each outcome \(\omega \in \Omega\), the value \(X(\omega)\) is the numerical quantity attached to that outcome.
Definition 2 (Distribution) The distribution of \(X\) tells us the probabilities of the values that \(X\) can take.
For a discrete random variable,
\[ p_X(x) = P(X=x) \]
is the probability mass function, or PMF.
For any random variable, the cumulative distribution function is
\[ F_X(t) = P(X \le t). \]
For a continuous random variable, we often describe the distribution by a density \(f_X\) such that
\[ P(a \le X \le b) = \int_a^b f_X(x)\,dx. \]
The density itself is not a probability at a point. It is a function whose integral over an interval gives probability.
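To make this point concrete, here is a minimal numerical sketch in Python; the Uniform\((0, 1/2)\) example and the step count are illustrative choices, not part of the text above. The density equals 2 on its support, yet every probability computed from it stays in \([0,1]\).

```python
# Density of the Uniform(0, 1/2) distribution: f(x) = 2 on (0, 1/2), 0 elsewhere.
def f(x):
    return 2.0 if 0.0 < x < 0.5 else 0.0

# Approximate P(a <= X <= b) with a midpoint Riemann sum of the density.
def prob(a, b, steps=100_000):
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

print(f(0.25))         # 2.0  -- a density value, NOT a probability
print(prob(0.0, 0.5))  # ~1.0 -- the density integrates to 1 over the support
print(prob(0.1, 0.2))  # ~0.2 -- interval probabilities always land in [0, 1]
```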
Proposition 1 (Key Statement) To compute the distribution of a discrete random variable \(X\), group together all outcomes \(\omega\) that produce the same value \(X(\omega)=x\) and add their probabilities:
\[ P(X=x) = P(\{\omega \in \Omega : X(\omega)=x\}). \]
This is the basic move that turns outcome models into distribution models.
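As a sketch of this grouping move in code (Python, with an illustrative random variable of our choosing: the larger of two fair dice rolls), enumerate the equally likely outcomes, apply \(X\), and add probabilities by value.

```python
from fractions import Fraction
from itertools import product

# Outcome model: 36 equally likely ordered pairs from two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))

# Random variable: a rule that reads an outcome and reports a number.
def X(omega):
    return max(omega)  # the larger of the two rolls

# PMF by grouping: add probabilities of all outcomes mapping to the same value.
pmf = {}
for omega in outcomes:
    x = X(omega)
    pmf[x] = pmf.get(x, 0) + p_outcome

for x in sorted(pmf):
    print(x, pmf[x])  # P(X=1) = 1/36, ..., P(X=6) = 11/36
```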
7 Worked Example
Flip a fair coin twice and let \(X\) be the number of heads.
The sample space is
\[ \Omega = \{HH, HT, TH, TT\}. \]
Define
\[ X(HH)=2, \qquad X(HT)=1, \qquad X(TH)=1, \qquad X(TT)=0. \]
So the possible values of \(X\) are \(0,1,2\).
To build the PMF, group the outcomes by the value of \(X\):
- \(X=0\) happens only on \(TT\), so \(P(X=0)=1/4\)
- \(X=1\) happens on \(HT\) and \(TH\), so \(P(X=1)=2/4=1/2\)
- \(X=2\) happens only on \(HH\), so \(P(X=2)=1/4\)
Therefore
\[ p_X(0)=\frac{1}{4}, \qquad p_X(1)=\frac{1}{2}, \qquad p_X(2)=\frac{1}{4}. \]
The CDF is then:
\[ F_X(t)= \begin{cases} 0, & t < 0 \\ 1/4, & 0 \le t < 1 \\ 3/4, & 1 \le t < 2 \\ 1, & t \ge 2. \end{cases} \]
This example shows three levels clearly:
- the sample space stores the raw outcomes
- the random variable compresses those outcomes into a numerical quantity
- the distribution summarizes how often each value occurs
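If you want to verify the table above mechanically, a short Python sketch rebuilds both the PMF and the step CDF of this example:

```python
from fractions import Fraction

# The four equally likely outcomes of two fair coin flips.
outcomes = ["HH", "HT", "TH", "TT"]

# X counts the heads in an outcome; group outcomes by that count.
pmf = {}
for omega in outcomes:
    x = omega.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, 4)

for x in sorted(pmf):
    print(f"P(X={x}) = {pmf[x]}")  # 1/4, 1/2, 1/4

# The CDF at t is the running total P(X <= t).
def F(t):
    return sum(p for x, p in pmf.items() if x <= t)

print(F(-1), F(0), F(1.5), F(2))  # 0 1/4 3/4 1 -- matches the cases above
```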
8 Computation Lens
For a discrete random variable, the practical workflow is:
- write the underlying outcome model
- define the quantity of interest as a function on outcomes
- group outcomes that give the same value
- add their probabilities
For continuous variables, the same logic still holds, but we usually work through densities or cumulative distribution functions rather than explicit outcome lists.
This is why CDFs are useful even when PDFs are hard to manipulate: the CDF exists for every random variable and encodes the full distribution, while a density exists only in the continuous setting.
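As one illustration of working through the CDF alone, the standard normal CDF can be written with the error function, and interval probabilities come from CDF differences; the normal model here is only an example, not a distribution introduced on this page.

```python
import math

# Standard normal CDF via the error function: F(t) = (1 + erf(t / sqrt(2))) / 2.
def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# P(a <= X <= b) = F(b) - F(a): no outcome list needed, only the CDF.
a, b = -1.0, 1.0
print(normal_cdf(b) - normal_cdf(a))  # ~0.6827, the familiar one-sigma mass
```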
9 Application Lens
Random variables are the language behind nearly every quantitative object in modern applied theory:
- in statistics, an estimator is a random variable because the data are random
- in ML, the loss on a fresh example is a random variable
- in signal processing, measurement noise is modeled as a random variable
- in reliability, time to failure is a random variable
So the distinction between outcome, random variable, and distribution is not bookkeeping. It is what keeps a probabilistic argument logically stable.
10 Stop Here For First Pass
If you can now explain:
- why a random variable is a function on outcomes
- why the distribution is not the same object as the random variable
- how PMFs, PDFs, and CDFs play different roles
- how to derive a simple PMF from a finite experiment
then this page has done its main job.
11 Go Deeper
The next page is Expectation, Variance, Covariance, where probability starts summarizing random variables by center, spread, and dependence.
12 Optional Paper Bridge
- MIT RES.6-012 lecture notes - Second pass: official MIT notes that separate outcome models, PMFs, PDFs, and CDFs very cleanly. Checked 2026-04-24.
- Statistics 110: Probability - Paper bridge: official Harvard course hub that keeps random variables tied to concrete examples instead of only formulas. Checked 2026-04-24.
13 Optional After First Pass
If you want more practice before moving on:
- define two different random variables on the same sample space (a starter sketch follows this list)
- derive both their PMFs and compare them
- compute a CDF from a small PMF and sketch its step shape
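For the first practice item, here is a minimal starting point in Python; both random variables are illustrative choices on the two-coin space from the worked example:

```python
from fractions import Fraction

# One sample space, two different random variables defined on it.
outcomes = ["HH", "HT", "TH", "TT"]

def X(omega):  # number of heads
    return omega.count("H")

def Y(omega):  # indicator that the first flip is heads
    return 1 if omega[0] == "H" else 0

def pmf_of(rv):
    pmf = {}
    for omega in outcomes:
        v = rv(omega)
        pmf[v] = pmf.get(v, 0) + Fraction(1, 4)
    return pmf

for name, rv in [("X", X), ("Y", Y)]:
    for v, p in sorted(pmf_of(rv).items()):
        print(f"P({name}={v}) = {p}")  # same outcomes, two different PMFs
```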
14 Common Mistakes
- thinking a random variable is the same thing as a random outcome
- confusing a value like \(X=2\) with the full random variable \(X\)
- treating a density value \(f_X(x)\) as if it were a probability at a point
- forgetting that different outcomes can map to the same value of \(X\)
15 Exercises
- Roll a fair six-sided die and let \(X\) be the parity indicator: \(X=1\) if the roll is even and \(X=0\) otherwise. Write the PMF of \(X\).
- Roll two fair dice and let \(Y\) be their sum. What is \(P(Y=7)\)? (You can sanity-check your answer with the simulation sketch after this list.)
- For the two-coin example above, compute \(F_X(0.5)\), \(F_X(1)\), and \(F_X(3)\).
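To sanity-check an answer empirically, a small Monte Carlo sketch works; it is shown here for the two-dice sum of the second exercise, and the sample size is an arbitrary choice:

```python
import random

# Estimate P(Y = 7) for Y = sum of two fair dice by simulation.
trials = 200_000
hits = sum(
    1 for _ in range(trials)
    if random.randint(1, 6) + random.randint(1, 6) == 7
)
print(hits / trials)  # should hover near the exact PMF value you derived
```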
16 Sources and Further Reading
- Harvard Stat 110 - First pass: strong official probability course hub with consistently good examples and intuition. Checked 2026-04-24.
- Penn State STAT 414 - First pass: official open course notes with clean coverage of discrete and continuous random variables. Checked 2026-04-24.
- MIT RES.6-012 lecture notes - Second pass: official MIT notes with a strong theory-first treatment of random variables and distributions. Checked 2026-04-24.
Sources checked online on 2026-04-24:
- Harvard Stat 110 course homepage
- Penn State STAT 414 overview
- MIT RES.6-012 lecture notes page