Random Variables and Distributions
random variable, distribution, pmf, pdf, cdf
1 Role
This page turns probability from event language into quantity language.
Instead of asking only whether an event happened, we now attach a numerical value to each outcome and study how that value behaves.
2 First-Pass Promise
Read this page after Sample Spaces, Events, and Conditioning.
If you stop here, you should still understand:
- what a random variable really is
- how a distribution differs from the random variable itself
- when to use a PMF, PDF, or CDF
- how to build a simple distribution from an experiment
3 Why It Matters
Most probability in statistics, ML, and engineering is not written directly in terms of raw outcomes in a sample space.
It is written in terms of random quantities:
- the number of errors in a transmission
- the runtime of an algorithm
- the noise in a sensor
- the loss on a fresh data point
- the label of a sampled example
So random variables are the bridge from a probability model to a mathematical object we can average, transform, estimate, or optimize over.
4 Prerequisite Recall
- a sample space \(\Omega\) is the set of possible outcomes
- an event is a subset of \(\Omega\)
- conditional probability changes the relevant world by restricting to observed information
5 Intuition
A random variable does not create randomness. The randomness is already present in the experiment.
A random variable is a rule that reads an outcome and reports a number.
So if \(\omega\) is the outcome, the value \(X(\omega)\) is the quantity we care about. Different outcomes can produce the same value of \(X\), which is why the probability of a value is found by grouping together all outcomes that map to it.
That leads to the distribution of \(X\): a description of how the values of \(X\) are spread out probabilistically.
The clean picture is:
experiment -> outcomes -> random variable -> distribution of values
6 Formal Core
Definition 1 (Random Variable) A random variable \(X\) is a function from the sample space \(\Omega\) to the real numbers:
\[ X : \Omega \to \mathbb{R}. \]
For each outcome \(\omega \in \Omega\), the value \(X(\omega)\) is the numerical quantity attached to that outcome.
Definition 2 (Distribution) The distribution of \(X\) tells us the probabilities of the values that \(X\) can take.
For a discrete random variable,
\[ p_X(x) = P(X=x) \]
is the probability mass function, or PMF.
For any random variable, the cumulative distribution function is
\[ F_X(t) = P(X \le t). \]
For a continuous random variable, we often describe the distribution by a density \(f_X\) such that
\[ P(a \le X \le b) = \int_a^b f_X(x)\,dx. \]
The density itself is not a probability at a point. It is a function whose integral over an interval gives probability.
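To make this point concrete, here is a minimal numerical sketch in Python; the Uniform\((0, 1/2)\) example and the step count are illustrative choices, not part of the text above. The density equals 2 on its support, yet every probability computed from it stays in \([0,1]\).

```python
# Density of the Uniform(0, 1/2) distribution: f(x) = 2 on (0, 1/2), 0 elsewhere.
def f(x):
    return 2.0 if 0.0 < x < 0.5 else 0.0

# Approximate P(a <= X <= b) with a midpoint Riemann sum of the density.
def prob(a, b, steps=100_000):
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

print(f(0.25))         # 2.0  -- a density value, NOT a probability
print(prob(0.0, 0.5))  # ~1.0 -- the density integrates to 1 over the support
print(prob(0.1, 0.2))  # ~0.2 -- interval probabilities always land in [0, 1]
```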
Proposition 1 (Key Statement) To compute the distribution of a discrete random variable \(X\), group together all outcomes \(\omega\) that produce the same value \(X(\omega)=x\) and add their probabilities:
\[ P(X=x) = P(\{\omega \in \Omega : X(\omega)=x\}). \]
This is the basic move that turns outcome models into distribution models.
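As a sketch of this grouping move in code (Python, with an illustrative random variable of our choosing: the larger of two fair dice rolls), enumerate the equally likely outcomes, apply \(X\), and add probabilities by value.

```python
from fractions import Fraction
from itertools import product

# Outcome model: 36 equally likely ordered pairs from two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))

# Random variable: a rule that reads an outcome and reports a number.
def X(omega):
    return max(omega)  # the larger of the two rolls

# PMF by grouping: add probabilities of all outcomes mapping to the same value.
pmf = {}
for omega in outcomes:
    x = X(omega)
    pmf[x] = pmf.get(x, 0) + p_outcome

for x in sorted(pmf):
    print(x, pmf[x])  # P(X=1) = 1/36, ..., P(X=6) = 11/36
```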
7 Worked Example
Flip a fair coin twice and let \(X\) be the number of heads.
The sample space is
\[ \Omega = \{HH, HT, TH, TT\}. \]
Define
\[ X(HH)=2, \qquad X(HT)=1, \qquad X(TH)=1, \qquad X(TT)=0. \]
So the possible values of \(X\) are \(0,1,2\).
To build the PMF, group the outcomes by the value of \(X\):
- \(X=0\) happens only on \(TT\), so \(P(X=0)=1/4\)
- \(X=1\) happens on \(HT\) and \(TH\), so \(P(X=1)=2/4=1/2\)
- \(X=2\) happens only on \(HH\), so \(P(X=2)=1/4\)
Therefore
\[ p_X(0)=\frac{1}{4}, \qquad p_X(1)=\frac{1}{2}, \qquad p_X(2)=\frac{1}{4}. \]
The CDF is then:
\[ F_X(t)= \begin{cases} 0, & t < 0 \\ 1/4, & 0 \le t < 1 \\ 3/4, & 1 \le t < 2 \\ 1, & t \ge 2. \end{cases} \]
This example shows three levels clearly:
- the sample space stores the raw outcomes
- the random variable compresses those outcomes into a numerical quantity
- the distribution summarizes how often each value occurs
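If you want to verify the table above mechanically, a short Python sketch rebuilds both the PMF and the step CDF of this example:

```python
from fractions import Fraction

# The four equally likely outcomes of two fair coin flips.
outcomes = ["HH", "HT", "TH", "TT"]

# X counts the heads in an outcome; group outcomes by that count.
pmf = {}
for omega in outcomes:
    x = omega.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, 4)

for x in sorted(pmf):
    print(f"P(X={x}) = {pmf[x]}")  # 1/4, 1/2, 1/4

# The CDF at t is the running total P(X <= t).
def F(t):
    return sum(p for x, p in pmf.items() if x <= t)

print(F(-1), F(0), F(1.5), F(2))  # 0 1/4 3/4 1 -- matches the cases above
```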
8 Computation Lens
For a discrete random variable, the practical workflow is:
- write the underlying outcome model
- define the quantity of interest as a function on outcomes
- group outcomes that give the same value
- add their probabilities
For continuous variables, the same logic still holds, but we usually work through densities or cumulative distribution functions rather than explicit outcome lists.
This is why CDFs are useful even when PDFs are hard to manipulate: the CDF exists for every random variable and encodes the full distribution, while a density exists only in the continuous setting.
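As one illustration of working through the CDF alone, the standard normal CDF can be written with the error function, and interval probabilities come from CDF differences; the normal model here is only an example, not a distribution introduced on this page.

```python
import math

# Standard normal CDF via the error function: F(t) = (1 + erf(t / sqrt(2))) / 2.
def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# P(a <= X <= b) = F(b) - F(a): no outcome list needed, only the CDF.
a, b = -1.0, 1.0
print(normal_cdf(b) - normal_cdf(a))  # ~0.6827, the familiar one-sigma mass
```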
9 Application Lens
Random variables are the language behind nearly every quantitative object in modern applied theory:
- in statistics, an estimator is a random variable because the data are random
- in ML, the loss on a fresh example is a random variable
- in signal processing, measurement noise is modeled as a random variable
- in reliability, time to failure is a random variable
So the distinction between outcome, random variable, and distribution is not bookkeeping. It is what keeps a probabilistic argument logically stable.
10 Stop Here For First Pass
If you can now explain:
- why a random variable is a function on outcomes
- why the distribution is not the same object as the random variable
- how PMFs, PDFs, and CDFs play different roles
- how to derive a simple PMF from a finite experiment
then this page has done its main job.
11 Go Deeper
The next page is Expectation, Variance, Covariance, where probability starts summarizing random variables by center, spread, and dependence.
12 Optional Paper Bridge
- MIT RES.6-012 lecture notes - Second pass: official MIT notes that separate outcome models, PMFs, PDFs, and CDFs very cleanly. Checked 2026-04-24.
- Statistics 110: Probability - Paper bridge: official Harvard course hub that keeps random variables tied to concrete examples instead of only formulas. Checked 2026-04-24.
13 Optional After First Pass
If you want more practice before moving on:
- define two different random variables on the same sample space (a starter sketch follows this list)
- derive both their PMFs and compare them
- compute a CDF from a small PMF and sketch its step shape
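For the first practice item, here is a minimal starting point in Python; both random variables are illustrative choices on the two-coin space from the worked example:

```python
from fractions import Fraction

# One sample space, two different random variables defined on it.
outcomes = ["HH", "HT", "TH", "TT"]

def X(omega):  # number of heads
    return omega.count("H")

def Y(omega):  # indicator that the first flip is heads
    return 1 if omega[0] == "H" else 0

def pmf_of(rv):
    pmf = {}
    for omega in outcomes:
        v = rv(omega)
        pmf[v] = pmf.get(v, 0) + Fraction(1, 4)
    return pmf

for name, rv in [("X", X), ("Y", Y)]:
    for v, p in sorted(pmf_of(rv).items()):
        print(f"P({name}={v}) = {p}")  # same outcomes, two different PMFs
```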
14 Common Mistakes
- thinking a random variable is the same thing as a random outcome
- confusing a value like \(X=2\) with the full random variable \(X\)
- treating a density value \(f_X(x)\) as if it were a probability at a point
- forgetting that different outcomes can map to the same value of \(X\)
15 Exercises
- Roll a fair six-sided die and let \(X\) be the parity indicator: \(X=1\) if the roll is even and \(X=0\) otherwise. Write the PMF of \(X\).
- Roll two fair dice and let \(Y\) be their sum. What is \(P(Y=7)\)? (You can sanity-check your answer with the simulation sketch after this list.)
- For the two-coin example above, compute \(F_X(0.5)\), \(F_X(1)\), and \(F_X(3)\).
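To sanity-check an answer empirically, a small Monte Carlo sketch works; it is shown here for the two-dice sum of the second exercise, and the sample size is an arbitrary choice:

```python
import random

# Estimate P(Y = 7) for Y = sum of two fair dice by simulation.
trials = 200_000
hits = sum(
    1 for _ in range(trials)
    if random.randint(1, 6) + random.randint(1, 6) == 7
)
print(hits / trials)  # should hover near the exact PMF value you derived
```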
16 Sources and Further Reading
- Harvard Stat 110 - First pass: strong official probability course hub with consistently good examples and intuition. Checked 2026-04-24.
- Penn State STAT 414 - First pass: official open course notes with clean coverage of discrete and continuous random variables. Checked 2026-04-24.
- MIT RES.6-012 lecture notes - Second pass: official MIT notes with a strong theory-first treatment of random variables and distributions. Checked 2026-04-24.
Sources checked online on 2026-04-24:
- Harvard Stat 110 course homepage
- Penn State STAT 414 overview
- MIT RES.6-012 lecture notes page