UBC DSCI 200 – prob-distributions

Random Variables and Distributions

DSCI 200

Katie Burak, Gabriela V. Cohen Freue

Last modified – 14 January 2026

Attribution

This material is based on content adapted from

Learning Objectives

Distinguish between sampling from a finite population and model-based approaches to randomness.
Define and interpret basic probability concepts such as a sample space, event and random variable.
Classify random variables as discrete or continuous, with supporting examples.
Recognize common types of probability distributions and when they are appropriate.

Last time: Sampling from a finite population

We sampled from some finite population (e.g., all students in a university)
Our sample is a subset of actual individuals or observations
Variation occurred because each sample was selected at random
- different samples yield different results.
This random selection generates a random sample.

Today, we’ll focus on defining and characterizing random phenomena more broadly, beyond random selection

Randomness beyond sampling

Words like probability, chance, and odds are commonly used to express uncertainty:

‘Very high’ probability that Canada will see coronavirus, says Toronto respirologist (CBC News, Jan 2020)
CDC says flu activity probably has not peaked amid record-breaking season (CNN, Jan 2026)
The riskiest asteroid on record now has near-zero chance of hitting Earth (CNN, Feb 2025)
The probability of a recession has fallen to 40% (JP Morgan, May 2025)
2025-26 Stanley Cup odds: Avalanche, Hurricanes, Lightning lead favorites (ESPN, Jan 2026)
The odds of winning the Powerball lottery with a single ticket are about 1 in 300 million.

Each story reflects a case where the outcome is uncertain, capturing randomness.

Randomness can come from:

physical events (e.g., flipping a coin)
selection mechanisms (e.g., selecting individuals from a population)
experimental designs (e.g., random assignment in a randomized experiment)
complex systems (e.g., stock market fluctuations)
simulations or algorithms (e.g., random number generator in software)
biological processes (e.g., random assortment of chromosomes)

In a random phenomenon there is uncertainty about which of several potential outcomes will take place. Outcomes or events may not be equally likely.

Terminology

Outcomes: possible results of the random phenomenon, not necessarily numeric (e.g., side of a coin, students selected for a survey)
Events: collections of outcomes (e.g., winning all games of the season)
Probability of an event: a number between 0 and 1 that measures how likely the event is to occur.

Note

In many situations, we won’t focus on defining the set of all possible outcomes (aka sample space). However, it can help us understand what is plausible.

Example of a random phenomenon:

Tossing a Coin

All possible outcomes (aka sample space): {H, T}
Event: Toss a head = {H}

If the coin is fair, both outcomes are equally likely:

\[P(H) = P(\text{event}) = \frac{(\text{number of outcomes in the event})} {(\text{total number of possible outcomes})} = \frac{1}{2} = 0.5\]

The interpretation of probability as frequency applies to many real examples.

For example, the probability of a high temperature for a wildfire randomly selected from all fires in Alberta.
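The frequency interpretation can be checked with a short simulation. The sketch below uses Python's standard library (a language choice made here, not in the slides). The function name `relative_frequency_of_heads` and the seed are illustrative assumptions. The code computes the classical probability by counting outcomes, then shows the share of heads in repeated simulated tosses.

```python
import random

random.seed(200)  # fixed seed so the sketch is reproducible


def relative_frequency_of_heads(n_tosses):
    """Toss a simulated fair coin n_tosses times; return the share of heads."""
    tosses = [random.choice(["H", "T"]) for _ in range(n_tosses)]
    return tosses.count("H") / n_tosses


# Classical probability: outcomes in the event / outcomes in the sample space
sample_space = {"H", "T"}
event = {"H"}
p_heads = len(event) / len(sample_space)
print(p_heads)  # 0.5

# The relative frequency tends to settle near 0.5 as the number of tosses grows
for n in (10, 100, 10_000):
    print(n, relative_frequency_of_heads(n))
```

With few tosses the share of heads can be far from 0.5; with many tosses it is typically close.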

A different example of a random phenomenon

Student arrival to class

The exact arrival time is uncertain and depends on many unpredictable factors (traffic, delays, decisions).

All possible outcomes: any time of the day
Event: times in the interval [8:55, 9:05]

Photo by Dominic Kurniawan Suryaputra, Unsplash

We can’t compute the probability of this event by counting outcomes: the possible arrival times form a continuum.

Instead, we rely on a model-based approach: assume a distribution to reflect uncertainty about the arrival time.
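As a minimal sketch of the model-based approach, suppose (purely as an assumption for illustration) that the arrival time, measured in minutes relative to 9:00, follows a Normal distribution with mean 0 and standard deviation 5. Neither the choice of distribution nor the parameter values come from the slides. Under that assumed model, the probability of the event [8:55, 9:05] is a difference of two cumulative probabilities.

```python
from statistics import NormalDist

# Hypothetical model: arrival time in minutes relative to 9:00,
# assumed Normal with mean 0 (on time) and standard deviation 5 minutes.
arrival = NormalDist(mu=0, sigma=5)

# Event: arrival in [8:55, 9:05], i.e. between -5 and +5 minutes
p_event = arrival.cdf(5) - arrival.cdf(-5)
print(round(p_event, 3))  # 0.683
```

A different assumed distribution or different parameters would give a different probability, which is why the choice of model matters.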

Random variables

Random variable: a number assigned to each outcome, representing a quantity of interest (e.g., age of a selected student, temperature on the day of a fire)
It can take on multiple possible values, some potentially more likely than others.
Before we observe it, the value is unknown. After observation, it is fixed.

Why is it called a random variable?

Technically, random variables are functions that assign numbers to outcomes. The name reflects two ideas: the value can change (it’s a variable), and it’s uncertain which value it will take (it’s random).

Random variable notation

Random variables are usually represented by uppercase letters (e.g., \(X\), \(Y\), \(Z\))
The values they take are represented by lowercase letters (e.g., \(x\), \(y\), \(z\))

Example: tossing a coin

Let \(X\) represent the number of “heads” in a single toss
Possible values:
- \(x = 1\) → outcome is “heads”
- \(x = 0\) → outcome is “tails”
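The coin example can be written out as code to show that a random variable is a function from outcomes to numbers. This is a small sketch in Python; the function name `X` mirrors the notation above, and the seed is an arbitrary choice.

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility only


def X(outcome):
    """Random variable: number of heads in a single toss."""
    return 1 if outcome == "H" else 0


outcome = random.choice(["H", "T"])  # the random phenomenon produces an outcome
x = X(outcome)                       # the realized value of the random variable
print(outcome, x)
```

Before the toss, the value of `X` is uncertain; after the toss, `x` is a fixed number.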

Distribution

The distribution of a random variable is a function that describes the variability of the random variable.

We write this as: \[X \sim \text{Distribution}(\text{parameters})\]
Just as a finite population has parameters, a distribution takes parameters that describe the behavior of a random variable across all possible values.
Distributions can be used to calculate probabilities associated with random variables.
In some cases, but not always, the distribution is defined by a specific formula.
Some distributions have special names (e.g., Bernoulli, Normal)

Finite vs Model-Based

In Statistics and Data Science:

rows of data can be thought of as realizations of outcomes
columns of data can be thought of as realizations of random variables

Finite Population	Model-Based Approach
Data sampled from a finite population	Data generated from a probability model
Outcomes are a subset of actual units	Outcomes are generated from a distribution
Finite sample space	Finite or infinite sample space
Variability comes from the sampling	Variability comes from randomness in the model
Probability as frequencies	Model-based probability
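The contrast in the table can be sketched in a few lines of Python. The population of ages and the Normal(21, 2) model below are made-up values for illustration only.

```python
import random

random.seed(42)  # arbitrary seed, for reproducibility only

# Finite population approach: the data are a subset of actual units.
# Hypothetical ages of 8 students.
population_ages = [18, 19, 19, 20, 21, 22, 23, 25]
sample = random.sample(population_ages, k=3)  # draw 3 units without replacement

# Model-based approach: the data are generated from an assumed distribution.
# Hypothetical model: age ~ Normal(mean 21, standard deviation 2).
model_draws = [random.gauss(21, 2) for _ in range(3)]

print(sample)       # every value is one of the listed ages
print(model_draws)  # values need not match any listed age
```

In the first case the variability comes from which units are selected; in the second it comes from the model itself.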

Discrete vs Continuous Random Variables

Random variables are classified based on the type of values they can take:
- Discrete: values can be listed one by one (e.g., 0, 1, 2, …)
- Continuous: values can be any number within an interval, so they cannot be listed one by one
Examples:
- Discrete: number of heads in 10 coin tosses
- Continuous: height of a randomly selected student

Discrete Random Variables

Today, we’ll look at three types of discrete random variables:

Bernoulli
Binomial
Poisson

Bernoulli

A Bernoulli random variable assigns a numeric value to the outcome of a single trial with two possible results:
- The event of interest occurs → we call this a success
- The event of interest does not occur → we call this a failure
Notation: \[X \sim \text{Bernoulli}(p)\]

where \(p\) is the probability of success, fixed in advance (so \(1 - p\) is the probability of failure).

\[x = \left\{ \begin{array}{ll} 1 & \text{if outcome = success};\\ 0 & \text{if outcome = failure}\end{array} \right.\]

For example: did a user click on an ad? (yes = 1, no = 0)
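A Bernoulli trial is easy to simulate. In this sketch the helper `bernoulli` and the click probability of 0.1 are illustrative assumptions, not values from the slides.

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility only


def bernoulli(p):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if random.random() < p else 0


p = 0.1  # hypothetical probability that a user clicks on the ad
clicks = [bernoulli(p) for _ in range(10_000)]

# The share of 1s should be close to p
print(sum(clicks) / len(clicks))
```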

Binomial

A binomial random variable represents the number of successes in a fixed number of independent Bernoulli trials.
Notation:

\[X \sim \text{Binomial}(n, p)\]

where:

\(n\) - number of trials
\(p\) - probability of success on each trial fixed in advance
\(X\) can take values in \(\{0, 1, \ldots, n\}\)

For example: number of people who vote in favour of a proposal out of 100 surveyed
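Binomial probabilities can be computed from the standard formula \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\), which is not stated in the slides but is the usual definition. The sketch below applies it to the survey example, assuming (hypothetically) that each person votes in favour with probability 0.5, independently of the others.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.5  # 100 people surveyed; p = 0.5 is a hypothetical value
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Probabilities over all possible values 0, 1, ..., n add up to 1
print(sum(pmf))

# Probability that at least 60 of the 100 vote in favour
p_at_least_60 = sum(pmf[60:])
print(p_at_least_60)
```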

Poisson

A Poisson random variable counts the number of times an event occurs in a fixed interval of time or space.
Events happen independently of one another and occur at a constant average rate.
There’s no upper limit to the number of occurrences, but large counts become increasingly rare.
Notation:

\[X \sim \text{Poisson}(\lambda)\]

where \(\lambda\) is the expected number of events in the interval.

For example: the number of customers arriving at a store in an hour
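Poisson probabilities follow the standard formula \(P(X = k) = e^{-\lambda} \lambda^k / k!\), which is not given in the slides. The sketch below assumes, for illustration, an average of 4 customers per hour, and shows how large counts become rare.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 4  # hypothetical average number of customers per hour

# Probabilities of seeing 0, 4, 8 and 12 customers in an hour
for k in (0, 4, 8, 12):
    print(k, poisson_pmf(k, lam))
```

There is no largest possible count, but the probabilities shrink quickly once the count is well above \(\lambda\).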

Summary: Discrete Random Variables

Distribution	Notation	Parameters	Scenario Example
Bernoulli	\(X \sim \text{Bernoulli}(p)\)	\(p\) = chance of success	Single yes/no outcome (e.g., clicked or not)
Binomial	\(X \sim \text{Binomial}(n, p)\)	\(n\) = trials, \(p\) = success rate	Count of successes in fixed # of trials
Poisson	\(X \sim \text{Poisson}(\lambda)\)	\(\lambda\) = rate	Count of events in time or space (e.g., arrivals)

Note: There are many other types of discrete random variables you may encounter in future courses (e.g., STAT 302).

Continuous random variables

A continuous random variable can take on any value in a continuous range (not just whole numbers).

Today, we’ll look at three types of continuous random variables:

Uniform
Exponential
Normal

We’ll describe the type of data each models and how we use parameters to define their distributions.

Uniform Distribution

All subintervals of the same length within a given interval are equally likely
The graph of the density is a flat horizontal line over that interval
Notation: \[X \sim \text{Uniform}(a, b)\]

where:

\(a\) is the minimum value
\(b\) is the maximum value
\(X\) takes any value in the interval \([a, b]\)

Examples of a Uniform Random Variable

Randomly picking a number between 0 and 1
Simulating noise or jitter in synthetic datasets
Sampling a timestamp uniformly within a day for testing
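For a Uniform random variable, the probability of landing in a subinterval is its length divided by the length of the whole interval. The sketch below checks this by simulation for Uniform(0, 1); the helper name `uniform_interval_prob` and the subinterval [0.2, 0.5] are illustrative choices.

```python
import random

random.seed(7)  # arbitrary seed, for reproducibility only


def uniform_interval_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Uniform(a, b), assuming a <= c <= d <= b."""
    return (d - c) / (b - a)


a, b = 0.0, 1.0
exact = uniform_interval_prob(0.2, 0.5, a, b)

# Compare with the share of simulated draws that land in [0.2, 0.5]
draws = [random.uniform(a, b) for _ in range(100_000)]
share = sum(0.2 <= x <= 0.5 for x in draws) / len(draws)

print(exact, share)
```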

Exponential Distribution

Models waiting times between independent events that happen at a constant average rate.
The distribution is memoryless: the time already spent waiting doesn’t change the distribution of the remaining wait.
Notation:
\[X \sim \text{Exponential}(\lambda)\]

where:

\(\lambda\) is the rate parameter (average number of events per unit time), so the mean waiting time is \(1/\lambda\)
\(X\) represents the time until the next event

Examples of an Exponential Random Variable

Time until next website visitor arrives
Time until next customer enters a store
Time between radioactive particle emissions
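The memoryless property can be verified numerically using the standard survival function \(P(X > x) = e^{-\lambda x}\), which is not stated in the slides. The rate of 2 visitors per minute and the times `s` and `t` below are hypothetical values.

```python
from math import exp
import random

random.seed(3)  # arbitrary seed, for reproducibility only


def survival(x, lam):
    """P(X > x) for X ~ Exponential(lam), with x >= 0."""
    return exp(-lam * x)


lam = 2.0  # hypothetical rate: 2 visitors per minute on average

# Memoryless property: P(X > s + t | X > s) equals P(X > t)
s, t = 1.0, 0.5
conditional = survival(s + t, lam) / survival(s, lam)
print(conditional, survival(t, lam))

# Simulated waiting times have a mean close to 1 / lam
waits = [random.expovariate(lam) for _ in range(100_000)]
mean_wait = sum(waits) / len(waits)
print(mean_wait)
```

Having already waited `s` minutes does not make the next visitor any more or less likely to arrive within the following `t` minutes.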

Normal Distribution

Perhaps the most widely used distribution in statistics!
Bell-shaped and symmetric around the mean.
Values can go from \(-\infty\) to \(+\infty\).
Many natural phenomena tend to follow (or approximate) this distribution.
Notation:
\[X \sim \text{Normal}(\mu, \sigma)\]

where:

\(\mu\) - mean (center of the distribution)
\(\sigma\) - standard deviation (spread)

Examples of a Normal Random Variable

Heights of adult humans
Measurement errors in instruments
IQ scores
Daily temperature fluctuations
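Python's standard library includes a `NormalDist` class that can be used to explore the Normal distribution. The sketch below assumes, only for illustration, that adult heights follow a Normal distribution with mean 170 cm and standard deviation 10 cm, and computes the probability of falling within one and two standard deviations of the mean.

```python
from statistics import NormalDist

# Hypothetical model: height in cm ~ Normal(mu = 170, sigma = 10)
heights = NormalDist(mu=170, sigma=10)

# Probability of a height within 1 and within 2 standard deviations of the mean
within_1sd = heights.cdf(180) - heights.cdf(160)
within_2sd = heights.cdf(190) - heights.cdf(150)
print(round(within_1sd, 3))  # 0.683
print(round(within_2sd, 3))  # 0.954

# Five simulated heights from the assumed model
print(heights.samples(5, seed=200))
```

These two probabilities are the same for any choice of \(\mu\) and \(\sigma\), because they depend only on the distance from the mean measured in standard deviations.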

Summary: Continuous Random Variables

Distribution	Notation	Parameters	Scenario Example
Uniform	\(X \sim \text{Uniform}(a, b)\)	\(a\), \(b\) = min/max	Value equally likely anywhere in a range
Exponential	\(X \sim \text{Exponential}(\lambda)\)	\(\lambda\) = rate	Time between independent events
Normal	\(X \sim \text{Normal}(\mu, \sigma)\)	\(\mu\) = mean, \(\sigma\) = SD	Symmetric, bell-shaped data (e.g., heights, measurements)

iClicker 1

You’re running an online A/B test where each website visitor is randomly assigned to either version A or version B. For each visitor, you record whether they click the “Sign Up” button (yes or no).

Which random variable best models the click outcome for a single visitor?