1.3. Random variables

$\newcommand{\argmin}{\mathop{\mathrm{argmin}}\limits}$ $\newcommand{\argmax}{\mathop{\mathrm{argmax}}\limits}$

We take a closer look at random variables and random elements.


Random elements (measurable maps)

Random elements are generalizations of random variables. While the textbook focuses on random variables, we introduce random elements to state properties that hold not only for random variables but also for more general measurable maps.

$X: (\Omega, \mathcal{F}) \to (S, \Sigma)$ is a random element if $X^{-1}(B) \in \mathcal{F},~ \forall B \in \Sigma$.

If the codomain is $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ we call a random element a random variable. If the codomain is $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ we call it a random vector. If the codomain is a class of functions, we call it a random function.
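To make the definition concrete, here is a minimal Python sketch (my own illustration, not from the textbook; all names are made up) that checks the defining condition $X^{-1}(B) \in \mathcal{F},~ \forall B \in \Sigma$ by brute force on a finite sample space, where a $\sigma$-field can be enumerated explicitly.

```python
def preimage(X, B, omega):
    """X^{-1}(B) = {w in omega : X(w) in B}, returned as a frozenset."""
    return frozenset(w for w in omega if X(w) in B)

def is_random_element(X, omega, F, Sigma):
    """Check the defining condition: X^{-1}(B) in F for every B in Sigma."""
    return all(preimage(X, B, omega) in F for B in Sigma)

# Two coin tosses; F is the sigma-field generated by the first toss only.
omega = {"HH", "HT", "TH", "TT"}
F = {frozenset(), frozenset({"HH", "HT"}), frozenset({"TH", "TT"}), frozenset(omega)}
# Codomain S = {0, 1} equipped with its power set as Sigma.
Sigma = [set(), {0}, {1}, {0, 1}]

first_is_head = lambda w: 1 if w[0] == "H" else 0
second_is_head = lambda w: 1 if w[1] == "H" else 0

print(is_random_element(first_is_head, omega, F, Sigma))   # True
print(is_random_element(second_is_head, omega, F, Sigma))  # False: F cannot resolve the second toss
```

On infinite spaces such a brute-force check is impossible, which is exactly why the generator criterion below is useful.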

As we can see in the definition of random elements (measurable functions), functions are closely related to sets: the measurability of a function is determined entirely by inverse images of sets in the $\sigma$-field on its codomain. In the same spirit as the $\pi$-$\lambda$ theorem, we can check whether a function is a random element by checking only the inverse images of the elements of a generator of that $\sigma$-field.

Let $X: \Omega \to S$ be a function and $\mathcal{A}$ be a collection of subsets of $S$ such that $\sigma(\mathcal{A}) = \Sigma$. If $X^{-1}(A) \in \mathcal{F},~ \forall A \in \mathcal{A}$, then $X$ is a random element.

Let $\mathcal{S}=\{B:~ X^{-1}(B) \in \mathcal{F}\}$. Since inverse images preserve complements and countable unions, $\mathcal{S}$ is a $\sigma$-field, and $\mathcal{A} \subset \mathcal{S}$ by assumption. Hence $\sigma(\mathcal{A}) = \Sigma \subset \mathcal{S}$ by minimality of the generated $\sigma$-field, and the desired result follows.
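For example, taking $\mathcal{A} = \{(-\infty, x]:~ x \in \mathbb{R}\}$, which satisfies $\sigma(\mathcal{A}) = \mathcal{B}(\mathbb{R})$, the theorem reduces measurability of a real-valued function to a one-parameter family of conditions: $$X \text{ is a random variable} \iff \{X \le x\} := X^{-1}((-\infty, x]) \in \mathcal{F},~ \forall x \in \mathbb{R}.$$ We will use reductions of exactly this kind repeatedly below.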

In the proof of the theorem, observe that if $\mathcal{A}$ is a $\sigma$-field on the codomain $S$ of a random element $X$, then $\{A:~ X^{-1}(A)\in\mathcal{F}\}$ is also a $\sigma$-field on $S$ and $\{X^{-1}(A):~ A\in\mathcal{A}\}$ is a $\sigma$-field on $\Omega$. In fact, the latter is the smallest $\sigma$-field on $\Omega$ that makes $X$ a random element. We call it the $\sigma$-field generated by the measurable map $X$ and denote it by $\sigma(X)$.
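As a simple example, let $A \subset \Omega$ and $X = \mathbf{1}_A$ be its indicator function. Since $X^{-1}(B)$ depends only on whether $B$ contains $0$ and whether it contains $1$, $$\sigma(X) = \{X^{-1}(B):~ B\in\mathcal{B}(\mathbb{R})\} = \{\emptyset, A, A^c, \Omega\},$$ and $X$ is a random variable if and only if $A \in \mathcal{F}$.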

Closure properties of random variables

In this subsection, we are interested in operations on random elements/variables that preserve measurability.

The first two theorems are about compositions of measurable functions.

$X: (\Omega, \mathcal{F}) \to (S, \Sigma)$ is a random element.
$f: (S, \Sigma) \to (T, \mathcal{T})$ is a measurable function.
$\Rightarrow f \circ X: (\Omega, \mathcal{F}) \to (T, \mathcal{T})$ is a random element.

Given $B \in \mathcal{T}$, $$(f\circ X)^{-1}(B) = X^{-1}(f^{-1}(B)) \in \mathcal{F},$$ since $f^{-1}(B) \in \Sigma$ by the measurability of $f$, and $X$ is a random element.
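
A frequently used special case: every continuous $f: \mathbb{R}^d \to \mathbb{R}$ is Borel measurable, because $f^{-1}(U)$ is open for every open $U \subset \mathbb{R}$ and the open sets generate $\mathcal{B}(\mathbb{R})$ (apply the generator criterion above). In particular, $X^2$, $e^X$, and $|X|$ are random variables whenever $X$ is.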

$X_1,\cdots,X_n: (\Omega, \mathcal{F})\to(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ are random variables.
$f: \mathbb{R}^n\to\mathbb{R}$ is a Borel measurable function.
$\Rightarrow f(X_1,\cdots,X_n)$ is a random variable.

Let $Y=(X_1,\cdots,X_n)$ and $\mathcal{A}=\{A_1\times\cdots\times A_n:~ A_i \in \mathcal{B}(\mathbb{R}),~ i=1,\cdots,n\}$. Then $Y^{-1}(A_1\times\cdots\times A_n) = \cap_{i=1}^n X_i^{-1}(A_i) \in \mathcal{F}$, and since $\sigma(\mathcal{A})=\mathcal{B}(\mathbb{R}^n)$, the first theorem shows that $Y$ is a random vector. By the composition theorem, $f(Y)$ is a random variable.
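For instance, $(x,y)\mapsto x+y$, $(x,y)\mapsto xy$, and $(x,y)\mapsto \max(x,y)$ are continuous, hence Borel measurable, so $$X+Y, \quad XY, \quad \max(X,Y)$$ are random variables whenever $X$ and $Y$ are.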

The next two theorems are about minimum/maximum and limiting behaviors of random variables.

Let $X_n$, $n=1,2,\cdots$ be random variables. The following are random variables (allowing the values $\pm\infty$).
(i) $\inf_n X_n$
(ii) $\sup_n X_n$
(iii) $\liminf_n X_n$
(iv) $\limsup_n X_n$

Let $\mathcal{A}=\{(-\infty,x):~ x\in\mathbb{R}\}$ so that $\sigma(\mathcal{A})=\mathcal{B}(\mathbb{R})$.
(i) Let $Y = \inf_n X_n$. Given $A=(-\infty, x) \in \mathcal{A}$, $Y^{-1}(A) = \{\inf_n X_n < x\} = \cup_{i=1}^\infty \{X_i < x\} \in \mathcal{F}$, since the infimum is less than $x$ if and only if some $X_i$ is.
(ii) $\sup_n X_n = -\inf_n (-X_n)$; use (i).
(iii) $\liminf_n X_n = \sup_n \inf_{m\ge n} X_m$; use (i) and (ii).
(iv) $\limsup_n X_n = -\liminf_n (-X_n)$; use (iii).
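The strict inequalities in (i) are essential: with closed rays the analogous identity fails. For instance, for the deterministic sequence $X_n = x + 1/n$, $$\inf_n X_n = x \in (-\infty, x], \quad \text{but} \quad \bigcup_{i=1}^\infty \{X_i \le x\} = \emptyset,$$ so $\{\inf_n X_n \le x\} \ne \cup_{i=1}^\infty \{X_i \le x\}$ in general, whereas $\{\inf_n X_n < x\} = \cup_{i=1}^\infty \{X_i < x\}$ always holds.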

By (iii) and (iv), if the $X_n$ are random variables and the limit $X_\infty(\omega) = \lim_n X_n(\omega)$ exists for every $\omega$, then $X_\infty = \limsup_n X_n$ is also a random variable.

In general the limit need not exist at every $\omega$, but the set of $\omega$ where it does is measurable: $$\begin{align} \Omega_0 &:= \{\omega:~ \lim_n X_n(\omega) = X_\infty(\omega)\} \\ &= \{\omega:~ \limsup_n X_n(\omega) - \liminf_n X_n(\omega) = 0\} \in \mathcal{F}, \end{align}$$ i.e. $\Omega_0$ is a measurable set, being the preimage of $\{0\}$ under the random variable $\limsup_n X_n - \liminf_n X_n$.

We used $\limsup_n X_n(\omega) - \liminf_n X_n(\omega) = 0$ instead of $\limsup_n X_n(\omega) = \liminf_n X_n(\omega)$ to handle the cases where $X_n$ diverges to $\pm\infty$: there $\limsup_n X_n = \liminf_n X_n = \pm\infty$, so the equality form would include such $\omega$, while the difference is of the indeterminate form $\infty - \infty$ and is not $0$. Hence $\Omega_0$ consists exactly of the $\omega$ at which $X_n(\omega)$ converges to a finite limit.

I will finish this section by defining a mode of convergence that is used throughout probability theory.

We say that $X_n$ converges almost surely to $X_\infty$, and write $X_n \to X_\infty \text{ a.s.}$, if $P(\{\lim_n X_n = X_\infty\}) = 1$.

Here, $\text{a.s.}$ stands for “almost surely”. In general measure theory, this term is usually replaced with $\text{a.e.}$ or “almost everywhere”.

Almost sure convergence reflects a recurring theme in probability theory: events of probability zero can usually be ignored, so knowing that $X_n$ converges a.s. is often just as useful as knowing that it converges everywhere. The convergence theorems of Lebesgue integration are typical examples. We will cover this in detail later.
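As a numerical illustration (a sketch of my own, not from the textbook): by the strong law of large numbers, which we will meet later, the running averages of i.i.d. fair coin flips converge almost surely to $1/2$, so almost every simulated sample path settles near $1/2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five sample points omega, each carrying a path of 10^5 fair coin flips.
flips = rng.integers(0, 2, size=(5, 100_000))

# X_n(omega) = (1/n) * sum_{k <= n} flip_k(omega): the running average.
running_avg = np.cumsum(flips, axis=1) / np.arange(1, flips.shape[1] + 1)

# The final average along each path should be close to 0.5.
print(running_avg[:, -1])
```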



Acknowledgement

This post is based on the textbook Probability: Theory and Examples, 5th edition (Durrett, 2019) and the lecture at Seoul National University, Republic of Korea (instructor: Prof. Johan Lim).