(PTE) 2.2. Weak laws of large numbers
We say a random variable $X_n$ converges in probability (or $P$-converges) to another random variable $X$, and write $X_n \overset{P}{\to} X$, if $\lim_n P(|X_n - X| > \epsilon) = 0$ for all $\epsilon >0.$ We can also define convergence in probability to a constant by letting $X = c\in\mathbb{R}$. It is easy yet useful to know that $X_n \overset{L^p}{\to} X$ with $p > 0$ implies $X_n \overset{P}{\to} X$ by the Markov-Chebyshev inequality with $\varphi(x) = x^p$.^{1}
The weak law of large numbers concerns convergence in probability of the sample mean of independent random variables to their expectation. In this section, we cover variants of the weak law of large numbers (WLLN), from the simplest $L^2$-WLLN to the most familiar form, which only requires an $L^1$ condition.
$L^2$ weak law
If the $X_n$'s are independent then they are uncorrelated, but the converse does not hold. It is also clear that if the $X_i$'s are uncorrelated then $\text{Var}(X_1+\cdots+X_n) = \sum_{i=1}^n \text{Var}(X_i).$ Using Chebyshev's inequality, we can easily get our first WLLN for uncorrelated random variables.
Suppose the $X_i$'s are uncorrelated with $EX_i = \mu$ and $\text{Var}(X_i) \le C < \infty$, and let $S_n = X_1 + \cdots + X_n$. By Chebyshev's inequality, given $\epsilon > 0$, $$\begin{align} P(|S_n/n - \mu| > \epsilon) \le \text{Var}(S_n/n)/\epsilon^2 \le C/(\epsilon^2 n) \to 0 \end{align} $$ and by the definition of variance, $$ E|S_n/n - \mu|^2 = \text{Var}(S_n/n) \le C/n \to 0, $$ so $S_n/n \to \mu$ both in probability and in $L^2$.
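As a quick numerical sanity check of the Chebyshev bound above, here is a minimal simulation sketch (assuming NumPy; the choice of i.i.d. Uniform(0,1) draws, for which $C = 1/12$, is only for illustration):

```python
import numpy as np

def exceedance_prob(n, eps=0.05, runs=1000, seed=0):
    """Empirical P(|S_n/n - mu| > eps) for i.i.d. Uniform(0,1) draws (mu = 0.5)."""
    rng = np.random.default_rng(seed)
    sample_means = rng.uniform(0, 1, size=(runs, n)).mean(axis=1)
    return np.mean(np.abs(sample_means - 0.5) > eps)

# Chebyshev predicts decay at least like C/(eps^2 n) with C = Var(X_1) = 1/12.
for n in [10, 100, 10_000]:
    print(n, exceedance_prob(n))
```

The printed probabilities shrink as $n$ grows, consistent with the $C/(\epsilon^2 n)$ bound.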
An example application of the $L^2$ weak law arises in high-dimensional problems.
Theorem 1 requires the strong conditions that the $X_i$'s have the same expectation and uniformly bounded variances. Theorem 2 relaxes these conditions so that the result is more broadly applicable.
Let $\mu_n = ES_n$, $\sigma_n^2 = \text{Var}(S_n)$, and $b_n > 0$. Then $\frac{\sigma_n^2}{b_n^2} \to_n 0 \implies \frac{S_n - \mu_n}{b_n} \overset{P}{\to} 0.$
$\sigma_n^2 / b_n^2 = E\left(\frac{S_n - \mu_n}{b_n}\right)^2 \to 0,$ i.e., $(S_n - \mu_n)/b_n \to 0$ in $L^2$ and hence in probability.
This applies to any sequence of random variables $(S_n)_{n\in\mathbb{N}}$. (Even uncorrelatedness is not required!)
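The following sketch illustrates Theorem 2 in a setting where Theorem 1 fails (assuming NumPy; the specific choice $X_i \sim N(0, i)$ with $b_n = n^{3/2}$ is mine for illustration). Here $\sigma_n^2 = \text{Var}(S_n) = n(n+1)/2$, so $\sigma_n^2/b_n^2 \approx 1/(2n) \to 0$ even though the variances are unbounded:

```python
import numpy as np

def scaled_deviation(n, runs=1000, seed=1):
    """Sample (S_n - mu_n)/b_n with X_i ~ N(0, i) independent and b_n = n**1.5.

    Var(S_n) = n(n+1)/2, so sigma_n^2 / b_n^2 ~ 1/(2n) -> 0, and Theorem 2
    gives (S_n - mu_n)/b_n -> 0 in probability (mu_n = 0 here).
    """
    rng = np.random.default_rng(seed)
    sds = np.sqrt(np.arange(1, n + 1))                 # std of X_i is sqrt(i)
    S = (rng.standard_normal((runs, n)) * sds).sum(axis=1)
    return S / n**1.5

for n in [10, 1000]:
    print(n, np.mean(np.abs(scaled_deviation(n)) > 0.1))
```

The empirical exceedance probability drops sharply as $n$ grows, as the theorem predicts.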
Weak law for triangular arrays
Many topics in probability theory concern triangular arrays $(X_{nk})_{1\le k \le n}$ such that $X_{nk}$, $k=1,\cdots,n$, are independent for each $n$ (row-wise independent). We are interested in the limiting behavior of the row-wise sums $S_n := X_{n1} + \cdots + X_{nn}$.
To deal with the case where a finite second moment is not guaranteed, we introduce truncated random variables.
$\bar X := X\mathbf{1}_{(|X| \le M)} = \begin{cases} X &,~ \text{if } |X| \le M \\ 0 &,~ \text{otherwise} \end{cases}$
By ignoring the tail part of $X$, the truncated random variable $\bar X$ always has finite moments: $|\bar X| \le M$, so $E|\bar X|^p \le M^p$ for every $p > 0$.
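A tiny numerical illustration (a sketch assuming NumPy; the Cauchy distribution and truncation level $M = 10$ are arbitrary choices): a Cauchy variable has no finite mean at all, yet its truncation has all moments, bounded by powers of $M$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_cauchy(100_000)            # heavy tails: E|X| is infinite
M = 10.0
X_bar = np.where(np.abs(X) <= M, X, 0.0)    # truncation at level M

# every moment of the truncated variable is bounded: E|X_bar|^p <= M**p
print(np.mean(np.abs(X_bar) ** 2))          # finite, at most M**2 = 100
```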
Let $b_n > 0$, $b_n \to_n \infty$.
Define $\bar X_{nk} := X_{nk} \mathbf{1}_{(|X_{nk}| \le b_n)}.$
(i) $\sum_{k=1}^n P(|X_{nk}| > b_n) \to_n 0.$
(ii) $\frac{1}{b_n^2} \sum_{k=1}^n E\bar X_{nk}^2 \to_n 0.$
$\implies \frac{S_n - a_n}{b_n} \overset{P}{\to} 0$, where $a_n = \sum_{k=1}^n E\bar X_{nk}.$
Let $\bar S_n = \bar X_{n1} + \cdots + \bar X_{nn}$ so that $a_n = E\bar S_n.$ Then for an arbitrary $\epsilon>0$, $P\left( \left|\frac{S_n - a_n}{b_n}\right| > \epsilon\right) \le \underbrace{P(S_n \ne \bar S_n)}_\text{(i)} + \underbrace{P\left(\left|\frac{\bar S_n - a_n}{b_n}\right| > \epsilon\right)}_\text{(ii)}.$
For the first part, $$\begin{align} \text{(i)} &\le P(\bigcup\limits_{k=1}^n \{X_{nk} \ne \bar X_{nk}\}) \le \sum\limits_{k=1}^n P(X_{nk} \ne \bar X_{nk}) \\ & = \sum\limits_{k=1}^n P(|X_{nk}| > b_n) \to_n 0 \end{align} $$ and for the latter, by Chebyshev's inequality and the row-wise independence, $$ \begin{align} \text{(ii)} &\le E\left(\frac{\bar S_n - a_n}{b_n}\right)^2 / \epsilon^2 = \text{Var}(\bar S_n) / (\epsilon^2 b_n^2) \\ &= \sum\limits_{k=1}^n \text{Var}(\bar X_{nk}) / (\epsilon^2 b_n^2) \\ &\le \sum\limits_{k=1}^n E\bar X_{nk}^2 / (\epsilon^2 b_n^2) \to_n 0 \end{align} $$ $\therefore P\left(\left|\frac{S_n - a_n}{b_n}\right| > \epsilon\right) \to_n 0$ for all $\epsilon >0.$
Weak law of large numbers
The result for triangular arrays paves the way for the general weak law of large numbers. Before we get to it, let’s take a look at a highly useful lemma.
Let $Y \ge 0$ and $p > 0.$ $\implies EY^p = \int_0^\infty py^{p-1} P(Y > y) dy.$
$$ \begin{align} \int_0^\infty py^{p-1} P(Y > y) dy &= \int_0^\infty \int_{(Y>y)} py^{p-1} dP dy \\ &= \int_\Omega \int_0^Y py^{p-1} dy dP \\ &= \int_\Omega Y^p dP = EY^p. \end{align} $$ The second equality comes from Fubini's theorem (the integrand is nonnegative).
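The tail formula can be checked numerically. Below is a sketch (assuming NumPy) for $Y \sim \text{Exp}(1)$ with $p = 2$, where $P(Y > y) = e^{-y}$ and the exact moment is $EY^p = \Gamma(p+1) = 2$:

```python
import math

import numpy as np

# Check E[Y^p] = ∫_0^∞ p y^(p-1) P(Y > y) dy for Y ~ Exp(1), p = 2,
# where P(Y > y) = e^(-y) and the exact answer is Gamma(p + 1) = 2.
p = 2.0
y, dy = np.linspace(0.0, 50.0, 200_001, retstep=True)  # tail beyond y = 50 is negligible
integrand = p * y ** (p - 1) * np.exp(-y)
integral = float(np.sum(integrand) * dy)               # simple Riemann sum

print(integral, math.gamma(p + 1))                     # both ≈ 2
```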
Let $X_1, X_2, \cdots$ be i.i.d. with $x P(|X_1| > x) \to 0$ as $x \to \infty$, and let $S_n = X_1 + \cdots + X_n.$
$\implies S_n/n - \mu_n \overset{P}{\to} 0$ where $\mu_n = EX_1 \mathbf{1}_{(|X_1|\le n)}.$
Use the weak law for triangular arrays. Let $X_{nk} = X_k$ and $b_n = n.$
(i) $\sum\limits_{i=1}^n P(|X_i| > n) = n P(|X_1| > n) \to_n 0.$
(ii) Let $\bar X_k = X_k\mathbf{1}_{(|X_k|\le n)}.$ Then $\frac{1}{n^2} \sum\limits_{k=1}^n E\bar X_k^2 = \frac{1}{n} E\bar X_1^2.$ By the lemma, $E\bar X_1^2 = \int_0^n 2xP(|\bar X_1| > x) dx \le \int_0^n 2xP(|X_1| > x) dx$, and since the integrand $2xP(|X_1| > x) \to 0$ as $x \to \infty$, its average satisfies $\frac{1}{n} E\bar X_1^2 \to 0$ as $n\to\infty.$
Note that $a_n = nE\bar X_1 = n\mu_n$ and $b_n = n$, so by (i), (ii), and Theorem 3, $S_n/n - \mu_n \overset{P}{\to} 0.$
We can use this to obtain the familiar weak law of large numbers.
Let $X_1, X_2, \cdots$ be i.i.d. with $E|X_1| < \infty$, $S_n = X_1 + \cdots + X_n$, and $\mu = EX_1.$
$\implies S_n/n \overset{P}{\to} \mu.$
(i) $x P(|X_1| > x) = x E\mathbf{1}_{(|X_1| > x)} \le E|X_1|\mathbf{1}_{(|X_1| > x)}.$ Since $|X_1|\mathbf{1}_{(|X_1| > x)} \le |X_1|$ and $E|X_1| < \infty$, by the DCT $E|X_1|\mathbf{1}_{(|X_1| > x)} \to 0$ as $x\to\infty.$
(ii) Let $\mu_n = EX_1\mathbf{1}_{(|X_1| \le n)}.$ Since $|X_1\mathbf{1}_{(|X_1| \le n)}| \le |X_1|$ and $E|X_1| < \infty$, by the DCT $EX_1\mathbf{1}_{(|X_1| \le n)} \to_n \mu.$
By Theorem 5, $S_n/n - \mu_n \overset{P}{\to} 0$. Since the deterministic sequence $\mu_n \to \mu$, the desired result follows.
Note that we do not need a finite second moment condition.
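This is worth seeing in a simulation sketch (assuming NumPy; the Pareto tail index $\alpha = 1.5$ is an arbitrary illustrative choice). A classical Pareto distribution with $\alpha = 1.5$ has finite mean $\alpha/(\alpha-1) = 3$ but infinite variance, so the $L^2$ weak law is unavailable, yet the sample mean still concentrates around $3$:

```python
import numpy as np

def mean_error(n, runs=300, seed=3):
    """Empirical P(|S_n/n - mu| > 0.5) for classical Pareto(alpha=1.5) draws.

    alpha = 1.5 gives finite mean mu = 3 but infinite variance, so only the
    L^1-based weak law applies -- and S_n/n -> mu in probability regardless.
    """
    rng = np.random.default_rng(seed)
    X = 1.0 + rng.pareto(1.5, size=(runs, n))   # classical Pareto on [1, inf)
    return np.mean(np.abs(X.mean(axis=1) - 3.0) > 0.5)

for n in [100, 20_000]:
    print(n, mean_error(n))
```

Convergence is visibly slower than in the finite-variance case, but the exceedance probability still decays.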
While I will leave the other examples as further reading, I would like to discuss the coupon collector’s problem (Example 2.2.7).
Here, $\mathcal{U}\{1,2,\cdots,n\}$ is the discrete uniform distribution on $\{1,2,\cdots,n\}$. In this example, $n$ can be interpreted as the total number of distinct coupons, $\tau_k^n$ as the number of trials until $k$ distinct kinds of coupons are collected, and $T_n$ as the number of trials until all of the coupons are collected. The example says that asymptotically $n\log n$ trials are required for all $n$ coupons to be collected. In the same manner, the asymptotic rate of a stochastic process can be derived using the weak laws.
Let $b_n = n\log n$; then the conditions of Theorem 2 are satisfied.
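A direct simulation sketch of the coupon collector's problem (assuming NumPy; the concrete $n$ and number of repetitions are arbitrary) shows $T_n/(n\log n)$ clustering near $1$:

```python
import numpy as np

def collection_time(n, rng):
    """Draw uniformly from {0, ..., n-1} until every value has appeared."""
    seen = np.zeros(n, dtype=bool)
    distinct, trials = 0, 0
    while distinct < n:
        trials += 1
        i = rng.integers(n)
        if not seen[i]:
            seen[i] = True
            distinct += 1
    return trials

rng = np.random.default_rng(0)
n = 2000
ratios = [collection_time(n, rng) / (n * np.log(n)) for _ in range(20)]
print(np.mean(ratios))   # close to 1, up to lower-order terms
```

Since $ET_n = nH_n \approx n(\log n + 0.577)$, the ratio sits slightly above $1$ for moderate $n$ and approaches $1$ as $n \to \infty$.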
Acknowledgement
This post series is based on the textbook Probability: Theory and Examples, 5th edition (Durrett, 2019) and the lecture at Seoul National University, Republic of Korea (instructor: Prof. Johan Lim).

The relationships between $L^p$ convergence, convergence in probability, and almost sure convergence will be covered later in detail. ↩