Convergence in distribution and convergence in probability are not the same concept. However, in the special case of convergence to a constant, they are equivalent, and for this reason are sometimes referred to collectively as “weak convergence” (as opposed to strong, or almost sure, convergence).

Theorem: Let $\a \in \real^d$. Then $\x_n \inP \a$ if and only if $\x_n \inD \a$.

Proof: i) Forward direction: $\x_n \inP \a \implies \x_n \inD \a$.

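Let $F_n$ denote the CDF of $\x_n$ and $F$ the CDF of the constant $\a$. Let $\x$ be a point in $\real^d$ such that $\x \gg \a$. Since $F(\x) = 1$ at any such point, our strategy here is to show that $F_n(\x) \to 1$. The proof is similar for the case where $x_j < a_j$ for at least one $j$, in which $F(\x) = 0$ and one shows that $F_n(\x) \to 0$. The case where $\x \gge \a$ with $x_j = a_j$ for at least one $j$ is irrelevant, as $F$ is not continuous at such points and convergence in distribution only requires $F_n(\x) \to F(\x)$ at continuity points of $F$.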

Let $N_\delta(\a)$ be a neighborhood of $\a$ such that $\p \gl \x$ for all $\p \in N_\delta(\a)$; see remarks below for details on the explicit construction of such a neighborhood.

\[\begin{alignat*}{2} F_n(\x) &= \Pr\{\x_n \gle \x\}\\ &= \Pr\{\x_n \gle \x \given \x_n \in N_\delta(\a)\} \Pr\{\x_n \in N_\delta(\a)\} &\hspace{4em}& \textnormal{Law of total probability} \\ &\quad + \Pr\{\x_n \gle \x \given \x_n \notin N_\delta(\a)\} \Pr\{\x_n \notin N_\delta(\a)\} \\ &\ge \Pr\{\x_n \gle \x \given \x_n \in N_\delta(\a)\} \Pr\{\x_n \in N_\delta(\a)\} && \textnormal{Probabilities are nonnegative} \\ &= \Pr\{\x_n \in N_\delta(\a)\} && \x_n \in N_\delta(\a) \implies \x_n \gle \x \\ &\to 1 && \href{convergence-in-probability.html}{\x_n \inP \a} \end{alignat*}\]
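The chain of inequalities above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not part of the proof: the sequence $\x_n = \a + \mathbf{z}/\sqrt{n}$ with $\mathbf{z}$ standard normal in $\real^2$ is a hypothetical example of a sequence converging in probability to $\a$, the neighborhood is taken to be the open sup-norm ball of radius $\delta = \min_j \{x_j - a_j\}/2$, and sample proportions stand in for the probabilities.

```python
import numpy as np

rng = np.random.default_rng(2)
a = np.array([0.0, 0.0])           # the constant limit (illustrative choice)
x = np.array([0.4, 1.0])           # a point strictly greater than a in every coordinate
delta = np.min(x - a) / 2          # radius of the neighborhood around a
z = rng.standard_normal((20_000, 2))

rows = []
for n in (1, 16, 400):
    x_n = a + z / np.sqrt(n)                               # hypothetical sequence converging to a
    in_ball = np.max(np.abs(x_n - a), axis=1) < delta      # event {x_n in N_delta(a)}
    below_x = np.all(x_n <= x, axis=1)                     # event {x_n <= x}, whose frequency estimates F_n(x)
    rows.append((n, in_ball.mean(), below_x.mean()))

for n, p_ball, f_n in rows:
    print(f"n={n:4d}  Pr(in ball) ~ {p_ball:.4f}  F_n(x) ~ {f_n:.4f}")
```

In every row the estimate of $F_n(\x)$ is at least the estimated probability of landing in the neighborhood, and both approach 1 as $n$ grows.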

ii) Backward direction: $\x_n \inD \a \implies \x_n \inP \a$. The following proof is specific to two dimensions; the idea is the same in higher dimensions.

Let $\delta > 0$.

\[\begin{alignat*}{2} \Pr\{\norm{\x_n-\a}_\infty \le \delta\} &\ge F_n(\a + \one\delta) &\hspace{4em}& \to 1; \x_n \inD \a \\ &\quad - F_n(\a + (1 \quad {-1}) \Tr \delta) && \to 0; \x_n \inD \a \\ &\quad - F_n(\a + (-1 \quad 1) \Tr \delta) && \to 0; \x_n \inD \a \\ &\quad + F_n(\a - \one\delta) && \to 0; \x_n \inD \a \\ &\to 1 \end{alignat*}\]
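The four-term lower bound is the usual inclusion-exclusion formula for the probability of the rectangle with corners $\a - \one\delta$ and $\a + \one\delta$. As a rough numerical check (a sketch, not part of the proof), the code below uses empirical CDFs in place of $F_n$ and assumes the hypothetical sequence $\x_n = \a + \mathbf{z}/n$ with $\mathbf{z}$ standard normal in $\real^2$; the helper names `ecdf`, `rectangle_bound`, and `sup_norm_prob` are illustrative.

```python
import numpy as np

def ecdf(sample, point):
    """Empirical CDF: fraction of rows of `sample` that are <= `point` elementwise."""
    return np.mean(np.all(sample <= point, axis=1))

def rectangle_bound(sample, a, d):
    """Four-term bound: F(a1+d, a2+d) - F(a1+d, a2-d) - F(a1-d, a2+d) + F(a1-d, a2-d)."""
    return (ecdf(sample, a + np.array([d, d]))
            - ecdf(sample, a + np.array([d, -d]))
            - ecdf(sample, a + np.array([-d, d]))
            + ecdf(sample, a - np.array([d, d])))

def sup_norm_prob(sample, a, d):
    """Fraction of rows within sup-norm distance d of a."""
    return np.mean(np.max(np.abs(sample - a), axis=1) <= d)

rng = np.random.default_rng(0)
a = np.array([1.0, -2.0])          # illustrative constant limit
delta = 0.1
z = rng.standard_normal((10_000, 2))

results = []
for n in (1, 10, 100):
    x_n = a + z / n                # hypothetical sequence converging to a
    results.append((n, rectangle_bound(x_n, a, delta), sup_norm_prob(x_n, a, delta)))

for n, lower, prob in results:
    print(f"n={n:3d}  lower bound ~ {lower:.4f}  Pr(sup-norm <= delta) ~ {prob:.4f}")
```

The lower bound never exceeds the sup-norm probability, and both tend to 1 as $n$ increases.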

Remark: If $\a \gl \x$ (note that this inequality must be strict), we can always construct a neighborhood around $\a$ such that $\p \gl \x$ for all points $\p$ in the neighborhood (this is fairly straightforward if you draw a picture of the situation). For example, let $\delta = \min_j \{ x_j - a_j \} / 2$ and take $N_\delta(\a) = \{\p: \norm{\p - \a}_\infty < \delta\}$; then $p_j < a_j + \delta < x_j$ for every $j$.
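The construction of $\delta$ can be sketched in a few lines of code. The function name `neighborhood_radius` and the specific vectors are hypothetical, and the neighborhood is assumed to be the open sup-norm ball of radius $\delta$ around $\a$ (a Euclidean ball of the same radius is contained in it, so the same check applies).

```python
import numpy as np

def neighborhood_radius(a, x):
    """Return delta = min_j (x_j - a_j) / 2; requires a_j < x_j for every j."""
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float)
    gaps = x - a
    if np.any(gaps <= 0):
        # The strict inequality matters: with a zero gap no neighborhood works.
        raise ValueError("need a_j < x_j for every j")
    return gaps.min() / 2

a = np.array([0.0, 1.0, -1.0])
x = np.array([0.5, 3.0, 0.2])
delta = neighborhood_radius(a, x)

# Draw points from the sup-norm ball of radius delta around a and confirm
# that every one of them is strictly below x in every coordinate.
rng = np.random.default_rng(1)
p = a + rng.uniform(-delta, delta, size=(1000, 3))
all_below = bool(np.all(p < x))
print(delta, all_below)  # prints: 0.25 True
```

If any coordinate has $x_j = a_j$, the function raises an error, mirroring the requirement that the inequality be strict.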