Pseudo likelihood (general theory)

Theorem (Gong & Samaniego): Suppose regularity conditions (A)-(C) hold. Then

(a) If $\beh$ is consistent, there exists a consistent sequence of roots $\th$ of the pseudo-likelihood equation $u_1(\theta, \beh) = 0$.
(b) If

\[\begin{alignat*}{2} \left[\begin{array}{c} \tfrac{1}{\sqrt{n}}u_1(\ts, \be^*) \\ \sqrt{n}(\beh-\be^*) \end{array}\right] \inD \Norm\left(\zero, \left[\begin{array}{cc} \Sigma_{11} & \bS_{12} \\ \bS_{21} & \bS_{22} \end{array}\right]\right), \end{alignat*}\]

then

\[\sqrt{n}(\th-\ts) \inD \Norm(0, \sigma^2),\]

where

\[\begin{alignat*}{2} \sigma^2 = \fI_{11}^{-1} + \fI_{11}^{-2}\fI_{12}(\bS_{22}\fI_{21} - 2\bS_{21}), \end{alignat*}\]

and the Fisher information matrices are for a single observation and evaluated at $(\ts, \be^*)$.
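
To get a feel for the variance formula, two special cases are worth writing out. These are a reader's sanity check that follows directly from the expression for $\sigma^2$, not part of the theorem as stated. If $\beh$ is asymptotically uncorrelated with the score, so that $\bS_{21} = \zero$, the cross term drops out:

\[\sigma^2 = \fI_{11}^{-1} + \fI_{11}^{-2}\fI_{12}\bS_{22}\fI_{21} \ge \fI_{11}^{-1},\]

where the inequality holds because $\bS_{22}$ is positive semidefinite and $\fI_{21} = \fI_{12}\Tr$. If, in addition, $\be^*$ is known, so that $\beh = \be^*$ and $\bS_{22} = \zero$, the formula reduces to $\sigma^2 = \fI_{11}^{-1}$, the usual asymptotic variance of the MLE of $\theta$ when $\be^*$ is known.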

Proof: We prove part (b), taking the result of part (a) as given. Let $\th$ denote the pseudo-MLE and $\beh$ denote a consistent estimator, not necessarily the MLE. A Taylor series expansion of the score in its first argument about $\ts$ gives

\[\begin{alignat*}{2} \tfrac{1}{\sqrt{n}} u_1(\th, \beh) &= \tfrac{1}{\sqrt{n}} u_1(\ts, \beh) - \fI_{11}(\ts,\beh)\sqrt{n}(\th-\ts) + o_p(1) &\hspace{4em}& \text{$\th$ is consistent by (a)} \\ \tag*{$\tcirc{1}$} \implies 0 &= \tfrac{1}{\sqrt{n}} u_1(\ts, \beh) - \fI_{11}\sqrt{n}(\th-\ts) + o_p(1) && \beh \inP \be^*; \fI \text{ continuous} \end{alignat*}\]

However, we don’t know the distribution of $\tfrac{1}{\sqrt{n}} u_1(\ts, \beh)$, so we must take another Taylor series expansion:

\[\begin{alignat*}{2} \tag*{$\tcirc{2}$} \tfrac{1}{\sqrt{n}} u_1(\ts, \beh) &= \tfrac{1}{\sqrt{n}} u_1(\ts, \be^*) - \fI_{12} \sqrt{n}(\beh-\be^*) + o_p(1) &\hspace{4em}& \beh \text{ consistent} \end{alignat*}\]
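
The coefficient in $\tcirc{2}$ comes from differentiating the score with respect to its second argument. As a sketch of that step, assuming the regularity conditions justify both the expansion and a law of large numbers for the averaged derivative:

\[\begin{alignat*}{2} \tfrac{1}{\sqrt{n}} u_1(\ts, \beh) &= \tfrac{1}{\sqrt{n}} u_1(\ts, \be^*) + \left[\tfrac{1}{n}\frac{\partial u_1(\ts, \be^*)}{\partial \be\Tr}\right]\sqrt{n}(\beh-\be^*) + o_p(1), \\ \tfrac{1}{n}\frac{\partial u_1(\ts, \be^*)}{\partial \be\Tr} &\inP -\fI_{12}, \end{alignat*}\]

where $\fI_{12} = \fI_{21}\Tr$ is the off-diagonal block of the single-observation Fisher information. Replacing the bracketed term by its limit changes the right-hand side only by $o_p(1)$, because $\sqrt{n}(\beh-\be^*)$ is bounded in probability under the assumption in (b).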

Thus, by $\tcirc{1}$ and $\tcirc{2}$, we have

\[\begin{alignat*}{2} \tag*{$\tcirc{3}$} \fI_{11}\sqrt{n}(\th-\ts) &= \tfrac{1}{\sqrt{n}} u_1(\ts, \be^*) - \fI_{12} \sqrt{n}(\beh-\be^*) + o_p(1) \\ &\inD [1 \quad -\fI_{12}] \z \end{alignat*}\]

by the joint convergence assumed in (b), where $\z \sim \Norm(\zero, \bS)$ and $\bS$ is the covariance matrix appearing in (b). The variance of this limiting distribution is

\[\begin{alignat*}{2} [1 \quad -\fI_{12}] \bS [1 \quad -\fI_{12}] \Tr &= [\Sigma_{11} - \fI_{12}\bS_{21} \quad \bS_{12} - \fI_{12}\bS_{22}] [1 \quad -\fI_{12}] \Tr \\ &= \Sigma_{11} - \fI_{12}\bS_{21} - \bS_{12}\fI_{21} + \fI_{12} \bS_{22} \fI_{21} \\ &= \Sigma_{11} - 2\fI_{12}\bS_{21} + \fI_{12} \bS_{22} \fI_{21} \\ &= \Sigma_{11} + \fI_{12}(\bS_{22} \fI_{21} - 2\bS_{21}) \end{alignat*}\]

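The third equality holds because $\bS_{12}\fI_{21}$ is a scalar and therefore equals its transpose, $\fI_{12}\bS_{21}$. Finally, note that $\Sigma_{11} = \fI_{11}$, since both are the variance of $\tfrac{1}{\sqrt{n}}u_1(\ts, \be^*)$, and that $\fI_{11}$ is a scalar in this setup. Dividing $\tcirc{3}$ by $\fI_{11}$ therefore scales the limiting variance by $\fI_{11}^{-2}$, and by the continuous mapping theorem,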

\[\begin{alignat*}{2} \sqrt{n}(\th-\ts) &\inD \Norm\left(0, \fI_{11}^{-1} + \fI_{11}^{-2}\fI_{12}(\bS_{22}\fI_{21} - 2\bS_{21})\right). \end{alignat*}\]
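
As a consistency check on this result (an illustration added here, not part of the original proof), suppose $\beh$ is the joint MLE, so that $\th$ is the MLE of $\theta$, and assume $\fI_{22}$ is invertible. Write $\fI^{22} = (\fI_{22} - \fI_{21}\fI_{11}^{-1}\fI_{12})^{-1}$ for the lower-right block of the inverse information matrix; this notation is introduced only for this check. Standard MLE theory then gives $\bS_{22} = \fI^{22}$ and $\bS_{21} = \zero$, the latter because the MLE of $\be$ is asymptotically uncorrelated with the score for $\theta$. The variance becomes

\[\fI_{11}^{-1} + \fI_{11}^{-1}\fI_{12}\fI^{22}\fI_{21}\fI_{11}^{-1} = (\fI_{11} - \fI_{12}\fI_{22}^{-1}\fI_{21})^{-1},\]

where the equality is the Woodbury identity. The right-hand side is the upper-left entry of the inverse information matrix, which is the asymptotic variance one expects for the MLE of $\theta$.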