Theorem: Suppose regularity conditions (A)-(C) are met. Then for any consistent estimator $\bth$, we have

\[\tfrac{1}{\sqrt{n}}\u(\bth) = \tfrac{1}{\sqrt{n}}\u(\bts) - \{\fI(\bts) + o_p(1)\} \sqrt{n}(\bth-\bts).\]

If $\bth$ is $\sqrt{n}$-consistent, then

\[\tfrac{1}{\sqrt{n}}\u(\bth) = \tfrac{1}{\sqrt{n}}\u(\bts) - \fI(\bts) \sqrt{n}(\bth-\bts) + o_p(1).\]

Proof: Taking a Taylor series expansion of each contribution to the score vector about $\bts$, we have

\[\u_i(\bth) = \u_i(\bts) - \left[ \oI_i(\bts) + M(x_i)O(\norm{\bth-\bts})\one \right]\Tr (\bth-\bts).\]

Summing these contributions and dividing by $\sqrt{n}$,

\[\tfrac{1}{\sqrt{n}}\u(\bth) = \tfrac{1}{\sqrt{n}}\u(\bts) - \left[ \tfrac{1}{n}\oI_n(\bts) + \left\{\tfrac{1}{n}\sum_{i=1}^n M(x_i)\right\} O(\norm{\bth-\bts})\one \right]\Tr \sqrt{n}(\bth-\bts).\]

Now note that

$\tfrac{1}{n} \oI_n(\bts) \inP \fI(\bts)$ by the Fisher information theorem;
$\tfrac{1}{n}\sum_{i=1}^n M(x_i) = O_p(1)$ by C(iii);
$\norm{\bth-\bts} = o_p(1)$ because $\bth$ is consistent.

Thus, since $O_p(1) \cdot o_p(1) = o_p(1)$, the entire term inside the square brackets converges in probability to $\fI(\bts)$; that is, it equals $\fI(\bts) + o_p(1)$, which gives the first result.

Finally, if $\bth$ is $\sqrt{n}$-consistent, then $\sqrt{n}(\bth-\bts) = O_p(1)$, so $o_p(1)\sqrt{n}(\bth-\bts) = o_p(1)$, which gives the second result.
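As a rough numerical illustration of the $\sqrt{n}$-consistent case, the sketch below uses an exponential model with rate $\theta$, which is an assumption chosen for convenience and not a model taken from the text. There the score is $n/\theta - \sum x_i$ and the per-observation Fisher information is $1/\theta^2$. The estimator is the MLE shifted by $1/\sqrt{n}$, a hypothetical choice that is $\sqrt{n}$-consistent but does not solve the score equation. The function names (`score`, `fisher_info`, `remainder`) are illustrative only.

```python
import math
import random

def score(theta, xs):
    """Score of an Exponential(rate=theta) sample: u(theta) = n/theta - sum(x_i)."""
    return len(xs) / theta - math.fsum(xs)

def fisher_info(theta):
    """Per-observation Fisher information for the exponential rate: 1/theta^2."""
    return 1.0 / theta**2

def remainder(xs, theta_hat, theta_star):
    """Left side minus right side of the sqrt(n)-consistent expansion."""
    root_n = math.sqrt(len(xs))
    lhs = score(theta_hat, xs) / root_n
    rhs = (score(theta_star, xs) / root_n
           - fisher_info(theta_star) * root_n * (theta_hat - theta_star))
    return lhs - rhs

random.seed(1)
theta_star = 2.0  # assumed true rate for the simulation
remainders = {}
for n in (100, 10_000, 1_000_000):
    xs = [random.expovariate(theta_star) for _ in range(n)]
    mle = n / math.fsum(xs)
    # Hypothetical estimator: sqrt(n)-consistent, but deliberately not the MLE
    theta_hat = mle + 1.0 / math.sqrt(n)
    remainders[n] = remainder(xs, theta_hat, theta_star)
    print(f"n = {n:>9,d}   remainder = {remainders[n]: .5f}")
```

In this particular model the remainder works out algebraically to $\sqrt{n}(\bth-\bts)^2/(\bth\,\theta^{*2})$, so it should shrink at roughly the $1/\sqrt{n}$ rate as $n$ grows, in line with the $o_p(1)$ term in the theorem.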

Corollary: Under the same regularity conditions, for any two consistent estimators $\bth_1$ and $\bth_2$, we have

\[\tfrac{1}{\sqrt{n}}\u(\bth_1) = \tfrac{1}{\sqrt{n}}\u(\bth_2) - \{\fI(\bts) + o_p(1)\} \sqrt{n}(\bth_1-\bth_2).\]

If both estimators are $\sqrt{n}$-consistent, then

\[\tfrac{1}{\sqrt{n}}\u(\bth_1) = \tfrac{1}{\sqrt{n}}\u(\bth_2) - \fI(\bts) \sqrt{n}(\bth_1-\bth_2) + o_p(1).\]
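
A short sketch of why the $\sqrt{n}$-consistent case follows from the theorem: apply the second expansion to $\bth_1$ and to $\bth_2$ separately and subtract, so that the $\tfrac{1}{\sqrt{n}}\u(\bts)$ terms cancel and the two $o_p(1)$ remainders combine into one.

\[\begin{aligned}
\tfrac{1}{\sqrt{n}}\u(\bth_1) - \tfrac{1}{\sqrt{n}}\u(\bth_2)
&= -\fI(\bts)\sqrt{n}(\bth_1-\bts) + \fI(\bts)\sqrt{n}(\bth_2-\bts) + o_p(1) \\
&= -\fI(\bts)\sqrt{n}(\bth_1-\bth_2) + o_p(1).
\end{aligned}\]

This subtraction argument covers only the $\sqrt{n}$-consistent case. For estimators that are merely consistent, the $o_p(1)\sqrt{n}(\bth_j-\bts)$ terms need not combine into $o_p(1)\sqrt{n}(\bth_1-\bth_2)$, so the first display would presumably be obtained by repeating the Taylor expansion about $\bth_2$ in place of $\bts$.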