Taylor series for vector-valued functions

Unfortunately, Lagrange-type (mean value) results do not hold for vector-valued functions. In other words, there need not exist a single point $\bar{\x}$ on the line segment connecting $\x$ and $\x_0$ such that

\[\f(\x) = \f(\x_0) + \nabla \f(\bar{\x}) \Tr (\x - \x_0);\]

such a point exists for each element of $\f$ separately, but these points will generally not be the same.
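
As a concrete illustration (a standard counterexample, not taken from the surrounding text), take $d = 1$, $k = 2$, and $\f(t) = (\cos t, \sin t)$ with $x_0 = 0$ and $x = 2\pi$. Then $\f(2\pi) - \f(0) = (0, 0)$, but for every $\bar{t}$,

\[\nabla \f(\bar{t}) \Tr (2\pi - 0) = 2\pi \, (-\sin \bar{t}, \; \cos \bar{t}) \neq (0, 0),\]

since this vector always has length $2\pi$. Component by component, the first element requires $\sin \bar{t}_1 = 0$, so $\bar{t}_1 = \pi$, while the second requires $\cos \bar{t}_2 = 0$, so $\bar{t}_2 \in \{\pi/2, 3\pi/2\}$; no single point works for both.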

However, big O versions of the Taylor series are still valid, as in the following theorem.

Theorem: Suppose $\f:\real^d \to \real^k$ is twice differentiable on $N_r(\x_0)$, and that $\nabla^2 f_j$ is bounded on $N_r(\x_0)$ for each component $f_j$ of $\f$. Then for any $\x \in N_r(\x_0)$,

\[\f(\x) = \f(\x_0) + \left[ \nabla \f(\x_0) + O(\norm{\x-\x_0}) \right] \Tr (\x - \x_0),\]

where $O(\cdot)$ applies to each element of the $d \times k$ matrix.

Proof: For any individual component $j$ of $\f$, there exists $\bar{\x}_j$ on the line segment connecting $\x$ and $\x_0$ such that

\[\as{ f_j(\x) &= f_j(\x_0) + \nabla f_j(\x_0)\Tr(\x-\x_0) + \tfrac{1}{2}(\x-\x_0)\Tr \nabla^2 f_j(\bar{\x}_j)(\x-\x_0) \\ &= f_j(\x_0) + \left[\nabla f_j(\x_0) + \tfrac{1}{2}\nabla^2 f_j(\bar{\x}_j) (\x-\x_0) \right]\Tr (\x-\x_0) \\ &= f_j(\x_0) + \left[\nabla f_j(\x_0) + O(1) (\x-\x_0) \right]\Tr (\x-\x_0) \\ }\]

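because $\nabla^2 f_j$ is bounded, so each element of the vector $\tfrac{1}{2}\nabla^2 f_j(\bar{\x}_j)(\x-\x_0)$ is $O(\norm{\x-\x_0})$. Stacking these $k$ individual equations into a system of equations, we obtain the result stated in the theorem.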
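
The theorem implies that the linearization error $\f(\x) - \f(\x_0) - \nabla \f(\x_0) \Tr (\x - \x_0)$ is $O(\norm{\x-\x_0}^2)$. The following is a minimal numerical sketch of that claim, assuming NumPy is available; the map `f`, its hand-derived `jacobian`, the base point, and the direction are all hypothetical choices made only for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical smooth map from R^2 to R^2, chosen only for illustration
    return np.array([np.sin(x[0]) * x[1], np.exp(x[0]) + x[1] ** 2])

def jacobian(x):
    # k x d Jacobian of f, i.e. the transpose of the d x k matrix
    # written as nabla f in the text
    return np.array([
        [np.cos(x[0]) * x[1], np.sin(x[0])],
        [np.exp(x[0]), 2.0 * x[1]],
    ])

def remainder_ratios(x0, direction, steps):
    """Return ||f(x) - f(x0) - J(x0)(x - x0)|| / ||x - x0||^2 for x = x0 + h * direction."""
    ratios = []
    for h in steps:
        delta = h * direction
        remainder = f(x0 + delta) - f(x0) - jacobian(x0) @ delta
        ratios.append(np.linalg.norm(remainder) / np.linalg.norm(delta) ** 2)
    return np.array(ratios)

x0 = np.array([0.5, -1.0])                       # arbitrary base point
direction = np.array([1.0, 2.0]) / np.sqrt(5.0)  # arbitrary unit direction
steps = np.array([1e-1, 1e-2, 1e-3])             # shrinking step sizes
ratios = remainder_ratios(x0, direction, steps)
print(ratios)
```

If the theorem applies, the printed ratios should stay bounded as the step shrinks rather than blow up, which is what a second-order remainder looks like numerically.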

In statistics, this theorem is usually applied to the score (the gradient of the log-likelihood), which is a vector-valued function of the parameter.