Single parameter

A one-parameter exponential family has the form

\[p(x|\theta) = \exp\{ s\theta - \psi(\theta) \} p_0(x),\]

where

\(\theta\) is the natural parameter;
\(s\) is the natural statistic;
\(\psi(\theta)\) is the cumulant generating function;
\(p_0\) is the base or reference distribution, although it need not be a proper distribution.
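As a concrete illustration (an example chosen here, not one given in the text above), the Poisson distribution with mean \(\lambda\) can be written in this form by taking \(\theta = \log \lambda\), \(s = x\), \(\psi(\theta) = e^\theta\), and \(p_0(x) = 1/x!\). The sketch below checks numerically that this agrees with the usual Poisson mass function; the function names are hypothetical and exist only for this example.

```python
import math


def poisson_expfam_pmf(x, theta):
    """Evaluate exp{s*theta - psi(theta)} * p0(x) for the Poisson case."""
    s = x                          # natural statistic
    psi = math.exp(theta)          # cumulant generating function psi(theta) = e^theta
    p0 = 1.0 / math.factorial(x)   # base measure 1/x!
    return math.exp(s * theta - psi) * p0


def poisson_pmf(x, lam):
    """Usual Poisson mass function, used only as a reference."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


lam = 2.5                  # illustrative mean (an assumption of this example)
theta = math.log(lam)      # natural parameter

# Largest discrepancy between the two forms over x = 0, ..., 19
max_abs_diff = max(
    abs(poisson_expfam_pmf(x, theta) - poisson_pmf(x, lam)) for x in range(20)
)

# The base measure 1/x! sums to e, not 1
p0_total = sum(1.0 / math.factorial(x) for x in range(60))
```

In this example `p0_total` is approximately \(e\) rather than 1, which illustrates that \(p_0\) need not be a proper distribution.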

Multiple parameter

All of these concepts extend in a straightforward way to the \(d\)-parameter exponential family:

\[p(x|\bt) = \exp\{ \s \Tr \bt - \psi(\bt) \} p_0(x),\]

where \(\bt\) and \(\s\) are now \(d \times 1\) vectors.
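For instance (an example chosen here for illustration), the normal distribution with unknown mean \(\mu\) and variance \(\sigma^2\) is a two-parameter family with natural statistic \((x, x^2)\), natural parameter \((\mu/\sigma^2, -1/(2\sigma^2))\), cumulant generating function \(\psi = -\theta_1^2/(4\theta_2) - \tfrac{1}{2}\log(-2\theta_2)\), and \(p_0(x) = 1/\sqrt{2\pi}\). The following sketch, with hypothetical function names, compares this form against the familiar density.

```python
import math


def normal_natural_params(mu, sigma2):
    """Map (mu, sigma^2) to the natural parameter (theta1, theta2)."""
    return (mu / sigma2, -1.0 / (2.0 * sigma2))


def psi(theta):
    """Cumulant generating function of the normal family in natural form."""
    t1, t2 = theta
    return -t1 ** 2 / (4.0 * t2) - 0.5 * math.log(-2.0 * t2)


def normal_expfam_pdf(x, theta):
    """Evaluate exp{s' theta - psi(theta)} * p0(x) with s = (x, x^2)."""
    s = (x, x * x)
    inner = s[0] * theta[0] + s[1] * theta[1]   # s' theta
    p0 = 1.0 / math.sqrt(2.0 * math.pi)         # base measure, constant in x
    return math.exp(inner - psi(theta)) * p0


def normal_pdf(x, mu, sigma2):
    """Usual normal density, used only as a reference."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


mu, sigma2 = 1.5, 2.0      # illustrative values (assumptions of this example)
theta = normal_natural_params(mu, sigma2)
max_abs_diff = max(
    abs(normal_expfam_pdf(x, theta) - normal_pdf(x, mu, sigma2))
    for x in [-3.0, -1.0, 0.0, 1.5, 4.0]
)
```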

Curved exponential family

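Note that \(d\) (the dimension of \(\bt\) and \(\s\)) is not necessarily the same as the number of unknown parameters. When \(\bt\) is restricted to a lower-dimensional curved subset of the \(d\)-dimensional space, the family is called a curved exponential family.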

For example, suppose \(x \sim \Norm(\mu, c^2 \mu^2)\), where \(c\), the coefficient of variation, is known. The natural statistic \(\s = (x, x^2)\) and natural parameter \(\bt = (1/(c^2\mu), -1/(2c^2\mu^2))\) are 2-dimensional, but there is only one unknown parameter, \(\mu\). Because \(\theta_2 = -c^2\theta_1^2/2\), the parameter space \(\bT\) is a one-dimensional parabola curving through \(\real^2\).
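The parabola can be checked numerically. The sketch below assumes the standard normal natural parameterization \(\theta_1 = \mu/\sigma^2\), \(\theta_2 = -1/(2\sigma^2)\) with \(\sigma^2 = c^2\mu^2\) substituted in; the function name and the particular values of \(c\) and \(\mu\) are hypothetical.

```python
def curved_natural_params(mu, c):
    """Natural parameter of N(mu, c^2 mu^2), with the variance tied to the mean."""
    sigma2 = (c * mu) ** 2
    return (mu / sigma2, -1.0 / (2.0 * sigma2))


c = 0.5                                  # known coefficient of variation (illustrative)
mus = [-3.0, -0.5, 0.7, 2.0, 10.0]       # several values of the single unknown parameter

# Each mu maps to a point in R^2; collect how far each lies from the
# parabola theta2 = -(c^2 / 2) * theta1^2.
residuals = []
for mu in mus:
    theta1, theta2 = curved_natural_params(mu, c)
    residuals.append(abs(theta2 + 0.5 * c ** 2 * theta1 ** 2))

max_residual = max(residuals)
```

Every value of \(\mu\) lands on the same curve up to floating-point error, so varying the one free parameter traces out a one-dimensional path through the two-dimensional natural parameter space.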

Exponential dispersion family

A variation on exponential tilting, and one that is often very useful in statistical modeling, is to introduce a dispersion parameter and tilt by \(\exp\{ \s \Tr \bt / \phi \}\). The resulting model is then of the form

\[p(x|\bt,\phi) = \exp \left\{ \frac{\s \Tr \bt - \psi(\bt)}{\phi} \right\} p_0(x, \phi).\]

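This family of distributions is called the exponential dispersion family. Note that the normalizing constant is now \(\exp\{\psi(\bt)/\phi \}\); that is, for continuous \(x\), \(\int \exp\{ \s \Tr \bt / \phi \} p_0(x, \phi) \, dx = \exp\{\psi(\bt)/\phi \}\).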
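As an illustration (again an example chosen here, not one from the text), the normal distribution fits this form with \(\theta = \mu\), \(\phi = \sigma^2\), \(s = x\), \(\psi(\theta) = \theta^2/2\), and \(p_0(x, \phi)\) equal to the \(\Norm(0, \phi)\) density. The sketch below, using hypothetical function names and a simple Riemann sum, checks both the density and the stated normalizing constant.

```python
import math


def psi(theta):
    """Cumulant generating function for the normal mean: theta^2 / 2."""
    return 0.5 * theta ** 2


def p0(x, phi):
    """Base density p0(x, phi): the N(0, phi) density, which depends on phi."""
    return math.exp(-x * x / (2.0 * phi)) / math.sqrt(2.0 * math.pi * phi)


def dispersion_pdf(x, theta, phi):
    """Evaluate exp{(s*theta - psi(theta)) / phi} * p0(x, phi) with s = x."""
    return math.exp((x * theta - psi(theta)) / phi) * p0(x, phi)


def normal_pdf(x, mu, sigma2):
    """Usual normal density, used only as a reference."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


theta, phi = 1.3, 0.8      # illustrative values (assumptions of this example)

max_abs_diff = max(
    abs(dispersion_pdf(x, theta, phi) - normal_pdf(x, theta, phi))
    for x in [-2.0, -0.5, 0.0, 1.3, 4.0]
)

# Riemann sum for the integral of exp{s*theta/phi} * p0(x, phi) over [-15, 15]
h = 0.01
grid = [-15.0 + h * i for i in range(3001)]
tilt_integral = h * sum(math.exp(x * theta / phi) * p0(x, phi) for x in grid)
normalizing_constant = math.exp(psi(theta) / phi)
```

Under these assumptions, `tilt_integral` and `normalizing_constant` agree closely, consistent with the normalizing constant \(\exp\{\psi(\theta)/\phi\}\).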