Gibbs’ inequality, also known as the Shannon-Kolmogorov information inequality, states that the Kullback-Leibler divergence is always non-negative.

Theorem (Shannon-Kolmogorov information inequality; Gibbs’ inequality): If \(F\) and \(G\) are two distributions with the same support and densities \(f\) and \(g\), then

\[\KL(F|G) \geq 0.\]

The inequality is strict unless \(F = G\), that is, unless \(f(x) = g(x)\) almost everywhere.
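As a quick numerical sanity check (a minimal sketch, not part of the proof, assuming NumPy and a made-up pair of three-point distributions):

```python
import numpy as np

# Two hypothetical discrete distributions on the same three-point support.
f = np.array([0.5, 0.3, 0.2])
g = np.array([0.4, 0.4, 0.2])

# KL(F|G) = sum_x f(x) * log(f(x) / g(x))
kl_fg = np.sum(f * np.log(f / g))
kl_ff = np.sum(f * np.log(f / f))  # a distribution against itself

print(f"KL(F|G) = {kl_fg:.6f}")  # strictly positive, since f != g
print(f"KL(F|F) = {kl_ff:.6f}")  # exactly zero
```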

Below, two different proofs are given. Note that both proofs rely on \(f\) and \(g\) integrating to 1 over the same support. If instead \(F\) assigns positive mass to a region where \(G\) assigns none, Gibbs’ inequality holds trivially, since the KL divergence is then infinite by definition.
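To illustrate that degenerate case with another made-up example (again a sketch, not part of the proofs): if \(F\) puts mass on a point where \(G\) puts none, the divergence is infinite and the inequality is immediate.

```python
import numpy as np

f = np.array([0.5, 0.3, 0.2])
g = np.array([0.7, 0.0, 0.3])  # hypothetical G with zero mass at the second point

# The term 0.3 * log(0.3 / 0) is +infinity, so the whole sum is infinite.
with np.errstate(divide="ignore"):  # the divide-by-zero is intentional here
    kl = np.sum(f * np.log(f / g))

print(kl)  # inf
```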

Proof: First, we note that \(\log(x) \leq x-1\), with equality holding only at \(x=1\). Thus,

\[\log\left(\frac{g(x)}{f(x)}\right) \leq \frac{g(x)}{f(x)}-1\]

and, multiplying both sides by \(f(x)\),

\[f(x)\log\left(\frac{g(x)}{f(x)}\right) \leq g(x)-f(x).\]

Integrating both sides over the support, and using \(\int f(x)\,dx = \int g(x)\,dx = 1\) so that the right-hand side integrates to zero, we arrive at

\[-\KL(F|G) \leq 0.\]

Finally, note that equality holds only when \(f(x)=g(x)\) wherever \(F\) assigns positive measure, since \(\log(x) = x-1\) only at \(x=1\).
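The same made-up distributions from the sketch above can be used to trace this argument numerically (NumPy assumed; the "integrals" here are sums): the pointwise bound holds at every support point, and the right-hand side sums to exactly zero.

```python
import numpy as np

f = np.array([0.5, 0.3, 0.2])
g = np.array([0.4, 0.4, 0.2])

lhs = f * np.log(g / f)  # f(x) * log(g(x)/f(x)), pointwise
rhs = g - f              # g(x) - f(x), pointwise

assert np.all(lhs <= rhs + 1e-12)                 # the pointwise bound
print(f"sum(lhs) = -KL(F|G) = {lhs.sum():+.6f}")  # at most 0
print(f"sum(rhs) = {rhs.sum():+.6f}")             # exactly 0
```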

Proof: Alternatively, we can use convexity. By Jensen’s inequality (applied to the convex function \(-\log\) and the random variable \(g(X)/f(X)\), with \(X \sim F\)), we have

\[\as{\KL(F|G) &= \Ex_f \left\{ -\log \frac{g(X)}{f(X)} \right\} \\ &\geq -\log \Ex_f \frac{g(X)}{f(X)} \\ &= 0, }\]

where the final equality follows from \(-\log(1) = 0\), since

\[\Ex_f \frac{g(X)}{f(X)} = \int \frac{g(x)}{f(x)}f(x)\,dx = \int g(x)\,dx = 1.\]

Equality in Jensen’s inequality holds only when \(g(X)/f(X)\) is constant almost surely; because both densities integrate to 1, that constant must be 1, so the inequality is strict unless \(f = g\).
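The Jensen step can also be traced with the same hypothetical distributions (a sketch only, with NumPy assumed): the expectation of \(-\log\) of the ratio dominates \(-\log\) of the expectation, and the latter is \(-\log(1) = 0\).

```python
import numpy as np

f = np.array([0.5, 0.3, 0.2])
g = np.array([0.4, 0.4, 0.2])

ratio = g / f                       # the random variable g(X)/f(X), with X ~ F
lhs = np.sum(f * (-np.log(ratio)))  # E_f[-log(g/f)] = KL(F|G)
rhs = -np.log(np.sum(f * ratio))    # -log(E_f[g/f]) = -log(1) = 0

print(f"E_f[-log(g/f)] = {lhs:.6f}")  # the KL divergence, non-negative
print(f"-log(E_f[g/f]) = {rhs:.6f}")  # zero, up to floating-point error
```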