Chapter 2 Entropy, Relative Entropy & Mutual Information

  • DEF Entropy $H(X)$ of a discrete random variable $X$ is defined by $H(X) = -\sum_x p(x) \log p(x)$.

    • PROP $H(X) \geq 0$.
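
A minimal numerical sketch of the definition (the helper name `entropy` and the example distributions are illustrative, not from the text); with base-2 logs the result is in bits, and terms with $p(x) = 0$ contribute $0$ by the usual convention:

```python
import math

def entropy(p):
    """H(X) = -sum_x p(x) log2 p(x), in bits; terms with p(x) = 0 contribute 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit for a fair coin
print(entropy([0.9, 0.1]))   # ~0.469 bits for a biased coin
print(entropy([1.0, 0.0]))   # 0.0 bits for a deterministic outcome, consistent with H(X) >= 0
```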

  • DEF Joint Entropy $H(X,Y)$ of a pair of discrete random variables $(X, Y)$ with a joint distribution $p(x,y)$ is defined as $H(X,Y) = -\sum_{x,y} p(x,y) \log p(x,y)$.

  • DEF Conditional Entropy $H(Y|X)$ is defined as $H(Y|X) = \sum_x p(x) H(Y|X=x) = -\sum_x p(x) \sum_y p(y|x) \log p(y|x) = -\sum_{x,y} p(x,y) \log p(y|x)$.
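
A sketch computing $H(X,Y)$ and $H(Y|X)$ directly from a small joint table (the table values and function names are my own illustration, not from the text):

```python
import math

# Example joint distribution p(x, y), stored as {(x, y): probability}.
p_xy = {('a', 0): 0.25, ('a', 1): 0.25,
        ('b', 0): 0.40, ('b', 1): 0.10}

def joint_entropy(p_xy):
    # H(X,Y) = -sum_{x,y} p(x,y) log2 p(x,y)
    return -sum(p * math.log2(p) for p in p_xy.values() if p > 0)

def conditional_entropy(p_xy):
    # H(Y|X) = -sum_{x,y} p(x,y) log2 p(y|x), where p(y|x) = p(x,y) / p(x)
    p_x = {}
    for (x, _), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return -sum(p * math.log2(p / p_x[x])
                for (x, _), p in p_xy.items() if p > 0)

print(joint_entropy(p_xy))        # H(X,Y) ~ 1.861 bits
print(conditional_entropy(p_xy))  # H(Y|X) ~ 0.861 bits
```

Since $H(X) = 1$ bit for this table, these values already illustrate the chain rule stated in the theorem below.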

  • THEOREM $H(X,Y) = H(X) + H(Y|X)$.

    • COR $H(X,Y|Z) = H(X|Z) + H(Y|X,Z)$.

    • PROOF Expand $H(X,Y,Z)$ two ways: $H(X,Y,Z) = H(X,Y|Z) + H(Z)$ and $H(X,Y,Z) = H(Y|X,Z) + H(X,Z) = H(Y|X,Z) + H(X|Z) + H(Z)$. Equating the two and cancelling $H(Z)$ gives the corollary.
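
As a numerical sanity check on the corollary, the sketch below (the random joint distribution and all helper names are my own, not from the text) computes $H(X,Y|Z)$, $H(X|Z)$, and $H(Y|X,Z)$ directly from the conditional-entropy definition and confirms the identity:

```python
import itertools
import math
import random

random.seed(0)

# A random joint pmf p(x, y, z) over three binary variables.
outcomes = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in outcomes]
total = sum(weights)
p = {o: w / total for o, w in zip(outcomes, weights)}

def marginal(p, keep):
    """Marginal pmf over the coordinate indices listed in `keep`."""
    m = {}
    for outcome, q in p.items():
        key = tuple(outcome[i] for i in keep)
        m[key] = m.get(key, 0.0) + q
    return m

def cond_entropy(p, target, given):
    """H(target | given) = -sum p(given, target) log2 p(target | given)."""
    joint = marginal(p, given + target)   # keys ordered as given..target
    cond = marginal(p, given)
    return -sum(q * math.log2(q / cond[key[:len(given)]])
                for key, q in joint.items() if q > 0)

X, Y, Z = 0, 1, 2
lhs = cond_entropy(p, [X, Y], [Z])                               # H(X,Y|Z)
rhs = cond_entropy(p, [X], [Z]) + cond_entropy(p, [Y], [X, Z])   # H(X|Z) + H(Y|X,Z)
print(abs(lhs - rhs) < 1e-12)   # True
```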

  • DEF Kullback-Leibler Distance / Relative Entropy $D(p\|q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$.

    • PROP Conventions: $0 \log \frac{0}{0} = 0$, $0 \log \frac{0}{q} = 0$, and $p \log \frac{p}{0} = +\infty$ for $p > 0$.
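
A sketch of $D(p\|q)$ that applies these conventions explicitly (the example distributions are arbitrary choices of mine):

```python
import math

def kl_divergence(p, q):
    """D(p||q) = sum_x p(x) log2(p(x)/q(x)), with the zero-probability conventions."""
    d = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue            # 0 log(0/q) = 0
        if qx == 0:
            return math.inf     # p log(p/0) = +inf when p > 0
        d += px * math.log2(px / qx)
    return d

p = [0.5, 0.5]
q = [0.75, 0.25]
print(kl_divergence(p, q))   # ~0.208 bits
print(kl_divergence(q, p))   # ~0.189 bits: D is not symmetric, so it is not a true metric
print(kl_divergence(p, p))   # 0.0
```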

  • DEF Mutual Information $I(X;Y)$ is the relative entropy between the joint distribution $p(x,y)$ and the product distribution $p(x)p(y)$: $I(X;Y) = D(p(x,y) \| p(x)p(y))$.

    • PROP $I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)} = H(X) - H(X|Y)$.

    • PROP $I(X;X) = H(X)$.

    • PROP $I(X;Y) = H(X) + H(Y) - H(X,Y)$.

    • The relations among entropy, joint entropy, conditional entropy, and mutual information can be represented by a Venn diagram: $H(X)$ and $H(Y)$ are two overlapping circles, their union is $H(X,Y)$, their intersection is $I(X;Y)$, and the parts of each circle outside the overlap are $H(X|Y)$ and $H(Y|X)$.
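
The sketch below (an illustrative example with names of my own) computes $I(X;Y)$ in the three ways listed above; the three values coincide, which is the Venn-diagram picture in numbers:

```python
import math

# Example joint distribution p(x, y).
p_xy = {('a', 0): 0.25, ('a', 1): 0.25,
        ('b', 0): 0.40, ('b', 1): 0.10}

p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

def H(dist):
    # Entropy of a pmf given as a dict of probabilities.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# I(X;Y) = D(p(x,y) || p(x)p(y))
mi_kl = sum(p * math.log2(p / (p_x[x] * p_y[y]))
            for (x, y), p in p_xy.items() if p > 0)

# I(X;Y) = H(X) - H(X|Y), using H(X|Y) = H(X,Y) - H(Y) from the chain rule
mi_cond = H(p_x) - (H(p_xy) - H(p_y))

# I(X;Y) = H(X) + H(Y) - H(X,Y)
mi_sum = H(p_x) + H(p_y) - H(p_xy)

print(mi_kl, mi_cond, mi_sum)   # all three agree up to floating-point rounding
```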
