Entropy and total variation distance
Clash Royale CLAN TAG#URR8PPP
up vote
7
down vote
favorite
Let $X$, $Y$ be discrete random variables taking values within the same set of $N$ elements. Let the total variation distance $|P-Q|$ (which is half the $L_1$ distance between the distributions of $P$ and $Q$) be at most $epsilon$. What is a standard, easily provable bound on the difference between the entropies of $X$ and $Y$?
(It is easy to see that such a bound must depend not only on $epsilon$ but also on $N$.)
pr.probability probability-distributions inequalities it.information-theory entropy
add a comment |Â
up vote
7
down vote
favorite
Let $X$, $Y$ be discrete random variables taking values within the same set of $N$ elements. Let the total variation distance $|P-Q|$ (which is half the $L_1$ distance between the distributions of $P$ and $Q$) be at most $epsilon$. What is a standard, easily provable bound on the difference between the entropies of $X$ and $Y$?
(It is easy to see that such a bound must depend not only on $epsilon$ but also on $N$.)
pr.probability probability-distributions inequalities it.information-theory entropy
add a comment |Â
up vote
7
down vote
favorite
up vote
7
down vote
favorite
Let $X$, $Y$ be discrete random variables taking values within the same set of $N$ elements. Let the total variation distance $|P-Q|$ (which is half the $L_1$ distance between the distributions of $P$ and $Q$) be at most $epsilon$. What is a standard, easily provable bound on the difference between the entropies of $X$ and $Y$?
(It is easy to see that such a bound must depend not only on $epsilon$ but also on $N$.)
pr.probability probability-distributions inequalities it.information-theory entropy
Let $X$, $Y$ be discrete random variables taking values within the same set of $N$ elements. Let the total variation distance $|P-Q|$ (which is half the $L_1$ distance between the distributions of $P$ and $Q$) be at most $epsilon$. What is a standard, easily provable bound on the difference between the entropies of $X$ and $Y$?
(It is easy to see that such a bound must depend not only on $epsilon$ but also on $N$.)
pr.probability probability-distributions inequalities it.information-theory entropy
pr.probability probability-distributions inequalities it.information-theory entropy
edited 2 hours ago
Iosif Pinelis
14.8k12154
14.8k12154
asked 4 hours ago
H A Helfgott
4,1582476
4,1582476
add a comment |Â
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
1
down vote
$newcommanddedelta
newcommandepepsilon$
Note that $pln p-p$ is decreasing in $pin[0,1]$, so that $pln p-ple qln q-q$ and hence $pln p-qln qle p-q=|p-q|$ if $0le qle ple1$.
Next, take any real $cge1$. Note that $g(p):=pln p+cp$ (with $g(0):=0$) is convex in $pin[0,1]$. So, assuming $0le ple qle1$ and $qge e^-c$, we have
$g(p)le g(0)vee g(q)=0vee g(q)=g(q)$ and hence
$pln p-qln qle c(q-p)=c|p-q|$.
Thus,
beginequation
pln p-qln qle c|p-q|
endequation
whenever $0le p,qle1$ and $qge e^-c$.
Also, $-qln q$ is increasing in $qin[0,e^-1]$ and hence in $qin[0,e^-c]$, so that $-qln qle ce^-c$ for $qin[0,e^-c]$. Also, $pln ple0$ if $0le ple1$.
Therefore, the difference between the entropies of $Q=(q_i)_i=1^N$ and $P=(p_i)_i=1^N$ is
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile S_1+S_2+S_3,
endequation
where
beginalign*
S_1 &:=sum_icolon q_ige e^-c (p_iln p_i-q_iln q_i)le sum_icolon q_ige e^-cc|p_i-q_i|
le cde quadtextif dege|P-Q|_1, \
S_2 &:= sum_icolon q_i< e^-cp_iln p_ile0, \
S_3 &:= sum_icolon q_i< e^-c(-q_iln q_i)lesum_icolon q_i< e^-cce^-cle Nce^-c.
endalign*
So, taking now $c=lnfrac Nde$ and assuming $Nge ede$, we see that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
cde+Nce^-c
=2delnfrac Nde.
endequation
Taking here $de=ep/2$ and noting that $Nge1$, we conclude that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
eplnfrac2Nep
endequation
if $eple2/e$.
add a comment |Â
up vote
0
down vote
Claim. If $|P-Q|leqvarepsilonleqfrac12$, then $|H(P)-H(Q)| leq H(varepsilon) + varepsilonlog N$.
Proof.
Let $varepsilon':=|P-Q|$.
Let $(X,Y)$ be an optimal coupling of $P$ and $Q$, so that
beginalign
mathbbP(Xneq Y) = |P-Q| ;.
endalign
Using a standard construction, we can assume that $X$ and $Y$ have the particular form
beginalign
X &:=
begincases
Z & textif $B=0$, \
tildeX & textif $B=1$,
endcases &
Y &:=
begincases
Z & textif $B=0$, \
tildeY & textif $B=1$,
endcases
endalign
where $B$, $Z$ and $(tildeX,tildeY)$ are independent and $BsimtextBern(varepsilon')$.
Note that
beginalign
H(X|B) leq H(X) leq H(B) + H(X|B) ;.
endalign
For $H(X|B)$ we can write
beginalign
H(X|B) &= varepsilon' H(X|B=1) + (1-varepsilon') H(X|B=0) \
&= varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;.
endalign
Thus,
beginalign
varepsilon' H(tildeX) + (1-varepsilon') H(Z) &leq H(X)
leq H(B) + varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;,
tag$clubsuit$
endalign
and similarly,
beginalign
varepsilon' H(tildeY) + (1-varepsilon') H(Z) &leq H(Y)
leq H(B) + varepsilon' H(tildeY) + (1-varepsilon') H(Z) ;.
tag$spadesuit$
endalign
Combining ($clubsuit$) and ($spadesuit$) we get
beginalign
|H(X)-H(Y)| &leq
H(B) + varepsilon' |H(tildeX) - H(tildeY)| \
&leq H(varepsilon') + varepsilon' log N \
&leq H(varepsilon) + varepsilon log N ;,
endalign
as claimed.
QED
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
$newcommanddedelta
newcommandepepsilon$
Note that $pln p-p$ is decreasing in $pin[0,1]$, so that $pln p-ple qln q-q$ and hence $pln p-qln qle p-q=|p-q|$ if $0le qle ple1$.
Next, take any real $cge1$. Note that $g(p):=pln p+cp$ (with $g(0):=0$) is convex in $pin[0,1]$. So, assuming $0le ple qle1$ and $qge e^-c$, we have
$g(p)le g(0)vee g(q)=0vee g(q)=g(q)$ and hence
$pln p-qln qle c(q-p)=c|p-q|$.
Thus,
beginequation
pln p-qln qle c|p-q|
endequation
whenever $0le p,qle1$ and $qge e^-c$.
Also, $-qln q$ is increasing in $qin[0,e^-1]$ and hence in $qin[0,e^-c]$, so that $-qln qle ce^-c$ for $qin[0,e^-c]$. Also, $pln ple0$ if $0le ple1$.
Therefore, the difference between the entropies of $Q=(q_i)_i=1^N$ and $P=(p_i)_i=1^N$ is
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile S_1+S_2+S_3,
endequation
where
beginalign*
S_1 &:=sum_icolon q_ige e^-c (p_iln p_i-q_iln q_i)le sum_icolon q_ige e^-cc|p_i-q_i|
le cde quadtextif dege|P-Q|_1, \
S_2 &:= sum_icolon q_i< e^-cp_iln p_ile0, \
S_3 &:= sum_icolon q_i< e^-c(-q_iln q_i)lesum_icolon q_i< e^-cce^-cle Nce^-c.
endalign*
So, taking now $c=lnfrac Nde$ and assuming $Nge ede$, we see that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
cde+Nce^-c
=2delnfrac Nde.
endequation
Taking here $de=ep/2$ and noting that $Nge1$, we conclude that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
eplnfrac2Nep
endequation
if $eple2/e$.
add a comment |Â
up vote
1
down vote
$newcommanddedelta
newcommandepepsilon$
Note that $pln p-p$ is decreasing in $pin[0,1]$, so that $pln p-ple qln q-q$ and hence $pln p-qln qle p-q=|p-q|$ if $0le qle ple1$.
Next, take any real $cge1$. Note that $g(p):=pln p+cp$ (with $g(0):=0$) is convex in $pin[0,1]$. So, assuming $0le ple qle1$ and $qge e^-c$, we have
$g(p)le g(0)vee g(q)=0vee g(q)=g(q)$ and hence
$pln p-qln qle c(q-p)=c|p-q|$.
Thus,
beginequation
pln p-qln qle c|p-q|
endequation
whenever $0le p,qle1$ and $qge e^-c$.
Also, $-qln q$ is increasing in $qin[0,e^-1]$ and hence in $qin[0,e^-c]$, so that $-qln qle ce^-c$ for $qin[0,e^-c]$. Also, $pln ple0$ if $0le ple1$.
Therefore, the difference between the entropies of $Q=(q_i)_i=1^N$ and $P=(p_i)_i=1^N$ is
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile S_1+S_2+S_3,
endequation
where
beginalign*
S_1 &:=sum_icolon q_ige e^-c (p_iln p_i-q_iln q_i)le sum_icolon q_ige e^-cc|p_i-q_i|
le cde quadtextif dege|P-Q|_1, \
S_2 &:= sum_icolon q_i< e^-cp_iln p_ile0, \
S_3 &:= sum_icolon q_i< e^-c(-q_iln q_i)lesum_icolon q_i< e^-cce^-cle Nce^-c.
endalign*
So, taking now $c=lnfrac Nde$ and assuming $Nge ede$, we see that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
cde+Nce^-c
=2delnfrac Nde.
endequation
Taking here $de=ep/2$ and noting that $Nge1$, we conclude that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
eplnfrac2Nep
endequation
if $eple2/e$.
add a comment |Â
up vote
1
down vote
up vote
1
down vote
$newcommanddedelta
newcommandepepsilon$
Note that $pln p-p$ is decreasing in $pin[0,1]$, so that $pln p-ple qln q-q$ and hence $pln p-qln qle p-q=|p-q|$ if $0le qle ple1$.
Next, take any real $cge1$. Note that $g(p):=pln p+cp$ (with $g(0):=0$) is convex in $pin[0,1]$. So, assuming $0le ple qle1$ and $qge e^-c$, we have
$g(p)le g(0)vee g(q)=0vee g(q)=g(q)$ and hence
$pln p-qln qle c(q-p)=c|p-q|$.
Thus,
beginequation
pln p-qln qle c|p-q|
endequation
whenever $0le p,qle1$ and $qge e^-c$.
Also, $-qln q$ is increasing in $qin[0,e^-1]$ and hence in $qin[0,e^-c]$, so that $-qln qle ce^-c$ for $qin[0,e^-c]$. Also, $pln ple0$ if $0le ple1$.
Therefore, the difference between the entropies of $Q=(q_i)_i=1^N$ and $P=(p_i)_i=1^N$ is
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile S_1+S_2+S_3,
endequation
where
beginalign*
S_1 &:=sum_icolon q_ige e^-c (p_iln p_i-q_iln q_i)le sum_icolon q_ige e^-cc|p_i-q_i|
le cde quadtextif dege|P-Q|_1, \
S_2 &:= sum_icolon q_i< e^-cp_iln p_ile0, \
S_3 &:= sum_icolon q_i< e^-c(-q_iln q_i)lesum_icolon q_i< e^-cce^-cle Nce^-c.
endalign*
So, taking now $c=lnfrac Nde$ and assuming $Nge ede$, we see that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
cde+Nce^-c
=2delnfrac Nde.
endequation
Taking here $de=ep/2$ and noting that $Nge1$, we conclude that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
eplnfrac2Nep
endequation
if $eple2/e$.
$newcommanddedelta
newcommandepepsilon$
Note that $pln p-p$ is decreasing in $pin[0,1]$, so that $pln p-ple qln q-q$ and hence $pln p-qln qle p-q=|p-q|$ if $0le qle ple1$.
Next, take any real $cge1$. Note that $g(p):=pln p+cp$ (with $g(0):=0$) is convex in $pin[0,1]$. So, assuming $0le ple qle1$ and $qge e^-c$, we have
$g(p)le g(0)vee g(q)=0vee g(q)=g(q)$ and hence
$pln p-qln qle c(q-p)=c|p-q|$.
Thus,
beginequation
pln p-qln qle c|p-q|
endequation
whenever $0le p,qle1$ and $qge e^-c$.
Also, $-qln q$ is increasing in $qin[0,e^-1]$ and hence in $qin[0,e^-c]$, so that $-qln qle ce^-c$ for $qin[0,e^-c]$. Also, $pln ple0$ if $0le ple1$.
Therefore, the difference between the entropies of $Q=(q_i)_i=1^N$ and $P=(p_i)_i=1^N$ is
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile S_1+S_2+S_3,
endequation
where
beginalign*
S_1 &:=sum_icolon q_ige e^-c (p_iln p_i-q_iln q_i)le sum_icolon q_ige e^-cc|p_i-q_i|
le cde quadtextif dege|P-Q|_1, \
S_2 &:= sum_icolon q_i< e^-cp_iln p_ile0, \
S_3 &:= sum_icolon q_i< e^-c(-q_iln q_i)lesum_icolon q_i< e^-cce^-cle Nce^-c.
endalign*
So, taking now $c=lnfrac Nde$ and assuming $Nge ede$, we see that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
cde+Nce^-c
=2delnfrac Nde.
endequation
Taking here $de=ep/2$ and noting that $Nge1$, we conclude that
beginequation
sum_1^N p_iln p_i-sum_1^N q_iln q_ile
eplnfrac2Nep
endequation
if $eple2/e$.
edited 11 mins ago
answered 2 hours ago
Iosif Pinelis
14.8k12154
14.8k12154
add a comment |Â
add a comment |Â
up vote
0
down vote
Claim. If $|P-Q|leqvarepsilonleqfrac12$, then $|H(P)-H(Q)| leq H(varepsilon) + varepsilonlog N$.
Proof.
Let $varepsilon':=|P-Q|$.
Let $(X,Y)$ be an optimal coupling of $P$ and $Q$, so that
beginalign
mathbbP(Xneq Y) = |P-Q| ;.
endalign
Using a standard construction, we can assume that $X$ and $Y$ have the particular form
beginalign
X &:=
begincases
Z & textif $B=0$, \
tildeX & textif $B=1$,
endcases &
Y &:=
begincases
Z & textif $B=0$, \
tildeY & textif $B=1$,
endcases
endalign
where $B$, $Z$ and $(tildeX,tildeY)$ are independent and $BsimtextBern(varepsilon')$.
Note that
beginalign
H(X|B) leq H(X) leq H(B) + H(X|B) ;.
endalign
For $H(X|B)$ we can write
beginalign
H(X|B) &= varepsilon' H(X|B=1) + (1-varepsilon') H(X|B=0) \
&= varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;.
endalign
Thus,
beginalign
varepsilon' H(tildeX) + (1-varepsilon') H(Z) &leq H(X)
leq H(B) + varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;,
tag$clubsuit$
endalign
and similarly,
beginalign
varepsilon' H(tildeY) + (1-varepsilon') H(Z) &leq H(Y)
leq H(B) + varepsilon' H(tildeY) + (1-varepsilon') H(Z) ;.
tag$spadesuit$
endalign
Combining ($clubsuit$) and ($spadesuit$) we get
beginalign
|H(X)-H(Y)| &leq
H(B) + varepsilon' |H(tildeX) - H(tildeY)| \
&leq H(varepsilon') + varepsilon' log N \
&leq H(varepsilon) + varepsilon log N ;,
endalign
as claimed.
QED
add a comment |Â
up vote
0
down vote
Claim. If $|P-Q|leqvarepsilonleqfrac12$, then $|H(P)-H(Q)| leq H(varepsilon) + varepsilonlog N$.
Proof.
Let $varepsilon':=|P-Q|$.
Let $(X,Y)$ be an optimal coupling of $P$ and $Q$, so that
beginalign
mathbbP(Xneq Y) = |P-Q| ;.
endalign
Using a standard construction, we can assume that $X$ and $Y$ have the particular form
beginalign
X &:=
begincases
Z & textif $B=0$, \
tildeX & textif $B=1$,
endcases &
Y &:=
begincases
Z & textif $B=0$, \
tildeY & textif $B=1$,
endcases
endalign
where $B$, $Z$ and $(tildeX,tildeY)$ are independent and $BsimtextBern(varepsilon')$.
Note that
beginalign
H(X|B) leq H(X) leq H(B) + H(X|B) ;.
endalign
For $H(X|B)$ we can write
beginalign
H(X|B) &= varepsilon' H(X|B=1) + (1-varepsilon') H(X|B=0) \
&= varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;.
endalign
Thus,
beginalign
varepsilon' H(tildeX) + (1-varepsilon') H(Z) &leq H(X)
leq H(B) + varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;,
tag$clubsuit$
endalign
and similarly,
beginalign
varepsilon' H(tildeY) + (1-varepsilon') H(Z) &leq H(Y)
leq H(B) + varepsilon' H(tildeY) + (1-varepsilon') H(Z) ;.
tag$spadesuit$
endalign
Combining ($clubsuit$) and ($spadesuit$) we get
beginalign
|H(X)-H(Y)| &leq
H(B) + varepsilon' |H(tildeX) - H(tildeY)| \
&leq H(varepsilon') + varepsilon' log N \
&leq H(varepsilon) + varepsilon log N ;,
endalign
as claimed.
QED
add a comment |Â
up vote
0
down vote
up vote
0
down vote
Claim. If $|P-Q|leqvarepsilonleqfrac12$, then $|H(P)-H(Q)| leq H(varepsilon) + varepsilonlog N$.
Proof.
Let $varepsilon':=|P-Q|$.
Let $(X,Y)$ be an optimal coupling of $P$ and $Q$, so that
beginalign
mathbbP(Xneq Y) = |P-Q| ;.
endalign
Using a standard construction, we can assume that $X$ and $Y$ have the particular form
beginalign
X &:=
begincases
Z & textif $B=0$, \
tildeX & textif $B=1$,
endcases &
Y &:=
begincases
Z & textif $B=0$, \
tildeY & textif $B=1$,
endcases
endalign
where $B$, $Z$ and $(tildeX,tildeY)$ are independent and $BsimtextBern(varepsilon')$.
Note that
beginalign
H(X|B) leq H(X) leq H(B) + H(X|B) ;.
endalign
For $H(X|B)$ we can write
beginalign
H(X|B) &= varepsilon' H(X|B=1) + (1-varepsilon') H(X|B=0) \
&= varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;.
endalign
Thus,
beginalign
varepsilon' H(tildeX) + (1-varepsilon') H(Z) &leq H(X)
leq H(B) + varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;,
tag$clubsuit$
endalign
and similarly,
beginalign
varepsilon' H(tildeY) + (1-varepsilon') H(Z) &leq H(Y)
leq H(B) + varepsilon' H(tildeY) + (1-varepsilon') H(Z) ;.
tag$spadesuit$
endalign
Combining ($clubsuit$) and ($spadesuit$) we get
beginalign
|H(X)-H(Y)| &leq
H(B) + varepsilon' |H(tildeX) - H(tildeY)| \
&leq H(varepsilon') + varepsilon' log N \
&leq H(varepsilon) + varepsilon log N ;,
endalign
as claimed.
QED
Claim. If $|P-Q|leqvarepsilonleqfrac12$, then $|H(P)-H(Q)| leq H(varepsilon) + varepsilonlog N$.
Proof.
Let $varepsilon':=|P-Q|$.
Let $(X,Y)$ be an optimal coupling of $P$ and $Q$, so that
beginalign
mathbbP(Xneq Y) = |P-Q| ;.
endalign
Using a standard construction, we can assume that $X$ and $Y$ have the particular form
beginalign
X &:=
begincases
Z & textif $B=0$, \
tildeX & textif $B=1$,
endcases &
Y &:=
begincases
Z & textif $B=0$, \
tildeY & textif $B=1$,
endcases
endalign
where $B$, $Z$ and $(tildeX,tildeY)$ are independent and $BsimtextBern(varepsilon')$.
Note that
beginalign
H(X|B) leq H(X) leq H(B) + H(X|B) ;.
endalign
For $H(X|B)$ we can write
beginalign
H(X|B) &= varepsilon' H(X|B=1) + (1-varepsilon') H(X|B=0) \
&= varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;.
endalign
Thus,
beginalign
varepsilon' H(tildeX) + (1-varepsilon') H(Z) &leq H(X)
leq H(B) + varepsilon' H(tildeX) + (1-varepsilon') H(Z) ;,
tag$clubsuit$
endalign
and similarly,
beginalign
varepsilon' H(tildeY) + (1-varepsilon') H(Z) &leq H(Y)
leq H(B) + varepsilon' H(tildeY) + (1-varepsilon') H(Z) ;.
tag$spadesuit$
endalign
Combining ($clubsuit$) and ($spadesuit$) we get
beginalign
|H(X)-H(Y)| &leq
H(B) + varepsilon' |H(tildeX) - H(tildeY)| \
&leq H(varepsilon') + varepsilon' log N \
&leq H(varepsilon) + varepsilon log N ;,
endalign
as claimed.
QED
answered 33 mins ago
Algernon
8691712
8691712
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathoverflow.net%2fquestions%2f310689%2fentropy-and-total-variation-distance%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password