Brascamp–Lieb inequality

In mathematics, the Brascamp–Lieb inequality is either of two inequalities. The first is a result in geometry concerning integrable functions on n-dimensional Euclidean space {\displaystyle \mathbb {R} ^{n}}. It generalizes the Loomis–Whitney inequality and Hölder's inequality. The second is a result of probability theory which gives a concentration inequality for log-concave probability distributions. Both are named after Herm Jan Brascamp and Elliott H. Lieb.

The geometric inequality

Fix natural numbers m and n. For 1 ≤ i ≤ m, let ni ∈ N and let ci > 0 so that

{\displaystyle \sum _{i=1}^{m}c_{i}n_{i}=n.}

Choose non-negative, integrable functions

{\displaystyle f_{i}\in L^{1}\left(\mathbb {R} ^{n_{i}};[0,+\infty ]\right)}

and surjective linear maps

{\displaystyle B_{i}:\mathbb {R} ^{n}\to \mathbb {R} ^{n_{i}}.}

Then the following inequality holds:

{\displaystyle \int _{\mathbb {R} ^{n}}\prod _{i=1}^{m}f_{i}\left(B_{i}x\right)^{c_{i}}\,\mathrm {d} x\leq D^{-1/2}\prod _{i=1}^{m}\left(\int _{\mathbb {R} ^{n_{i}}}f_{i}(y)\,\mathrm {d} y\right)^{c_{i}},}

where D is given by

{\displaystyle D=\inf \left\{\left.{\frac {\det \left(\sum _{i=1}^{m}c_{i}B_{i}^{*}A_{i}B_{i}\right)}{\prod _{i=1}^{m}(\det A_{i})^{c_{i}}}}\right|A_{i}{\text{ is a positive-definite }}n_{i}\times n_{i}{\text{ matrix}}\right\}.}

Another way to state this is that the constant D is what one would obtain by restricting attention to the case in which each {\displaystyle f_{i}} is a centered Gaussian function, namely {\displaystyle f_{i}(y)=\exp\{-(y,\,A_{i}\,y)\}}.[1]
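
As a concrete illustration, the infimum defining D can be approximated numerically by minimizing the determinant ratio over the Gaussian parameters A_i. The following Python sketch (assuming NumPy and SciPy are available) does this for an illustrative rank-one datum not taken from the text, namely B_i x = u_i · x for three unit vectors at 120 degrees with weights c_i = 2/3; the minimum comes out close to 1, matching the geometric case discussed further below.

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative datum (not from the text): n = 2, m = 3, B_i x = u_i . x with three unit
    # vectors at 120 degrees and weights c_i = 2/3, so that sum_i c_i n_i = 2 = n.
    angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    U = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # rows are the u_i
    c = np.full(3, 2.0 / 3.0)

    def ratio(log_a):
        # det(sum_i c_i B_i^* A_i B_i) / prod_i (det A_i)^{c_i} with 1x1 matrices A_i = a_i > 0
        a = np.exp(log_a)                                    # parametrize a_i > 0
        M = sum(c[i] * a[i] * np.outer(U[i], U[i]) for i in range(3))
        return np.linalg.det(M) / np.prod(a ** c)

    result = minimize(ratio, x0=np.array([1.0, -0.5, 0.3]))  # numerical infimum over the a_i
    print(result.fun)                                        # close to 1, the value of D here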

Alternative forms

Consider a probability density function {\displaystyle p(x)=\exp(-\phi (x))}. This probability density function is said to be a log-concave measure if the function {\displaystyle \phi (x)} is convex. Such probability density functions have tails which decay exponentially fast, so most of the probability mass resides in a small region around the mode of {\displaystyle p(x)}. The Brascamp–Lieb inequality gives another characterization of the concentration of {\displaystyle p(x)} by bounding the variance of any statistic {\displaystyle S(x)}.

Formally, let {\displaystyle S(x)} be any differentiable function. The Brascamp–Lieb inequality reads:

{\displaystyle \operatorname {var} _{p}(S(x))\leq E_{p}(\nabla ^{T}S(x)[H\phi (x)]^{-1}\nabla S(x))}

where {\displaystyle H\phi (x)} is the Hessian of {\displaystyle \phi (x)} and {\displaystyle \nabla } is the gradient (nabla) operator.[2]
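
As a quick sanity check of the variance bound, one can compare the two sides numerically for a simple one-dimensional log-concave density. The sketch below (a minimal example of my own, assuming NumPy) uses phi(x) = x^4/4 + x^2/2 and the statistic S(x) = x^2, evaluating both sides by integration on a grid.

    import numpy as np

    # Illustrative 1-D example: phi(x) = x^4/4 + x^2/2 is convex, so p(x) ~ exp(-phi(x))
    # is log-concave; take the statistic S(x) = x^2.
    x = np.linspace(-6.0, 6.0, 200001)
    dx = x[1] - x[0]
    integrate = lambda f: np.sum(f) * dx    # simple Riemann sum on the grid

    phi = x**4 / 4 + x**2 / 2
    p = np.exp(-phi)
    p /= integrate(p)                       # normalize the density

    S = x**2
    dS = 2 * x                              # S'(x)
    Hphi = 3 * x**2 + 1                     # phi''(x) > 0, the (1x1) Hessian

    var_S = integrate((S - integrate(S * p))**2 * p)    # var_p(S(x))
    bound = integrate(dS**2 / Hphi * p)                 # E_p[ S'(x)^2 / phi''(x) ]
    print(var_S, bound)                     # numerically, var_S <= bound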

BCCT inequality

The inequality was generalized in 2008[3] to account for both continuous and discrete cases, and for all linear maps, with precise estimates on the constant.

Definition: the Brascamp–Lieb datum (BL datum)

  • {\displaystyle d,n\geq 1}.
  • {\displaystyle d_{1},...,d_{n}\in \{1,2,...,d\}}.
  • {\displaystyle p_{1},...,p_{n}\in [0,\infty )}.
  • {\displaystyle B_{i}:\mathbb {R} ^{d}\to \mathbb {R} ^{d_{i}}} are linear surjections, with zero common kernel: {\displaystyle \cap _{i}\ker(B_{i})=\{0\}}.
  • Call {\displaystyle (B,p)=(B_{1},...,B_{n},p_{1},...,p_{n})} a Brascamp–Lieb datum (BL datum).

For any {\displaystyle f_{i}\in L^{1}(\mathbb {R} ^{d_{i}})} with {\displaystyle f_{i}\geq 0}, define

{\displaystyle BL(B,p,f):={\frac {\int _{\mathbb {R} ^{d}}\prod _{j=1}^{n}\left(f_{j}\circ B_{j}\right)^{p_{j}}}{\prod _{j=1}^{n}\left(\int _{\mathbb {R} ^{d_{j}}}f_{j}\right)^{p_{j}}}}}


Now define the Brascamp–Lieb constant for the BL datum:

{\displaystyle BL(B,p)=\sup _{f}BL(B,p,f)}

Theorem — (BCCT, 2008)

{\displaystyle BL(B,p)} is finite if and only if {\displaystyle d=\sum _{i}p_{i}d_{i}} and, for every subspace {\displaystyle V} of {\displaystyle \mathbb {R} ^{d}},

{\displaystyle \dim(V)\leq \sum _{i}p_{i}\dim(B_{i}(V))}

{\displaystyle BL(B,p)} is attained by Gaussians:

  • If {\displaystyle BL(B,p)} is finite, then there exist linear operators {\displaystyle A_{i}:\mathbb {R} ^{d_{i}}\to \mathbb {R} ^{d_{i}}} such that {\displaystyle f_{i}=e^{-\langle A_{i}x,x\rangle }} achieves the upper bound.
  • If {\displaystyle BL(B,p)} is infinite, then there exists a sequence of Gaussians for which

{\displaystyle {\frac {\int _{\mathbb {R} ^{d}}\prod _{j=1}^{n}\left(f_{j}\circ B_{j}\right)^{p_{j}}}{\prod _{j=1}^{n}\left(\int _{\mathbb {R} ^{d_{j}}}f_{j}\right)^{p_{j}}}}\to \infty }
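
Because Gaussian inputs suffice to compute the constant, the ratio BL(B, p, f) can be evaluated in closed form for centered Gaussians f_j using the standard Gaussian integral. The following Python sketch (assuming NumPy; the Loomis–Whitney-type datum and the helper name gaussian_bl_ratio are my own illustrative choices) evaluates the ratio for the coordinate projections of R^2 with p_1 = p_2 = 1, where it is identically 1.

    import numpy as np

    # Closed-form evaluation of BL(B, p, f) when each f_j(y) = exp(-<A_j y, y>) is a centered
    # Gaussian, using int_{R^k} exp(-<M x, x>) dx = pi^{k/2} / sqrt(det M) for M positive definite.
    def gaussian_bl_ratio(B_list, p_list, A_list):
        d = B_list[0].shape[1]
        M = sum(p * B.T @ A @ B for B, p, A in zip(B_list, p_list, A_list))
        numerator = np.pi ** (d / 2) / np.sqrt(np.linalg.det(M))
        denominator = 1.0
        for B, p, A in zip(B_list, p_list, A_list):
            dj = B.shape[0]
            denominator *= (np.pi ** (dj / 2) / np.sqrt(np.linalg.det(A))) ** p
        return numerator / denominator

    # Loomis-Whitney-type datum in R^2: coordinate projections with p_1 = p_2 = 1.
    B1, B2 = np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])
    rng = np.random.default_rng(0)
    for _ in range(3):
        A1 = np.array([[rng.uniform(0.1, 5.0)]])
        A2 = np.array([[rng.uniform(0.1, 5.0)]])
        # equals 1 for every Gaussian choice; by the theorem above, BL(B, p) = 1 for this datum
        print(gaussian_bl_ratio([B1, B2], [1.0, 1.0], [A1, A2]))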

Discrete case

Setup:

  • Group homomorphisms {\displaystyle \phi _{j}:G\to G_{j}}.
  • The BL datum is defined as {\displaystyle (G,G_{1},...,G_{n},\phi _{1},...,\phi _{n})}.
  • {\displaystyle T(G)} is the torsion subgroup of {\displaystyle G}, that is, the subgroup of finite-order elements.

With this setup, we have (Theorem 2.4,[4] Theorem 3.12 [5])

Theorem — If there exist {\displaystyle s_{1},...,s_{n}\in [0,1]} such that

{\displaystyle \operatorname {rank} (H)\leq \sum _{j}s_{j}\operatorname {rank} (\phi _{j}(H))\quad \forall H\leq G}

then for all {\displaystyle 0\leq f_{j}\in \ell ^{1/s_{j}}(G_{j})},

{\displaystyle \left\|\prod _{j}f_{j}\circ \phi _{j}\right\|_{1}\leq |T(G)|\prod _{j}\|f_{j}\|_{1/s_{j}}}
and in particular,

{\displaystyle |E|\leq |T(G)|\prod _{j}|\phi _{j}(E)|^{s_{j}}\quad \forall E\subset G}

Note that the constant {\displaystyle |T(G)|} is not always tight.
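
When G is torsion-free the constant is 1, and the inequality for sets reduces to a discrete Loomis–Whitney bound. A minimal check of my own in Python: take G = Z^2, the coordinate projections phi_1, phi_2, and s_1 = s_2 = 1 (which satisfy the rank condition), and compare |E| with |phi_1(E)| · |phi_2(E)| for a random finite set E.

    import random

    # G = Z^2 is torsion-free, so |T(G)| = 1; phi_1, phi_2 are the coordinate projections and
    # s_1 = s_2 = 1 satisfy rank(H) <= rank(phi_1(H)) + rank(phi_2(H)) for every subgroup H.
    random.seed(0)
    E = {(random.randint(-5, 5), random.randint(-5, 5)) for _ in range(60)}
    proj1 = {x for (x, y) in E}
    proj2 = {y for (x, y) in E}
    print(len(E), "<=", len(proj1) * len(proj2))   # |E| <= |phi_1(E)| * |phi_2(E)|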

BL polytope

Given a BL datum {\displaystyle (B,p)}, the conditions for {\displaystyle BL(B,p)<\infty } are

  • {\displaystyle d=\sum _{i}p_{i}d_{i}}, and
  • for every subspace {\displaystyle V} of {\displaystyle \mathbb {R} ^{d}},
    {\displaystyle \dim(V)\leq \sum _{i}p_{i}\dim(B_{i}(V))}

Thus, the subset of {\displaystyle p\in [0,\infty )^{n}} that satisfies the above two conditions is a closed convex polytope defined by linear inequalities. This is the BL polytope.

Note that while there are infinitely many possible choices of subspace {\displaystyle V} of {\displaystyle \mathbb {R} ^{d}}, there are only finitely many distinct inequalities of the form {\displaystyle \dim(V)\leq \sum _{i}p_{i}\dim(B_{i}(V))}, so the subset is a closed convex polytope.
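
As a small worked example (my own, not from the text), take d = 2 with the three rank-one maps B_1(x, y) = x, B_2(x, y) = y and B_3(x, y) = x + y. In this particular example the only subspaces that produce non-trivial constraints are the three kernels, so membership of p in the BL polytope can be tested with a handful of linear conditions; a Python sketch:

    # Illustrative datum: d = 2, three rank-one maps with d_1 = d_2 = d_3 = 1.
    # Only the three kernels give non-trivial constraints here, so the BL polytope
    # is cut out by finitely many linear conditions, as noted above.
    def in_bl_polytope(p, tol=1e-9):
        p1, p2, p3 = p
        if abs(p1 + p2 + p3 - 2.0) > tol:          # scaling condition: d = sum_i p_i d_i
            return False
        # dim(V) <= sum_i p_i dim(B_i(V)) for V = ker(B_1), ker(B_2), ker(B_3)
        return (p2 + p3 >= 1 - tol and
                p1 + p3 >= 1 - tol and
                p1 + p2 >= 1 - tol and
                min(p) >= -tol)

    print(in_bl_polytope((2/3, 2/3, 2/3)))    # True: the "geometric" weights lie in the polytope
    print(in_bl_polytope((1.5, 0.25, 0.25)))  # False: the constraint p2 + p3 >= 1 fails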

Similarly we can define the BL polytope for the discrete case.

Relationships to other inequalities

The geometric Brascamp–Lieb inequality

The geometric Brascamp–Lieb inequality, first derived in 1976,[6] is a special case of the general inequality. It was used by Keith Ball, in 1989, to provide upper bounds for volumes of central sections of cubes.[7]

For i = 1, ..., m, let ci > 0 and let ui ∈ Sn−1 be a unit vector; suppose that ci and ui satisfy

{\displaystyle x=\sum _{i=1}^{m}c_{i}(x\cdot u_{i})u_{i}}

for all x in Rn. Let fi ∈ L1(R; [0, +∞]) for each i = 1, ..., m. Then

{\displaystyle \int _{\mathbb {R} ^{n}}\prod _{i=1}^{m}f_{i}(x\cdot u_{i})^{c_{i}}\,\mathrm {d} x\leq \prod _{i=1}^{m}\left(\int _{\mathbb {R} }f_{i}(y)\,\mathrm {d} y\right)^{c_{i}}.}

The geometric Brascamp–Lieb inequality follows from the Brascamp–Lieb inequality as stated above by taking ni = 1 and Bi(x) = x · ui. Then, for zi ∈ R,

{\displaystyle B_{i}^{*}(z_{i})=z_{i}u_{i}.}

It follows that D = 1 in this case.
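
To see the geometric inequality in action numerically, the sketch below (an illustrative example of my own, assuming NumPy) uses the three unit vectors at 120 degrees in R^2 with weights c_i = 2/3, first checking that sum_i c_i u_i u_i^T is the identity, so that x = sum_i c_i (x · u_i) u_i, and then comparing both sides of the inequality on a grid for f_i(t) = exp(-|t|).

    import numpy as np

    # Three unit vectors at 120 degrees with c_i = 2/3: sum_i c_i u_i u_i^T = I, so the
    # decomposition hypothesis x = sum_i c_i (x . u_i) u_i holds.
    angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    U = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    c = np.full(3, 2.0 / 3.0)
    print(sum(c[i] * np.outer(U[i], U[i]) for i in range(3)))   # approximately the 2x2 identity

    # Compare both sides of the inequality on a grid for f_i(t) = exp(-|t|), whose integral is 2.
    t = np.linspace(-20.0, 20.0, 1201)
    dx = t[1] - t[0]
    X, Y = np.meshgrid(t, t, indexing="ij")
    integrand = np.ones_like(X)
    for i in range(3):
        integrand *= np.exp(-np.abs(X * U[i, 0] + Y * U[i, 1])) ** c[i]
    lhs = integrand.sum() * dx * dx                   # int_{R^2} prod_i f_i(x . u_i)^{c_i} dx
    rhs = np.prod(np.full(3, 2.0) ** c)               # prod_i (int f_i)^{c_i} = 2^2 = 4
    print(lhs, "<=", rhs)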

Hölder's inequality

Take ni = n, Bi = id, the identity map on {\displaystyle \mathbb {R} ^{n}}, replace fi by {\displaystyle f_{i}^{1/c_{i}}}, and let ci = 1/pi for 1 ≤ i ≤ m. Then

{\displaystyle \sum _{i=1}^{m}{\frac {1}{p_{i}}}=1}

and the log-concavity of the determinant of a positive-definite matrix implies that D = 1. This yields Hölder's inequality in {\displaystyle \mathbb {R} ^{n}}:

{\displaystyle \int _{\mathbb {R} ^{n}}\prod _{i=1}^{m}f_{i}(x)\,\mathrm {d} x\leq \prod _{i=1}^{m}\|f_{i}\|_{p_{i}}.}
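
The determinant step can be checked numerically: since log det is concave on the cone of positive-definite matrices, sum_i c_i = 1 gives det(sum_i c_i A_i) >= prod_i (det A_i)^{c_i}, with equality when all A_i coincide, which forces D = 1 in this case. A small Python sketch of my own (assuming NumPy):

    import numpy as np

    # log det is concave on positive-definite matrices, so with sum_i c_i = 1 we get
    # det(sum_i c_i A_i) >= prod_i (det A_i)^{c_i}; equality when all A_i are equal, hence D = 1.
    rng = np.random.default_rng(0)
    n, m = 4, 3
    c = rng.random(m)
    c /= c.sum()                                          # c_i > 0 with sum_i c_i = 1
    A = []
    for _ in range(m):
        G = rng.standard_normal((n, n))
        A.append(G @ G.T + np.eye(n))                     # random positive-definite matrix
    lhs = np.linalg.det(sum(ci * Ai for ci, Ai in zip(c, A)))
    rhs = np.prod([np.linalg.det(Ai) ** ci for ci, Ai in zip(c, A)])
    print(lhs, ">=", rhs)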

Poincaré inequality

The Brascamp–Lieb inequality, in the alternative form above, is an extension of the Poincaré inequality, which only concerns Gaussian probability distributions.[8]

Cramér–Rao bound

The Brascamp–Lieb inequality is also related to the Cramér–Rao bound.[8] While Brascamp–Lieb gives an upper bound, the Cramér–Rao bound lower-bounds the variance {\displaystyle \operatorname {var} _{p}(S(x))}. The Cramér–Rao bound states

{\displaystyle \operatorname {var} _{p}(S(x))\geq E_{p}(\nabla ^{T}S(x))[E_{p}(H\phi (x))]^{-1}E_{p}(\nabla S(x)),}

which is very similar to the Brascamp–Lieb inequality in the alternative form shown above.
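
For a concrete comparison (my own one-dimensional example, assuming NumPy), both bounds can be evaluated on a grid for p(x) proportional to exp(-phi(x)) with phi(x) = x^4/4 + x^2/2 and the linear statistic S(x) = x; the Cramér–Rao value sits below the variance and the Brascamp–Lieb value above it.

    import numpy as np

    # p(x) ~ exp(-phi(x)) with phi(x) = x^4/4 + x^2/2 (log-concave), statistic S(x) = x.
    x = np.linspace(-6.0, 6.0, 200001)
    dx = x[1] - x[0]
    integrate = lambda f: np.sum(f) * dx                  # simple Riemann sum

    phi = x**4 / 4 + x**2 / 2
    p = np.exp(-phi)
    p /= integrate(p)
    Hphi = 3 * x**2 + 1                                   # phi''(x), the 1x1 Hessian

    S, dS = x, np.ones_like(x)                            # S(x) = x, S'(x) = 1
    var_S = integrate((S - integrate(S * p)) ** 2 * p)
    cramer_rao = integrate(dS * p) ** 2 / integrate(Hphi * p)    # lower bound on the variance
    brascamp_lieb = integrate(dS ** 2 / Hphi * p)                # upper bound on the variance
    print(cramer_rao, "<=", var_S, "<=", brascamp_lieb)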

References

  1. ^ This inequality is in Lieb, Elliott H. (1990). "Gaussian Kernels have only Gaussian Maximizers". Inventiones Mathematicae. 102: 179–208. Bibcode:1990InMat.102..179L. doi:10.1007/bf01233426.
  2. ^ This theorem was originally derived in Brascamp, Herm J.; Lieb, Elliott H. (1976). "On Extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation". Journal of Functional Analysis. 22 (4): 366–389. doi:10.1016/0022-1236(76)90004-5. Extensions of the inequality can be found in Hargé, Gilles (2008). "Reinforcement of an Inequality due to Brascamp and Lieb". Journal of Functional Analysis. 254 (2): 267–300. doi:10.1016/j.jfa.2007.07.019. and Carlen, Eric A.; Cordero-Erausquin, Dario; Lieb, Elliott H. (2013). "Asymmetric Covariance Estimates of Brascamp-Lieb Type and Related Inequalities for Log-concave Measures". Annales de l'Institut Henri Poincaré B. 49 (1): 1–12. arXiv:1106.0709. Bibcode:2013AIHPB..49....1C. doi:10.1214/11-aihp462.
  3. ^ Bennett, Jonathan; Carbery, Anthony; Christ, Michael; Tao, Terence (2008-01-01). "The Brascamp–Lieb Inequalities: Finiteness, Structure and Extremals". Geometric and Functional Analysis. 17 (5): 1343–1415. doi:10.1007/s00039-007-0619-6. hdl:20.500.11820/b13abfca-453c-4aea-adf6-d7d421cec7a4. ISSN 1420-8970. S2CID 10193995.
  4. ^ Bennett, Jonathan; Carbery, Anthony; Christ, Michael; Tao, Terence (2005-05-31). "Finite bounds for Holder-Brascamp-Lieb multilinear inequalities". arXiv:math/0505691.
  5. ^ Christ, Michael; Demmel, James; Knight, Nicholas; Scanlon, Thomas; Yelick, Katherine (2013-07-31). "Communication lower bounds and optimal algorithms for programs that reference arrays -- Part 1". arXiv:1308.0068 [math.CA].
  6. ^ This was derived first in Brascamp, H. J.; Lieb, E. H. (1976). "Best Constants in Young's Inequality, Its Converse and Its Generalization to More Than Three Functions". Advances in Mathematics. 20 (2): 151–172. doi:10.1016/0001-8708(76)90184-5.
  7. ^ Ball, Keith M. (1989). "Volumes of Sections of Cubes and Related Problems". In Lindenstrauss, Joram; Milman, Vitali D. (eds.). Geometric Aspects of Functional Analysis. Lecture Notes in Mathematics. Vol. 1376. Berlin: Springer. pp. 251–260. doi:10.1007/BFb0090058. ISBN 978-3-540-51303-2.
  8. ^ a b Saumard, Adrien; Wellner, Jon A. (2014). "Log-concavity and strong log-concavity: a review". Statistics Surveys. 8: 45–114. doi:10.1214/14-SS107. PMC 4847755. PMID 27134693.