A lower bound for expected value of log-sum

Lately, I have been working with Poisson Matrix Factorization models, and
at some point I needed to work out a lower bound for \text{E}_q[\log \sum_k X_k]. After seeing some people use this lower bound without a good explanation, I decided to write this blog post. It is also included as an appendix to my ECML-PKDD 2017 paper about a Poisson factorization model for recommendation.
The function \log(\cdot) is concave, which means that for positive x_1, x_2: \log(p_1 x_1+p_2 x_2) \geq p_1\log x_1+p_2 \log x_2, \forall p_1,p_2 \geq 0 : p_1+p_2=1
By induction this property can be generalized to any convex combination of the x_k (\sum_k p_k x_k with p_k \geq 0 and \sum_k p_k=1 ):

\log \sum_k p_k x_k \geq \sum_k p_k\log x_k
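The inequality above is easy to check numerically. The following is a minimal sketch using only the Python standard library; the function name `jensen_gap` and the randomly drawn weights and points are illustrative assumptions, not part of the original derivation.

```python
import math
import random

def jensen_gap(weights, xs):
    """Return log(sum_k p_k x_k) - sum_k p_k log(x_k).

    By concavity of log this is non-negative whenever the weights are
    non-negative, sum to 1, and every x_k is positive.
    """
    lhs = math.log(sum(p * x for p, x in zip(weights, xs)))
    rhs = sum(p * math.log(x) for p, x in zip(weights, xs))
    return lhs - rhs

rng = random.Random(0)

# Hypothetical example: 5 random positive weights, normalized to sum to 1.
raw = [rng.random() + 1e-3 for _ in range(5)]
weights = [r / sum(raw) for r in raw]

# Hypothetical positive points x_k.
xs = [rng.uniform(0.1, 10.0) for _ in range(5)]

gap = jensen_gap(weights, xs)
print("Jensen gap:", gap)
```

The gap shrinks to zero when all x_k are equal, which is the equality case of the inequality.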

Now, with random variables, we can create a similar convex combination by multiplying and dividing each random variable X_k by p_k > 0, and then apply the linearity of expectation:
 \text{E}_q[\log \sum_k X_k] = \text{E}_q[\log \sum_k p_k \frac{X_k}{p_k}]
\log \sum_k p_k\frac{X_k}{p_k} \geq \sum_k p_k\log \frac{X_k}{p_k}
\Rightarrow\text{E}_q [\log \sum_k p_k\frac{X_k}{p_k}] \geq \sum_k p_k \text{E}_q[\log \frac{X_k}{p_k}]
\Rightarrow \text{E}_q [\log \sum_k X_k ] \geq \sum_k \left( p_k \text{E}_q[\log X_k]- p_k\log p_k \right)
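As a sanity check, the bound can be estimated by Monte Carlo for an arbitrary choice of weights. This sketch assumes independent Gamma-distributed X_k (matching the Poisson-Gamma setting mentioned in this post); the shapes, scales, weights and the helper name `mc_bound_check` are made up for illustration.

```python
import math
import random

def mc_bound_check(shapes, scales, weights, n_samples=20000, seed=0):
    """Estimate both sides of
        E[log sum_k X_k] >= sum_k (p_k E[log X_k] - p_k log p_k)
    by Monte Carlo, with X_k ~ Gamma(shape_k, scale_k) drawn independently.
    """
    rng = random.Random(seed)
    n_vars = len(shapes)
    sum_log_total = 0.0
    sum_log_each = [0.0] * n_vars
    for _ in range(n_samples):
        # random.gammavariate takes (shape, scale).
        xs = [rng.gammavariate(a, s) for a, s in zip(shapes, scales)]
        sum_log_total += math.log(sum(xs))
        for k, x in enumerate(xs):
            sum_log_each[k] += math.log(x)
    lhs = sum_log_total / n_samples
    e_log = [v / n_samples for v in sum_log_each]
    rhs = sum(p * (e - math.log(p)) for p, e in zip(weights, e_log))
    return lhs, rhs

# Hypothetical parameters and an arbitrary weight vector summing to 1.
lhs, rhs = mc_bound_check(
    shapes=[2.0, 3.0, 1.5],
    scales=[1.0, 0.5, 2.0],
    weights=[0.2, 0.5, 0.3],
)
print("E[log sum X] estimate:", lhs)
print("lower bound estimate: ", rhs)
```

Because the inequality holds for every individual sample, it also holds for the sample averages, so the estimated left-hand side should not fall below the estimated bound.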

To get the tightest lower bound of this form, we can use a Lagrange multiplier to choose the set of p_k that maximizes the lower bound, subject to the constraint that they sum to 1.

L(p_1,\ldots,p_K,\lambda) = \sum_k \left( p_k \text{E}_q[\log X_k]- p_k\log p_k\right)+\lambda \left(1-\sum_k p_k\right)
 \frac{\partial L}{\partial p_k} =\text{E}_q[\log X_k]-\log p_k-1-\lambda = 0
\frac{\partial L}{\partial \lambda} =1-\sum_k p_k = 0
 \Rightarrow \sum_k p_k = 1
 \Rightarrow\text{E}_q[\log X_k]=\log p_k+1+\lambda
\Rightarrow \exp\text{E}_q[\log X_k]=p_k \exp(1+\lambda)
\Rightarrow \sum_k \exp\text{E}_q[\log X_k]=\exp(1+\lambda)\underbrace{\sum_k p_k}_{=1}
\Rightarrow p_k=\frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}}
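The optimal p_k is a softmax of the values \text{E}_q[\log X_k]. The sketch below computes it for a made-up vector of expected logs (the numbers in `e_log` are hypothetical) and compares the resulting bound with the one obtained from uniform weights. Subtracting the maximum before exponentiating is a numerical-stability choice added here, not something the derivation requires.

```python
import math

def optimal_weights(e_log):
    """Return p_k = exp(E[log X_k]) / sum_j exp(E[log X_j])."""
    m = max(e_log)  # subtract the max to avoid overflow in exp
    unnorm = [math.exp(e - m) for e in e_log]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def bound(weights, e_log):
    """Return sum_k (p_k E[log X_k] - p_k log p_k) for positive weights."""
    return sum(p * (e - math.log(p)) for p, e in zip(weights, e_log))

# Hypothetical values of E_q[log X_k] for K = 3.
e_log = [-0.3, 1.2, 0.5]

p_star = optimal_weights(e_log)
uniform = [1.0 / len(e_log)] * len(e_log)

print("optimal weights:        ", p_star)
print("bound at optimal weights:", bound(p_star, e_log))
print("bound at uniform weights:", bound(uniform, e_log))
```

Any other valid weight vector can be passed to `bound` in the same way; it should not exceed the value obtained with `p_star`.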

The final formula for p_k is exactly the one we find for the parameters of the Multinomial distribution of the auxiliary variables in a Poisson model whose rate parameter is a sum of Gamma-distributed latent variables. Using this optimal p_k, we can also write the resulting bound without the auxiliary variables.

\text{E}_q [\log \sum_k X_k ] \geq \sum_k \left( \frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}}\text{E}_q[\log X_k]- \frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}}\log \frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}} \right)
= \sum_k \frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}} \log \sum_j \exp \{\text{E}_q[\log X_j]\}
= \log \sum_j \exp \{\text{E}_q[\log X_j]\} \underbrace{ \sum_k \frac{\exp \{\text{E}_q[\log X_k]\}}{\sum_j \exp \{\text{E}_q[\log X_j]\}} }_{=1}
This results in:
\text{E}_q [\log \sum_k X_k ] \geq \log \sum_k \exp \{\text{E}_q[\log X_k]\}
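The closed-form bound can be compared against a Monte Carlo estimate of the left-hand side. This is a rough sketch assuming independent Gamma-distributed X_k with made-up shapes and scales; `log_sum_exp` and `estimate_final_bound` are illustrative helper names, and \text{E}_q[\log X_k] is estimated from samples rather than computed analytically.

```python
import math
import random

def log_sum_exp(values):
    """Numerically stable log(sum_k exp(v_k))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def estimate_final_bound(shapes, scales, n_samples=20000, seed=1):
    """Estimate E[log sum_k X_k] and log sum_k exp(E[log X_k]) from samples
    of independent X_k ~ Gamma(shape_k, scale_k)."""
    rng = random.Random(seed)
    total = 0.0
    per_k = [0.0] * len(shapes)
    for _ in range(n_samples):
        xs = [rng.gammavariate(a, s) for a, s in zip(shapes, scales)]
        total += math.log(sum(xs))
        for k, x in enumerate(xs):
            per_k[k] += math.log(x)
    expected_log_sum = total / n_samples
    e_log = [v / n_samples for v in per_k]
    return expected_log_sum, log_sum_exp(e_log)

# Hypothetical Gamma parameters.
expected_log_sum, lower = estimate_final_bound([2.0, 3.0, 1.5], [1.0, 0.5, 2.0])
print("E[log sum X] estimate:   ", expected_log_sum)
print("log sum exp E[log X] est:", lower)
print("gap:                     ", expected_log_sum - lower)
```

The printed gap gives a feel for how loose the bound is for a particular choice of distributions; it depends on how spread out each X_k is.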
