Lately, I have been working with Poisson Matrix Factorization models and at some point I needed to work out a lower bound for $\mathbb{E}\left[\log \sum_k X_k\right]$. After seeing some people use this lower bound without a good explanation, I decided to write this blog post. It is also included as an appendix to my ECML-PKDD 2017 paper about a Poisson factorization model for recommendation.
The function $\log(x)$ is concave, which means that for any $x_1, x_2 > 0$ and $p \in [0, 1]$:

$$\log\big(p\,x_1 + (1 - p)\,x_2\big) \geq p \log(x_1) + (1 - p) \log(x_2)$$
By induction this property can be generalized to any convex combination of $(x_1, \dots, x_K)$, with $p_k \geq 0$ and $\sum_{k=1}^{K} p_k = 1$:

$$\log\left(\sum_{k=1}^{K} p_k\, x_k\right) \geq \sum_{k=1}^{K} p_k \log(x_k)$$
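As a quick sanity check, here is a minimal numerical sketch of this inequality (the values and weights below are arbitrary; NumPy is only used to generate them):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.gamma(2.0, 1.0, size=5)   # arbitrary positive values x_1..x_K
p = rng.dirichlet(np.ones(5))     # arbitrary weights p_k >= 0 summing to 1

lhs = np.log(np.sum(p * x))       # log of the convex combination
rhs = np.sum(p * np.log(x))       # convex combination of the logs
assert lhs >= rhs                 # holds because log is concave
```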
Now, with $X_k$ a random variable, we can create a similar convex combination by multiplying and dividing each term by $p_k$, then apply the inequality above together with the linearity of expectation:

$$\mathbb{E}\left[\log \sum_{k=1}^{K} X_k\right] = \mathbb{E}\left[\log \sum_{k=1}^{K} p_k \frac{X_k}{p_k}\right] \geq \sum_{k=1}^{K} p_k \big(\mathbb{E}\left[\log X_k\right] - \log p_k\big)$$
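To see this bound in action, here is a small Monte Carlo sketch for independent Gamma variables, the case that matters in Poisson-Gamma factorization. The shapes and rates are made up for illustration, and I use the closed form $\mathbb{E}[\log X_k] = \psi(a_k) - \log b_k$ for $X_k \sim \mathrm{Gamma}(a_k, b_k)$ in the shape-rate parameterization:

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)
K = 4
a = rng.uniform(1.0, 3.0, K)            # Gamma shapes (illustrative)
b = rng.uniform(0.5, 2.0, K)            # Gamma rates (illustrative)

# Monte Carlo estimate of the left-hand side, E[log sum_k X_k]
X = rng.gamma(a, 1.0 / b, size=(1_000_000, K))
lhs = np.log(X.sum(axis=1)).mean()

e_log_x = digamma(a) - np.log(b)        # closed-form E[log X_k]

p = np.full(K, 1.0 / K)                 # any valid weights give a bound
rhs = np.sum(p * (e_log_x - np.log(p)))
assert lhs >= rhs
```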
If we want a tight lower bound, we can use Lagrange multipliers to choose the set of $p_k$ that maximizes the lower bound subject to the constraint that they sum to 1. The solution is a softmax of the expected logs:

$$p_k = \frac{\exp\big(\mathbb{E}\left[\log X_k\right]\big)}{\sum_{j=1}^{K} \exp\big(\mathbb{E}\left[\log X_j\right]\big)}$$
This final formula for $p_k$ is exactly the one we find for the parameters of the Multinomial distribution of the auxiliary variables in a Poisson model whose rate parameter is a sum of Gamma-distributed latent variables. Also, using this optimal $p_k$ we can show a tight bound without the auxiliary variables.
This results in:

$$\mathbb{E}\left[\log \sum_{k=1}^{K} X_k\right] \geq \log \sum_{k=1}^{K} \exp\big(\mathbb{E}\left[\log X_k\right]\big)$$
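Continuing the numerical sketch above, the optimal $p_k$ is just a softmax of the $\mathbb{E}[\log X_k]$ values, and plugging it back into the intermediate bound collapses it to the log-sum-exp form:

```python
from scipy.special import logsumexp

# Optimal weights: softmax of E[log X_k]
p_opt = np.exp(e_log_x - logsumexp(e_log_x))

# The intermediate bound evaluated at the optimal weights...
bound_opt = np.sum(p_opt * (e_log_x - np.log(p_opt)))

# ...equals log sum_k exp(E[log X_k]), and is still a lower bound
assert np.isclose(bound_opt, logsumexp(e_log_x))
assert lhs >= bound_opt
```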
Is this bound not obtainable just by using convexity of log sum exp and Jensen’s inequality?
Yup (you mean the final bound without the $p_k$, right?). The nice thing about the proof here is that the intermediate bound with the $p_k$ is useful for calculating the ELBO in Poisson-Gamma matrix/tensor factorization models once you augment the model with latent Poisson counts.
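For reference, a sketch of that shorter route: writing $X_k = e^{\log X_k}$ and using the convexity of log-sum-exp, Jensen's inequality gives the final bound in one step:

$$\mathbb{E}\left[\log \sum_{k} X_k\right] = \mathbb{E}\left[\log \sum_{k} e^{\log X_k}\right] \geq \log \sum_{k} e^{\mathbb{E}\left[\log X_k\right]}$$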
Using $p_k = 1/K$ also seems to work with Jensen's inequality on the summation.
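Yes: it gives the looser but simpler bound $\frac{1}{K}\sum_k \mathbb{E}[\log X_k] + \log K$. A quick check, continuing the numerical sketch from the post:

```python
# Uniform weights: mean of E[log X_k] plus log K
bound_unif = np.mean(e_log_x) + np.log(K)
assert np.isclose(bound_unif,
                  np.sum((1.0 / K) * (e_log_x - np.log(1.0 / K))))
assert bound_unif <= logsumexp(e_log_x)   # looser than the optimal weights
```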
There is a discussion of this particular bound, other similar bounds, and why it is sometimes useful to retain the auxiliary variables here: http://www.columbia.edu/~jwp2128/Teaching/E6720/Fall2016/papers/twobounds.pdf (What I was trying to accomplish with this post is just to point to that particular bound, since it is useful in Poisson-Gamma factorization, and maybe give a pedagogical derivation of it.)