Sampling from Dirichlet Distribution using Gamma distributed samples

There is an algorithm to generate Dirichlet samples using a sampler for Gamma distribution for any \alpha > 0 and \beta > 0. We will generate Gamma distributed variables z_k \sim \text{gamma}(\alpha_k,1), for k \in {1,\cdots,d}, and do the following variable transformation to get Dirichlet samples x_k = \frac{z_k}{\sum_k z_k}. First we should demonstrate that this transformation results in Dirichlet distributed samples.

Consider the following tranformation (z_1,\cdots,z_d) \leftarrow (x_1,\cdots,x_d,v), where x_k = \frac{z_k}{\sum_k z_k} and v = {\sum_k z_k}. We can rewrite this transformation as (x_1,\cdots,x_d,v)=h(z_1,\cdots,z_d), where x_k = \frac{z_k}{v} and v = {\sum_k z_k}. Also we can imediatly calculate the inverse transformation (z_1,\cdots,z_d)=h^{-1}(x_1,\cdots,x_d,v), with z_k=v x_k. From the transformation definition we know that {\sum_{k=1}^d x_k=1}, implying that x_d = 1-\sum_{k=1}^{d-1} x_k and z_d=v(1-\sum_{k=1}^{d-1}x_k).

Continue reading “Sampling from Dirichlet Distribution using Gamma distributed samples”

Probabilistic models for Recommender systems (part I): Probabilistic Matrix Factorization

In Recommender Systems design we are faced with the following problem: given incomplete information about users preference, content information, user-items rating and  contextual information, learn the user preference and suggest new items for users based on features as:

  • previous pattern of items preference of the user;
  • preference of users with similar rating pattern;
  • contextual information: location, time (of the day, week, month, year), social network.

This is usually formulated as a matrix completion problem using Matrix Factorization techniques to offer a optimal solution. Is this case latent features for users and item are inferred from the observed/rated items for each user, and from this latent features the missing entries are estimated. One major modelling tool for this problem is probabilistic modelling, and there are many proposals in the literature of different Probabillistic Matrix Factorization approaches. We will briefly discuss some of this models, starting with the seminal paper: Probabilistic Matrix Factorization (PMF) – [Salakhutdinov and Mnih, 2008, NIPS].

Continue reading “Probabilistic models for Recommender systems (part I): Probabilistic Matrix Factorization”

Hidden Markov Models (part I): recurrence equations for filtering and prediction

This semester I will be attending the doctoral course MA8702 – Advanced Modern Statistical Methods  with the excellent Prof. Håvard Rue. It will be course about statistical models defined over sparse structures (chains and graphs). We will start with Hidden Markov Chains and after go to Gaussian Markov Random Fields, Latent Gaussian Models and approximate inference with Integrated Nested Laplace Approximation (INLA). All this models are interesting for my research objective of developing sound latent models for recommender systems and I am really happy of taking this course with this great teacher and researcher. So, I will try to cover some of the material of the course, starting from what we saw in the first lecture: exact recurrence for Hidden Markov Chains and dynamic programming. In other words, general equations for predictions, filtering, smoothing, sampling, mode and marginal likelihood calculation of state-space model with latent variables. We will start by introduction the general model and specifying how to obtain the prediction and filtering equation.

Screenshot from 2016-01-18 03:09:23

  • Markovian property: \pi(x_1,x_2,\cdots,x_T)=\pi(x_{1:T})=\prod_{t=1}^{t=T}\pi(x_t|x_{t-1}) , with \pi(x_1|x_0)=\pi(x_1)
  • y_t are observed and x_t are latent, so \pi(y_t|x_t) is always known.
  • If we know x_{t-1} than no other variable will add any information to the conditional distribution of x_t.

Continue reading “Hidden Markov Models (part I): recurrence equations for filtering and prediction”