Paper accepted at European Conference on Machine Learning (ECML-PKDD) 2017

We have a paper accepted at ECML-PKDD 2017: “Content-Based Social Recommendation with Poisson Matrix Factorization” (Eliezer de Souza da Silva, Helge Langseth and Heri Ramampiaro). This is our first full paper resulting from our research on Poisson factorization and the integration of multiple sources of information in a single recommendation model. If you are interested in the paper, please email me and I will be happy to discuss it.

Also, I am uploading the paper's supplementary material here (you can also find it on my publications page).

Supplementary material for: “Content-Based Social Recommendation with Poisson Matrix Factorization”


Hidden Markov Models (part II): forward-backward algorithm for marginal conditional probability of the states

(in the same series: HMM (part I): recurrence equations for filtering and prediction)

Consider a Hidden Markov Model (HMM) with hidden states x_t (for t \in \{1, 2, \cdots, T\}), initial probability p(x_1), observations y_t, transition probability p(x_t|x_{t-1}) and observation model p(y_t|x_t). The joint distribution of this model factorizes as
p(x_{1:T},y_{1:T}) = p(y_1|x_1)p(x_1)\prod_{t=2}^{T}p(y_t|x_t)p(x_t|x_{t-1}) . We will use the notation X=x_{1:T} to represent the set X=\{x_1,x_2,\cdots,x_T\}.
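As a concrete illustration, consider a discrete HMM with K hidden states and M observation symbols. The sketch below samples a trajectory from the factorization above; the names pi, A and B are our own placeholders, with pi[i] = p(x_1 = i), A[i, j] = p(x_t = j | x_{t-1} = i) and B[i, m] = p(y_t = m | x_t = i).

```python
import numpy as np

def sample_hmm(pi, A, B, T, rng):
    """Sample hidden states x_{1:T} and observations y_{1:T} from the model."""
    x = np.zeros(T, dtype=int)
    y = np.zeros(T, dtype=int)
    x[0] = rng.choice(len(pi), p=pi)                # x_1 ~ p(x_1)
    y[0] = rng.choice(B.shape[1], p=B[x[0]])        # y_1 ~ p(y_1 | x_1)
    for t in range(1, T):
        x[t] = rng.choice(A.shape[1], p=A[x[t-1]])  # x_t ~ p(x_t | x_{t-1})
        y[t] = rng.choice(B.shape[1], p=B[x[t]])    # y_t ~ p(y_t | x_t)
    return x, y
```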
In this post we will present the details of the method for finding the smoothing distribution p(x_t|y_{1:T}) of an HMM, given a set of observations y_{1:T}.
Our starting point is the marginal probability p(x_t|y_{1:T}) of x_t given all the observations y_{1:T}.

\begin{aligned} p(x_t|y_{1:T}) &= \frac{p(x_t,y_{1:T})}{p(y_{1:T})} \\ &= \frac{p(x_t,y_{1:t},y_{(t+1):T})}{p(y_{1:T})}\\ &= \underbrace{p(y_{(t+1):T}|x_t)}_{\beta_t(x_t)}\underbrace{p(x_t,y_{1:t})}_{\alpha_t(x_t)}\frac{1}{p(y_{1:T})} \\ &= \frac{\alpha_t(x_t) \beta_t(x_t)}{p(y_{1:T})} \end{aligned}

The third equality uses the conditional independence structure of the HMM: given x_t, the future observations y_{(t+1):T} are independent of the past observations y_{1:t}, so p(y_{(t+1):T}|x_t,y_{1:t}) = p(y_{(t+1):T}|x_t).
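To make the decomposition concrete, here is a minimal sketch of the resulting forward-backward recursions for a discrete HMM in NumPy, using the same illustrative parameterization pi, A, B as above. The forward pass accumulates \alpha_t and the backward pass accumulates \beta_t, and their pointwise product normalized by p(y_{1:T}) gives the smoothing marginals.

```python
import numpy as np

def forward_backward(pi, A, B, y):
    """Compute the smoothing marginals p(x_t | y_{1:T}) of a discrete HMM."""
    T, K = len(y), len(pi)

    # Forward pass: alpha[t, k] = p(x_t = k, y_{1:t})
    alpha = np.zeros((T, K))
    alpha[0] = pi * B[:, y[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]

    # Backward pass: beta[t, k] = p(y_{(t+1):T} | x_t = k), with beta[T-1] = 1
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, y[t + 1]] * beta[t + 1])

    # p(y_{1:T}) = sum_k alpha[T-1, k]; normalize the pointwise product
    return alpha * beta / alpha[-1].sum()

# Illustrative usage with the sampler defined earlier
rng = np.random.default_rng(0)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.9, 0.1], [0.3, 0.7]])
x, y = sample_hmm(pi, A, B, T=50, rng=rng)
gamma = forward_backward(pi, A, B, y)   # gamma[t] = p(x_t | y_{1:T})
```

Note that for long sequences one should rescale \alpha_t and \beta_t at every step (or work in log space), since the unnormalized quantities underflow quickly.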


Sampling from the Dirichlet distribution using Gamma-distributed samples

There is an algorithm that generates Dirichlet samples using any sampler for the Gamma distribution with shape \alpha > 0 and rate \beta > 0. We generate Gamma-distributed variables z_k \sim \text{Gamma}(\alpha_k,1), for k \in \{1,\cdots,d\}, and apply the variable transformation x_k = \frac{z_k}{\sum_j z_j} to obtain Dirichlet samples. First we demonstrate that this transformation indeed results in Dirichlet-distributed samples.

Consider the transformation (x_1,\cdots,x_{d-1},v) = h(z_1,\cdots,z_d), where x_k = \frac{z_k}{v} and v = \sum_k z_k (the last coordinate x_d is redundant, since \sum_{k=1}^d x_k = 1 implies x_d = 1-\sum_{k=1}^{d-1} x_k). We can immediately write down the inverse transformation (z_1,\cdots,z_d) = h^{-1}(x_1,\cdots,x_{d-1},v), with z_k = v x_k for k < d and z_d = v\left(1-\sum_{k=1}^{d-1} x_k\right).
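Concretely, the algorithm itself is only two lines of NumPy. This is a minimal sketch; the function name dirichlet_via_gamma is ours for illustration.

```python
import numpy as np

def dirichlet_via_gamma(alpha, rng):
    """Draw one sample x ~ Dirichlet(alpha) from independent Gamma draws."""
    z = rng.gamma(shape=alpha, scale=1.0)  # z_k ~ Gamma(alpha_k, 1), independently
    return z / z.sum()                     # x_k = z_k / v, with v = sum_k z_k

rng = np.random.default_rng(0)
x = dirichlet_via_gamma(np.array([2.0, 3.0, 5.0]), rng)
print(x, x.sum())  # components are non-negative and sum to 1
```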


Probabilistic models for Recommender systems (part I): Probabilistic Matrix Factorization

In Recommender Systems design we are faced with the following problem: given incomplete information about user preferences, item content, user-item ratings and contextual information, learn the user preferences and suggest new items to users based on features such as:

  • the user's previous pattern of item preferences;
  • the preferences of users with similar rating patterns;
  • contextual information: location, time (of day, week, month, year), social network.

This is usually formulated as a matrix completion problem, using Matrix Factorization techniques to obtain an optimal solution. In this case, latent features for users and items are inferred from the observed/rated items for each user, and from these latent features the missing entries are estimated. One major modelling tool for this problem is probabilistic modelling, and there are many proposals in the literature for different Probabilistic Matrix Factorization approaches. We will briefly discuss some of these models, starting with the seminal paper: Probabilistic Matrix Factorization (PMF) [Salakhutdinov and Mnih, 2008, NIPS].
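To make the setup concrete, here is a minimal sketch of MAP estimation for PMF in NumPy: with a Gaussian likelihood on the observed ratings and Gaussian priors on the latent factors, MAP estimation reduces to minimizing an L2-regularized squared error over the observed entries. All names and hyperparameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def pmf_map(R, observed, k=10, lam=0.1, lr=0.005, iters=500, seed=0):
    """MAP estimation for PMF by gradient ascent on the log-posterior.

    R        : (n_users, n_items) rating matrix (arbitrary values where unobserved)
    observed : (n_users, n_items) binary mask of observed entries
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))  # latent user features
    V = 0.1 * rng.standard_normal((n_items, k))  # latent item features
    for _ in range(iters):
        E = observed * (R - U @ V.T)             # residuals on observed entries only
        U += lr * (E @ V - lam * U)              # Gaussian prior gives the L2 term
        V += lr * (E.T @ U - lam * V)
    return U @ V.T                               # estimated complete rating matrix
```

Predicted ratings for the missing entries are then read off the reconstructed matrix U V^T.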
