## Paper accepted at European Conference on Machine Learning (ECML-PKDD) 2017

We have a paper accepted at ECML-PKDD 2017: “Content-Based Social Recommendation with Poisson Matrix Factorization” (Eliezer de Souza da Silva, Helge Langseth and Heri Ramampiaro). This is our first full paper resulting from our research on Poisson factorization and integration of multiple sources of information in a single recommendation model. If you have interest on the paper please email me and I will be happy to discuss.

Also, I am uploading the supplement of the paper here (you can find it also on my publications page)

Supplementary material for: “Content-Based Social
Recommendation with Poisson Matrix Factorization”

## Hidden Markov Models (part II): forward-backward algorithm for marginal conditional probability of the states

(in the same series HMM (part I): recurrence equations for filtering and prediction)

Consider a Hidden Markov Model (HMM) with hidden states $x_t$ (for $t \in {1, 2, \cdots, T}$), initial probability $p(x_1)$, observed states $y_t$, transition probability $p(x_t|x_{t-1})$ and observation model $p(y_t|x_t)$. This model can be factorized as
$p(x_{1:T},y_{1:T}) = p(y_1|x_1)p(x_1)\prod_{t=2}^{t=T}p(y_t|x_t)p(x_t|x_{t-1})$. We will use the notation $X=x_{1:T}$ to represent the set $X=\{x_1,x_2,\cdots,x_T\}$.
In this post we will present the details of the method to find the smoothing distribution $p(x_t|y_{1:T})$ of a HMM, given a set of observations $y_{1:T}$:
Our starting point is the marginal probability $p(x_t|y_{1:T})$ of $x_t$ given all the observations $y_{1:T}$.

\begin{aligned} p(x_t|y_{1:T}) &= \frac{p(x_t,y_{1:T})}{p(y_{1:T})} \\ &= \frac{p(x_t,y_{1:t},y_{(t+1):T})}{p(y_{1:T})}\\ &= \underbrace{p(y_{(t+1):T}|x_t)}_{\beta_t(x_t)}\underbrace{p(x_t,y_{1:t})}_{\alpha_t(x_t)}\frac{1}{p(y_{1:T})} \\ &= \frac{\alpha_t(x_t) \beta_t(x_t)}{p(y_{1:T})} \end{aligned}

## Sampling from Dirichlet Distribution using Gamma distributed samples

There is an algorithm to generate Dirichlet samples using a sampler for Gamma distribution for any $\alpha > 0$ and $\beta > 0$. We will generate Gamma distributed variables $z_k \sim \text{gamma}(\alpha_k,1)$, for $k \in {1,\cdots,d}$, and do the following variable transformation to get Dirichlet samples $x_k = \frac{z_k}{\sum_k z_k}$. First we should demonstrate that this transformation results in Dirichlet distributed samples.

Consider the following tranformation $(z_1,\cdots,z_d) \leftarrow (x_1,\cdots,x_d,v)$, where $x_k = \frac{z_k}{\sum_k z_k}$ and $v = {\sum_k z_k}$. We can rewrite this transformation as $(x_1,\cdots,x_d,v)=h(z_1,\cdots,z_d)$, where $x_k = \frac{z_k}{v}$ and $v = {\sum_k z_k}$. Also we can imediatly calculate the inverse transformation $(z_1,\cdots,z_d)=h^{-1}(x_1,\cdots,x_d,v)$, with $z_k=v x_k$. From the transformation definition we know that ${\sum_{k=1}^d x_k=1}$, implying that $x_d = 1-\sum_{k=1}^{d-1} x_k$ and $z_d=v(1-\sum_{k=1}^{d-1}x_k)$.

## Probabilistic models for Recommender systems (part I): Probabilistic Matrix Factorization

In Recommender Systems design we are faced with the following problem: given incomplete information about users preference, content information, user-items rating and  contextual information, learn the user preference and suggest new items for users based on features as:

• previous pattern of items preference of the user;
• preference of users with similar rating pattern;
• contextual information: location, time (of the day, week, month, year), social network.

This is usually formulated as a matrix completion problem using Matrix Factorization techniques to offer a optimal solution. Is this case latent features for users and item are inferred from the observed/rated items for each user, and from this latent features the missing entries are estimated. One major modelling tool for this problem is probabilistic modelling, and there are many proposals in the literature of different Probabillistic Matrix Factorization approaches. We will briefly discuss some of this models, starting with the seminal paper: Probabilistic Matrix Factorization (PMF) – [Salakhutdinov and Mnih, 2008, NIPS].

Continue reading “Probabilistic models for Recommender systems (part I): Probabilistic Matrix Factorization”