Hidden Markov Models (part II): forward-backward algorithm for marginal conditional probability of the states

(in the same series HMM (part I): recurrence equations for filtering and prediction)

Consider a Hidden Markov Model (HMM) with hidden states x_t (for t \in \{1, 2, \cdots, T\}), initial probability p(x_1), observed states y_t, transition probability p(x_t|x_{t-1}) and observation model p(y_t|x_t). This model can be factorized as
p(x_{1:T},y_{1:T}) = p(y_1|x_1)p(x_1)\prod_{t=2}^{t=T}p(y_t|x_t)p(x_t|x_{t-1}) . We will use the notation X=x_{1:T} to represent the set X=\{x_1,x_2,\cdots,x_T\}.
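To make the factorization concrete, here is a minimal numerical sketch for a discrete HMM. The transition matrix A, emission matrix B and initial vector pi0 are hypothetical toy parameters chosen only for illustration; they do not come from the post.

```python
import numpy as np

# Hypothetical toy HMM with K discrete hidden states and M observation symbols.
# A[i, j] = p(x_t = j | x_{t-1} = i)   (transition probability)
# B[i, k] = p(y_t = k | x_t = i)       (observation model)
# pi0[i]  = p(x_1 = i)                 (initial distribution)
K, M = 2, 3
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
pi0 = np.array([0.6, 0.4])

def joint_prob(x, y):
    """p(x_{1:T}, y_{1:T}) via the factorization above (x, y are 0-indexed)."""
    p = pi0[x[0]] * B[x[0], y[0]]
    for t in range(1, len(x)):
        p *= A[x[t - 1], x[t]] * B[x[t], y[t]]
    return p

print(joint_prob([0, 1, 1], [2, 0, 1]))
```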
In this post we present the details of the forward-backward algorithm, which computes the smoothing distribution p(x_t|y_{1:T}) of an HMM given a set of observations y_{1:T}. Our starting point is the marginal probability p(x_t|y_{1:T}) of x_t given all the observations y_{1:T}.

\begin{aligned} p(x_t|y_{1:T}) &= \frac{p(x_t,y_{1:T})}{p(y_{1:T})} \\ &= \frac{p(x_t,y_{1:t},y_{(t+1):T})}{p(y_{1:T})}\\ &= \underbrace{p(y_{(t+1):T}|x_t)}_{\beta_t(x_t)}\underbrace{p(x_t,y_{1:t})}_{\alpha_t(x_t)}\frac{1}{p(y_{1:T})} \\ &= \frac{\alpha_t(x_t) \beta_t(x_t)}{p(y_{1:T})} \end{aligned}

The third equality holds because, conditional on x_t, the future observations y_{(t+1):T} are independent of the past observations y_{1:t} (the chain is separated at x_t), so p(y_{(t+1):T}|x_t,y_{1:t})=p(y_{(t+1):T}|x_t).
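This decomposition is exactly what the forward-backward algorithm computes: a forward pass for \alpha_t and a backward pass for \beta_t. Below is a minimal sketch for the discrete case, using the same hypothetical A, B and pi0 conventions as in the sketch above; a production implementation would rescale \alpha and \beta at each step to avoid numerical underflow.

```python
import numpy as np

def forward_backward(y, A, B, pi0):
    """Smoothing marginals p(x_t | y_{1:T}) for a discrete HMM.

    alpha[t, i] = p(x_t = i, y_{1:t})     (forward variable)
    beta[t, i]  = p(y_{t+1:T} | x_t = i)  (backward variable)
    """
    T, K = len(y), len(pi0)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))

    # Forward pass: alpha_1(x_1) = p(y_1|x_1) p(x_1), then
    # alpha_t(x_t) = p(y_t|x_t) sum_{x_{t-1}} p(x_t|x_{t-1}) alpha_{t-1}(x_{t-1}).
    alpha[0] = pi0 * B[:, y[0]]
    for t in range(1, T):
        alpha[t] = B[:, y[t]] * (alpha[t - 1] @ A)

    # Backward pass: beta_T(x_T) = 1, then
    # beta_t(x_t) = sum_{x_{t+1}} p(x_{t+1}|x_t) p(y_{t+1}|x_{t+1}) beta_{t+1}(x_{t+1}).
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, y[t + 1]] * beta[t + 1])

    evidence = alpha[-1].sum()        # p(y_{1:T})
    gamma = alpha * beta / evidence   # p(x_t | y_{1:T}); each row sums to 1
    return gamma, evidence
```

Calling forward_backward([2, 0, 1], A, B, pi0) with the toy parameters above returns the T×K matrix of smoothing marginals together with the marginal likelihood p(y_{1:T}).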


Hidden Markov Models (part I): recurrence equations for filtering and prediction

This semester I will be attending the doctoral course MA8702 – Advanced Modern Statistical Methods with the excellent Prof. Håvard Rue. It is a course about statistical models defined over sparse structures (chains and graphs). We will start with Hidden Markov Chains and then move on to Gaussian Markov Random Fields, Latent Gaussian Models and approximate inference with the Integrated Nested Laplace Approximation (INLA). All these models are relevant to my research goal of developing sound latent models for recommender systems, and I am really happy to be taking this course with such a great teacher and researcher. I will try to cover some of the material of the course, starting from what we saw in the first lecture: exact recurrences for Hidden Markov Chains and dynamic programming. In other words, general equations for prediction, filtering, smoothing, sampling, mode and marginal likelihood computation for state-space models with latent variables. We will start by introducing the general model and specifying how to obtain the prediction and filtering equations.

[Figure: graphical model of the hidden Markov chain, with hidden states x_1, …, x_T and observations y_1, …, y_T]

  • Markovian property: \pi(x_1,x_2,\cdots,x_T)=\pi(x_{1:T})=\prod_{t=1}^{t=T}\pi(x_t|x_{t-1}) , with \pi(x_1|x_0)=\pi(x_1)
  • y_t are observed and x_t are latent; the observation model \pi(y_t|x_t) is assumed known.
  • If we know x_{t-1}, then no other earlier variable adds any information to the conditional distribution of x_t (the Markov property); a filtering sketch based on these assumptions follows the list.
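The exact recurrences are derived in the rest of the post; as a preview, here is a minimal sketch of the alternating prediction/filtering updates for the discrete case, using the same hypothetical transition matrix A, emission matrix B and initial vector pi0 conventions as in the part II sketches above.

```python
import numpy as np

def filter_predict(y, A, B, pi0):
    """One-step prediction and filtering for a discrete state-space model.

    pred[t, i] = p(x_t = i | y_{1:t-1})   (prediction)
    filt[t, i] = p(x_t = i | y_{1:t})     (filtering)
    """
    T, K = len(y), len(pi0)
    pred = np.zeros((T, K))
    filt = np.zeros((T, K))
    pred[0] = pi0  # before seeing any data, p(x_1 | y_{1:0}) = p(x_1)
    for t in range(T):
        if t > 0:
            # Prediction step:
            # p(x_t|y_{1:t-1}) = sum_{x_{t-1}} p(x_t|x_{t-1}) p(x_{t-1}|y_{1:t-1})
            pred[t] = filt[t - 1] @ A
        # Filtering step (Bayes update):
        # p(x_t|y_{1:t}) ∝ p(y_t|x_t) p(x_t|y_{1:t-1})
        unnorm = B[:, y[t]] * pred[t]
        filt[t] = unnorm / unnorm.sum()
    return pred, filt
```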
