Hidden Markov Models (part II): forward-backward algorithm for marginal conditional probability of the states

(In the same series: HMM (part I): recurrence equations for filtering and prediction.)

Consider a Hidden Markov Model (HMM) with hidden states $x_t$ (for $t \in \{1, 2, \cdots, T\}$), initial probability $p(x_1)$, observed states $y_t$, transition probability $p(x_t|x_{t-1})$ and observation model $p(y_t|x_t)$. The joint distribution of this model factorizes as
$p(x_{1:T},y_{1:T}) = p(y_1|x_1)p(x_1)\prod_{t=2}^{T}p(y_t|x_t)p(x_t|x_{t-1})$. We will use the notation $X=x_{1:T}$ to represent the set $X=\{x_1,x_2,\cdots,x_T\}$.
In this post we present the details of the method for finding the smoothing distribution $p(x_t|y_{1:T})$ of an HMM, that is, the marginal probability of the hidden state $x_t$ given all the observations $y_{1:T}$.
Our starting point is the following decomposition of this distribution:

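\begin{aligned} p(x_t|y_{1:T}) &= \frac{p(x_t,y_{1:T})}{p(y_{1:T})} \\ &= \frac{p(x_t,y_{1:t},y_{(t+1):T})}{p(y_{1:T})}\\ &= \frac{p(y_{(t+1):T}|x_t,y_{1:t})\,p(x_t,y_{1:t})}{p(y_{1:T})} \\ &= \underbrace{p(y_{(t+1):T}|x_t)}_{\beta_t(x_t)}\underbrace{p(x_t,y_{1:t})}_{\alpha_t(x_t)}\frac{1}{p(y_{1:T})} \\ &= \frac{\alpha_t(x_t) \beta_t(x_t)}{p(y_{1:T})} \end{aligned}

The third line is the chain rule, and the fourth uses the conditional independence built into the HMM: given $x_t$, the future observations $y_{(t+1):T}$ do not depend on the past observations $y_{1:t}$, so $p(y_{(t+1):T}|x_t,y_{1:t}) = p(y_{(t+1):T}|x_t)$.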
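
The decomposition above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the toy parameters `pi`, `A`, `B`, the observation sequence `y`, and all function names are hypothetical and not part of the original derivation. It computes $\alpha_t$ and $\beta_t$ directly from their definitions by brute-force summation over hidden paths (exponential in $T$, so suitable only for tiny examples), and compares $\alpha_t(x_t)\beta_t(x_t)/p(y_{1:T})$ with the smoothing distribution obtained by marginalizing the full joint.

```python
import itertools
import numpy as np

# Hypothetical toy HMM (illustrative values): 2 hidden states, 2 symbols, T = 4.
pi = np.array([0.6, 0.4])                 # pi[i]   = p(x_1 = i)
A = np.array([[0.7, 0.3], [0.2, 0.8]])    # A[i, j] = p(x_t = j | x_{t-1} = i)
B = np.array([[0.9, 0.1], [0.3, 0.7]])    # B[i, k] = p(y_t = k | x_t = i)
y = [0, 1, 1, 0]                          # observed sequence y_{1:T}
T, K = len(y), len(pi)


def joint(x, obs):
    """p(x_{1:n}, y_{1:n}) from the HMM factorization (x and obs have equal length)."""
    p = pi[x[0]] * B[x[0], obs[0]]
    for s in range(1, len(x)):
        p *= A[x[s - 1], x[s]] * B[x[s], obs[s]]
    return p


def alpha(t, k):
    """alpha_t(k) = p(x_t = k, y_{1:t}); t is a 0-based index. Sums over x_{1:t-1}."""
    return sum(joint(prefix + (k,), y[: t + 1])
               for prefix in itertools.product(range(K), repeat=t))


def beta(t, k):
    """beta_t(k) = p(y_{t+1:T} | x_t = k); t is a 0-based index. Sums over x_{t+1:T}."""
    total = 0.0
    for suffix in itertools.product(range(K), repeat=T - 1 - t):
        p, prev = 1.0, k
        for s, xs in enumerate(suffix, start=t + 1):
            p *= A[prev, xs] * B[xs, y[s]]
            prev = xs
        total += p
    return total


# All K**T hidden paths, and the evidence p(y_{1:T}) as the sum of the joint over them.
paths = list(itertools.product(range(K), repeat=T))
evidence = sum(joint(x, y) for x in paths)


def smoothing_brute_force(t):
    """p(x_t | y_{1:T}) by marginalizing the full joint over every other hidden state."""
    post = np.zeros(K)
    for x in paths:
        post[x[t]] += joint(x, y)
    return post / evidence


def smoothing_alpha_beta(t):
    """p(x_t | y_{1:T}) as alpha_t(x_t) * beta_t(x_t) / p(y_{1:T})."""
    return np.array([alpha(t, k) * beta(t, k) for k in range(K)]) / evidence


for t in range(T):
    print(t + 1, smoothing_brute_force(t), smoothing_alpha_beta(t))
```

For every $t$ the two printed vectors should agree up to floating-point error. The brute-force sums are used here only to make the definitions of $\alpha_t$ and $\beta_t$ explicit; they are not an efficient way to compute them.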