Forward algorithm
<templatestyles src="Module:Hatnote/styles.css"></templatestyles>
The forward algorithm, in the context of a hidden Markov model, is used to calculate a 'belief state': the probability of a state at a certain time, given the history of evidence. The process is also known as filtering. The forward algorithm is closely related to, but distinct from, the Viterbi algorithm.
For an HMM such as this one:
this probability is written as . Here is the hidden state which is abbreviated as and are the observations to . A belief state can be calculated at each time step, but doing this does not, in a strict sense, produce the most likely state sequence, but rather the most likely state at each time step, given the previous history.
Algorithm
The goal of the forward algorithm is to compute the joint probability , where for notational convenience we have abbreviated as and as . Computing directly would require marginalizing over all possible state sequences , the number of which grows exponentially with . Instead, the forward algorithm takes advantage of the conditional independence rules of the hidden Markov model (HMM) to perform the calculation recursively.
To demonstrate the recursion, let
-
- .
Using the chain rule to expand , we can then write
-
- .
Because is conditionally independent of everything but , and is conditionally independent of everything but , this simplifies to
-
- .
Thus, since and are given by the model's emission distributions and transition probabilities, one can quickly calculate from and avoid incurring exponential computation time.
The forward algorithm is easily modified to account for observations from variants of the hidden Markov model as well, such as the Markov jump linear system.
Smoothing
In order to take into account future history (i.e., if one wanted to improve the estimate for past times), you can run the backward algorithm, which complements the forward algorithm. This is called smoothing.[why?] The forward/backward algorithm computes for . So the full forward/backward algorithm takes into account all evidence.
Decoding
In order to achieve the most likely sequence, the Viterbi algorithm is required. It computes the most likely state sequence given the history of observations, that is, the state sequence that maximizes .
See also
Further reading
- Russell and Norvig's Artificial Intelligence, a Modern Approach, starting on page 541 of the 2003 edition, provides a succinct exposition of this and related topics
<templatestyles src="Asbox/styles.css"></templatestyles>