Hidden Markov Model Matlab

Unlike other books on the subject, Hidden Markov Models: Theory and Implementation using MATLAB® (CRC Press) is generic and does not focus on a specific theme such as speech processing. It also presents the translation of hidden Markov model concepts from the domain of formal mathematics into computer code using MATLAB®.

Introduction to Hidden Markov Models (HMM)

A hidden Markov model (HMM) is one in which you observe a sequence of emissions, but do not know the sequence of states the model went through to generate the emissions. Analyses of hidden Markov models seek to recover the sequence of states from the observed data.

As an example, consider a Markov model with two states and six possible emissions. The model uses:

A red die, having six sides, labeled 1 through 6.
A green die, having twelve sides, five of which are labeled 2 through 6, while the remaining seven sides are labeled 1.
A weighted red coin, for which the probability of heads is 0.9 and the probability of tails is 0.1.
A weighted green coin, for which the probability of heads is 0.95 and the probability of tails is 0.05.

The model creates a sequence of numbers from the set {1, 2, 3, 4, 5, 6} with the following rules:

Begin by rolling the red die and writing down the number that comes up, which is the emission.
Toss the red coin and do one of the following:
- If the result is heads, roll the red die and write down the result.
- If the result is tails, roll the green die and write down the result.
At each subsequent step, you flip the coin that has the same color as the die you rolled in the previous step. If the coin comes up heads, roll the same die as in the previous step. If the coin comes up tails, switch to the other die.
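The rules above can be played out directly in base MATLAB. The following is a minimal sketch, not a toolbox routine: the variable names (nSteps, pHeads, greenDie) and the sequence length are illustrative assumptions.

```matlab
rng(0);                        % fix the random seed so the run is repeatable
nSteps   = 20;                 % number of emissions to generate (arbitrary)
pHeads   = [0.9 0.95];         % P(heads) for the red coin (1) and the green coin (2)
greenDie = [ones(1,7) 2:6];    % the twelve faces of the green die

states = zeros(1, nSteps);     % 1 = red, 2 = green
seq    = zeros(1, nSteps);     % emitted numbers
state  = 1;                    % the first roll always uses the red die

for t = 1:nSteps
    states(t) = state;
    if state == 1
        seq(t) = randi(6);               % roll the six-sided red die
    else
        seq(t) = greenDie(randi(12));    % roll the twelve-sided green die
    end
    if rand > pHeads(state)              % tails: switch to the other die
        state = 3 - state;
    end
end

disp([states; seq]);           % row 1: die color used, row 2: emission
```

An observer of a hidden Markov model would see only the second row of the displayed matrix.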

The state diagram for this model has two states, red and green, as shown in the following figure.

You determine the emission from a state by rolling the die with the same color as the state. You determine the transition to the next state by flipping the coin with the same color as the state.

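The transition matrix, with rows and columns ordered red, green, is:

$$T = \begin{bmatrix} 0.9 & 0.1 \\ 0.05 & 0.95 \end{bmatrix}$$

The emissions matrix, with rows ordered red, green and columns ordered by the emitted numbers 1 through 6, is:

$$E = \begin{bmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\ 7/12 & 1/12 & 1/12 & 1/12 & 1/12 & 1/12 \end{bmatrix}$$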
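In MATLAB, the two matrices implied by the dice and coins can be entered as shown below. This is a sketch: the names TRANS and EMIS are illustrative, rows are ordered red then green, and the call to hmmgenerate assumes Statistics and Machine Learning Toolbox is installed. hmmgenerate is documented as starting in state 1 before the first emission, so its first step may differ slightly from the hand-run procedure described earlier.

```matlab
% Transition matrix: row i, column j is P(next state = j | current state = i).
TRANS = [0.90 0.10;      % red coin:   heads (stay red), tails (go to green)
         0.05 0.95];     % green coin: tails (go to red), heads (stay green)

% Emission matrix: row i, column k is P(emit k | state = i).
EMIS = [ones(1,6)/6;             % red die: six equally likely faces
        7/12, ones(1,5)/12];     % green die: seven faces show 1, one face each for 2..6

% Each row is a probability distribution, so it should sum to 1.
rowSumsT = sum(TRANS, 2);
rowSumsE = sum(EMIS, 2);

% Draw a random sequence of 1000 emissions and the states behind them.
[seq, states] = hmmgenerate(1000, TRANS, EMIS);
```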

The model is not hidden because you know the sequence of states from the colors of the coins and dice. Suppose, however, that someone else is generating the emissions without showing you the dice or the coins. All you see is the sequence of emissions. If you start seeing more 1s than other numbers, you might suspect that the model is in the green state, but you cannot be sure because you cannot see the color of the die being rolled.

Hidden Markov models raise the following questions:

Given a sequence of emissions, what is the most likely state path?
Given a sequence of emissions, how can you estimate transition and emission probabilities of the model?
What is the forward probability that the model generates a given sequence?
What is the posterior probability that the model is in a particular state at any point in the sequence?
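When the transition and emission matrices are known, the state-path and probability questions can be explored with the toolbox functions hmmviterbi and hmmdecode. The sketch below assumes Statistics and Machine Learning Toolbox is installed and reuses the dice model; the variable names are illustrative.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];              % rows: red, green
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];   % columns: emitted numbers 1..6

% Simulate data so that the true states are available for comparison.
[seq, states] = hmmgenerate(1000, TRANS, EMIS);

% Most likely state path given the emissions (Viterbi algorithm).
likelyStates = hmmviterbi(seq, TRANS, EMIS);
agreement = mean(states == likelyStates);    % fraction of steps recovered correctly

% Posterior state probabilities and the log probability of the sequence.
% PSTATES(i,t) is the probability of being in state i at step t given seq.
[PSTATES, logpseq] = hmmdecode(seq, TRANS, EMIS);
```

The remaining question, estimating the matrices themselves, is what hmmtrain and hmmestimate address.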

Description

[ESTTR,ESTEMIT] = hmmtrain(seq,TRGUESS,EMITGUESS) estimates the transition and emission probabilities for a hidden Markov model using the Baum-Welch algorithm. seq can be a row vector containing a single sequence, a matrix with one row per sequence, or a cell array with each cell containing a sequence. TRGUESS and EMITGUESS are initial estimates of the transition and emission probability matrices. TRGUESS(i,j) is the estimated probability of transition from state i to state j. EMITGUESS(i,k) is the estimated probability that symbol k is emitted from state i.
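A minimal sketch of the basic call, using the dice model to produce training data. The initial guesses below are arbitrary illustrative values that are close to, but not equal to, the true matrices; in practice they would come from prior knowledge of the problem.

```matlab
% True model, used here only to simulate a training sequence.
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];
seq   = hmmgenerate(1000, TRANS, EMIS);

% Initial guesses (illustrative); each row must sum to 1.
TRGUESS   = [0.85 0.15;
             0.10 0.90];
EMITGUESS = [0.17 0.16 0.17 0.16 0.17 0.17;
             0.60 0.08 0.08 0.08 0.08 0.08];

% Estimate the matrices from the emissions alone (Baum-Welch by default).
[ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS);
```

Because the result depends on the starting point, it can be worth rerunning with a few different initial guesses and comparing the estimates.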

hmmtrain(...,'Algorithm',algorithm) specifies the training algorithm. algorithm can be either 'BaumWelch' or 'Viterbi'. The default algorithm is 'BaumWelch'.

hmmtrain(...,'Symbols',SYMBOLS) specifies the symbols that are emitted. SYMBOLS can be a numeric array, a string array, or a cell array of the names of the symbols. The default symbols are integers 1 through N, where N is the number of possible emissions.
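As a sketch of the 'Symbols' option, suppose the six faces were labeled 10, 20, ..., 60 instead of 1 through 6. These labels are hypothetical and chosen only for illustration; the same label vector is passed to hmmgenerate, which accepts the same option, and to hmmtrain so that both agree on the alphabet.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];

symbols = [10 20 30 40 50 60];   % hypothetical labels; column k of EMIS maps to symbols(k)

% Generate a sequence expressed in the custom labels.
seq = hmmgenerate(500, TRANS, EMIS, 'Symbols', symbols);

% Train on that sequence, telling hmmtrain how to interpret the labels.
TRGUESS   = [0.85 0.15; 0.10 0.90];
EMITGUESS = [0.17 0.16 0.17 0.16 0.17 0.17;
             0.60 0.08 0.08 0.08 0.08 0.08];
[ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS, 'Symbols', symbols);
```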

hmmtrain(...,'Tolerance',tol) specifies the tolerance used for testing convergence of the iterative estimation process. The default tolerance is 1e-4.

hmmtrain(...,'Maxiterations',maxiter) specifies the maximum number of iterations for the estimation process. The default maximum is 100.

hmmtrain(...,'Verbose',true) returns the status of the algorithm at each iteration.
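The three options can be combined in one call. The sketch below sets each of them explicitly rather than relying on the defaults; the specific values (1e-3 and 200) are arbitrary choices for illustration.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];
seq   = hmmgenerate(1000, TRANS, EMIS);

TRGUESS   = [0.85 0.15; 0.10 0.90];
EMITGUESS = [0.17 0.16 0.17 0.16 0.17 0.17;
             0.60 0.08 0.08 0.08 0.08 0.08];

% Looser tolerance, a higher iteration cap, and per-iteration status output.
[ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS, ...
    'Tolerance', 1e-3, ...
    'Maxiterations', 200, ...
    'Verbose', true);
```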

hmmtrain(...,'Pseudoemissions',PSEUDOE) specifies pseudocount emission values for the Viterbi training algorithm. Use this argument to avoid zero probability estimates for emissions with very low probability that might not be represented in the sample sequence. PSEUDOE should be a matrix of size m-by-n, where m is the number of states in the hidden Markov model and n is the number of possible emissions. If the i→k emission does not occur in seq, you can set PSEUDOE(i,k) to be a positive number representing an estimate of the expected number of such emissions in the sequence seq.

hmmtrain(...,'Pseudotransitions',PSEUDOTR) specifies pseudocount transition values for the Viterbi training algorithm. Use this argument to avoid zero probability estimates for transitions with very low probability that might not be represented in the sample sequence. PSEUDOTR should be a matrix of size m-by-m, where m is the number of states in the hidden Markov model. If the i→j transition does not occur in states, you can set PSEUDOTR(i,j) to be a positive number representing an estimate of the expected number of such transitions in the sequence states.
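A sketch of Viterbi training with pseudocounts for the two-state, six-emission dice model. Using a pseudocount of 1 in every cell is an arbitrary illustrative choice; any positive values reflecting the expected counts could be used instead.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];
seq   = hmmgenerate(200, TRANS, EMIS);   % a short sequence may miss rare events

TRGUESS   = [0.85 0.15; 0.10 0.90];
EMITGUESS = [0.17 0.16 0.17 0.16 0.17 0.17;
             0.60 0.08 0.08 0.08 0.08 0.08];

PSEUDOE  = ones(2, 6);   % m-by-n: one pseudocount per state/emission pair
PSEUDOTR = ones(2, 2);   % m-by-m: one pseudocount per state/state pair

[ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS, ...
    'Algorithm', 'Viterbi', ...
    'Pseudoemissions', PSEUDOE, ...
    'Pseudotransitions', PSEUDOTR);
```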

If you know the states corresponding to the sequences, use hmmestimate to estimate the model parameters.
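A short sketch of that case: when both the emissions and the states are available (here because they were simulated), hmmestimate needs no initial guesses. The variable names are illustrative.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];

% Simulated data in which the state sequence is known.
[seq, states] = hmmgenerate(1000, TRANS, EMIS);

% Estimate the matrices from the paired emissions and states.
[TRANS_EST, EMIS_EST] = hmmestimate(seq, states);
```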

Tolerance

The input argument 'tolerance' controls how many steps the hmmtrain algorithm executes before the function returns an answer. The algorithm terminates when all of the following three quantities are less than the value that you specify for tolerance:

The change in the log likelihood that the input sequence seq is generated by the currently estimated values of the transition and emission matrices
The change in the norm of the transition matrix, normalized by the size of the matrix
The change in the norm of the emission matrix, normalized by the size of the matrix

The default value of 'tolerance' is 1e-6. Increasing the tolerance decreases the number of steps the hmmtrain algorithm executes before it terminates.

Maxiterations

The maximum number of iterations, 'maxiterations', controls the maximum number of steps the algorithm executes before it terminates. If the algorithm executes maxiter iterations before reaching the specified tolerance, the algorithm terminates and the function returns a warning. If this occurs, you can increase the value of 'maxiterations' to make the algorithm reach the desired tolerance before terminating.
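One way to react to that warning in a script is sketched below. It uses the general-purpose lastwarn function, so it detects any warning issued during the call, not specifically the iteration-limit warning; the iteration counts 5 and 500 are arbitrary illustrative values.

```matlab
TRANS = [0.90 0.10; 0.05 0.95];
EMIS  = [ones(1,6)/6; 7/12, ones(1,5)/12];
seq   = hmmgenerate(1000, TRANS, EMIS);

TRGUESS   = [0.85 0.15; 0.10 0.90];
EMITGUESS = [0.17 0.16 0.17 0.16 0.17 0.17;
             0.60 0.08 0.08 0.08 0.08 0.08];

lastwarn('');    % clear any earlier warning message

% Deliberately small iteration cap, so the limit is likely to be reached.
[ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS, 'Maxiterations', 5);

warnMsg = lastwarn;              % empty if no warning was issued
if ~isempty(warnMsg)
    % Retry with a larger cap so the algorithm has room to converge.
    [ESTTR, ESTEMIT] = hmmtrain(seq, TRGUESS, EMITGUESS, 'Maxiterations', 500);
end
```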