Fellows Research Meetings

Paul Malcolm
Cyber & Crypto-Mathematics Research,
Defence Science and Technology Group

Modelling and estimation with higher-order Markov chains observed through arbitrary disturbance models

The most basic models representing an indirectly observed stochastic process is a Markov chain in discrete time of known order. This class of stochastic processes offers us a very simple and accessible finite-state model with a temporal dependency. In this work we propose to extend modelling and estimation techniques with latent Markov chains in two separate, but complimentary ways. Firstly, we develop a new representation for higher order Markov chains which is immediately amenable to the writing down of recursive estimators, such as filters smoothers and detectors in higher-order scenarios. Our representation of higher-order chains is firstly based upon taking the state-space of a Markov chain as a canonical basis of indicator-function-valued random variables.

This state-space is extended to higher-order models through the use of a Kronecker product.
Our representation naturally leads to recursive dynamics additively decomposed into a "weighted immediate past" and a Martingale increment term. State estimation with higher order Markov models has application in computational linguistics, especially Parts of Speech Tagger's (POS) and in Genomics.
As a product of this work, we develop a convenient representation for the first-order transition matrix corresponding to a higher order chain of any number of states N and any order M >=2. This representation proves useful for implementation and analysis of this class or Markov Chain.

Our second extension to estimation with Hidden Markov models enables relaxing the usual and perhaps ubiquitous assumption of simple Gaussian additive noise in measurement models. We assume an arbitrary non-Gaussian noise model may be represented by a Gaussian mixtures model and in so doing
appeal to an L1 approximation theorem for approximating a known uni-variate non-Gaussian density by a Gaussian mixture. A proof of the L1 approximation is sketched. Finally, making use of the approximation result just described, we develop state estimation schemes for non Gaussian noise models.
We do so via change of probability measure techniques exploiting the fact that a convex combination of Radon Nikodym derivatives is again a Radon Nikodym derivative.

In any approximation used in modelling, one must always consider the question "how good is my approximation?". For example, only recently have confidence intervals been determined for Leo Breiman's Random Forests. Correspondingly, one must ask here, how good is a Gaussian mixture approximation to a given probability density? Can such an accuracy be scored quantitatively? If so, might such a score extend naturally to higher dimensional Gaussian mixtures?

These extending points of analysis will be briefly discussed.

Remark: This work is joint work with Professor John van der Hoek at the University of South Australia.

Last reviewed: 10 January, 2017