“Parsimonious Markov models and applications to biological sequence analysis”.
Markov chains are widely used in biological sequence analysis, although their practical use raises statistical issues regarding the choice of memory length. While a longer memory makes it possible to capture more information from the sequence, this benefit can be outweighed by the resulting loss in estimation quality. Adaptive solutions, namely variable length Markov chains, were proposed in the early 1980s and further developed in the fields of text compression and statistical modelling of discrete-valued sequences. This thesis proposes a generalization of this approach, which leads to the introduction of parsimonious Markov models. Besides the definition of this class of models, a Bayesian model selection algorithm and the associated convergence theorem are presented.
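To make the memory length trade-off concrete, here is a minimal Python sketch. It assumes a plain fixed-order Markov chain on the four-letter DNA alphabet and is not the parsimonious model of the thesis; the function names `free_parameters` and `fit_markov_chain` are hypothetical and chosen only for illustration. An order-k chain on an alphabet of size m has (m - 1) * m^k free transition probabilities, so the number of quantities to estimate grows exponentially with the memory length while the amount of data stays fixed.

```python
from collections import Counter, defaultdict


def free_parameters(order, alphabet_size=4):
    """Free transition probabilities of a fixed-order Markov chain.

    Each of the alphabet_size**order contexts carries a distribution over
    alphabet_size symbols, that is alphabet_size - 1 free values.
    """
    return (alphabet_size - 1) * alphabet_size ** order


def fit_markov_chain(sequence, order):
    """Maximum-likelihood transition estimates obtained from context counts."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = sequence[i - order:i]      # the `order` preceding symbols
        counts[context][sequence[i]] += 1    # symbol observed after them
    return {
        context: {symbol: n / sum(nexts.values()) for symbol, n in nexts.items()}
        for context, nexts in counts.items()
    }


if __name__ == "__main__":
    # Parameter count for memory lengths 0 to 5 on the DNA alphabet.
    for k in range(6):
        print(k, free_parameters(k))
```

With a sequence of a few thousand bases, an order-5 chain already has more than three thousand free parameters, so many contexts are observed only a handful of times and their estimated probabilities become unreliable. Variable length and parsimonious models aim to reduce this cost by sharing parameters between contexts.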
The viva will take place on Monday, December 15th at 16:00, in the 10th-floor conference room of the “Evry2” tower (location to be confirmed by the University).