Gene Prediction Module Guide
In this module, you will be exposed to the basic ideas of de novo gene prediction. Specifically, you should be able to:
· Create a simple gene predictor using an ordinary HMM
· Explain the limitations on gene prediction imposed by the ordinary HMM assumptions
· Explain how semi-Markov models alleviate some of those limitations
· Explain how tree HMMs allow information from evolutionary changes to be used for improving gene prediction
· Explain how Conditional Random Fields allow a broader range of sequence features to be used in gene prediction
· Name some of the most important gene prediction programs and the probabilistic formalisms they are based on
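As a starting point for the first objective above, here is a minimal sketch of what a gene predictor built on an ordinary HMM might look like. It is not taken from the course notes. It assumes a hypothetical two-state model (N for non-coding, C for coding) with made-up transition and emission probabilities, and it uses the standard Viterbi algorithm to label each base with its most probable state. A realistic predictor would need more states (start and stop codons, codon positions, introns) and parameters estimated from annotated sequence.

```python
import math

# Hypothetical two-state model: "N" = non-coding, "C" = coding.
# Every probability below is an illustrative assumption, not a trained value.
STATES = ("N", "C")
START = {"N": 0.9, "C": 0.1}
TRANS = {
    "N": {"N": 0.9, "C": 0.1},
    "C": {"N": 0.1, "C": 0.9},
}
EMIT = {
    "N": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},  # AT-rich background
    "C": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},  # GC-rich coding region
}


def viterbi(seq):
    """Return the most probable state path for seq, one state label per base."""
    if not seq:
        return ""
    # score[s] is the log-probability of the best path that ends in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor state that maximizes the path score into s.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][base])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))
```

Calling viterbi on a DNA string returns a string of the same length made up of N and C labels. Because each state has a fixed self-transition probability, the length of a predicted coding region is implicitly geometrically distributed, which is one of the ordinary HMM limitations that semi-Markov models are meant to relax.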
Please read Sections 1 & 2 of the Gene Prediction Notes as well as this brief tutorial in advance of the Gene Prediction Day 1 meeting.
Day 1: Monday, October 5
In class
- Discussion: Genome annotation project, ordinary HMMs and their limitations, semi-Markov models
Before the next class (this is useful for your end-of-semester project)
Day 2: We are not doing Day 2 in 2009
In class
A hot topic that we didn’t have time for is Conditional Random Fields (CRFs). Here is some information on them:
CRF, CRF-tutorial, CRF-gene-finder
For greater depth on the state of the art in practical genome annotation,
see my recent review article.