
Computational Molecular Biology, aka Algorithms for Computational Biology


Gene Prediction Module Guide

This module introduces the basic ideas of de novo gene prediction. By the end of it, you should be able to:

·         Create a simple gene predictor using an ordinary HMM

·         Explain the limitations on gene prediction imposed by the ordinary HMM assumptions

·         Explain how semi-Markov models alleviate some of those limitations

·         Explain how tree HMMs allow information from evolutionary changes to be used for improving gene prediction

·         Explain how Conditional Random Fields allow a broader range of sequence features to be used in gene prediction

·         Name some of the most important gene prediction programs and the probabilistic formalisms they are based on
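
As a concrete starting point for the first objective, here is a minimal sketch of an ordinary HMM gene predictor decoded with the Viterbi algorithm. It is not taken from the course materials. The two states ("noncoding" and "coding"), the probability tables START, TRANS and EMIT, and the function name viterbi are all hypothetical and chosen only for illustration; a real predictor would have many more states (codon positions, splice sites, and so on) and parameters trained on annotated sequence.

```python
import math

STATES = ("noncoding", "coding")

# Hypothetical parameters for illustration only; not trained on real data.
START = {"noncoding": 0.9, "coding": 0.1}
TRANS = {
    "noncoding": {"noncoding": 0.95, "coding": 0.05},
    "coding": {"noncoding": 0.05, "coding": 0.95},
}
# Assumption: the coding state is GC-rich, the noncoding state AT-rich.
EMIT = {
    "noncoding": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "coding": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
}


def viterbi(seq):
    """Return the most probable state path for seq, computed in log space."""
    if not seq:
        return []

    # Initialization: best log-probability of starting in each state.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []

    # Recursion: extend the best path into each state, one base at a time.
    for base in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointer[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][base])
            )
        backpointers.append(pointer)
        score = new_score

    # Traceback: start from the best final state and follow the pointers.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    example = "ATATATATGCGCGCGCGCGCGCATATATAT"
    for base, state in zip(example, viterbi(example)):
        print(base, state)
```

Note that each state here has a single self-transition probability, so the length of a run in that state is geometrically distributed. That is one of the ordinary HMM assumptions that semi-Markov models are meant to relax.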

Please read Sections 1 & 2 of the Gene Prediction Notes as well as this brief tutorial in advance of the Gene Prediction Day 1 meeting.

Day 1: Monday, October 5

In class

Before the next class (this is useful for your end-of-semester project)

·         Read: Chapter 1 and Sections 2.1-2.6 and 3.1-3.5 of Chris Burge’s doctoral dissertation.

Day 2: We are not doing Day 2 in 2009

In class


A hot topic that we didn’t have time for is Conditional Random Fields (CRFs). Here is some info on them.
CRF, CRF-tutorial, CRF-gene-finder

 

For greater depth on the state of the art in practical genome annotation, see my recent review article.