Genome annotation can be divided into three levels: nucleotide, protein and process. Gene finding is a central part of nucleotide-level annotation. For complex genomes, the most successful methods combine ab initio gene prediction with sequence comparison against expressed-sequence databases and the genomes of other organisms.
Nucleotide-level annotation also allows the integration of genome sequence with other genetic and physical maps of the genome.
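As a rough illustration of combining ab initio prediction with sequence comparison, the sketch below flags each predicted exon as supported when enough of it overlaps an aligned expressed sequence. The coordinates, the `min_fraction` threshold and the function names are assumptions made for this example and do not come from any particular gene finder.

```python
# Minimal sketch: check ab initio exon predictions against alignment evidence.
# Intervals are half-open (start, end) positions on one strand; all values
# here are made up for illustration.

def overlap(a, b):
    """Return the number of bases shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def flag_supported(predictions, evidence, min_fraction=0.5):
    """Pair each predicted exon with True if a single evidence interval
    covers at least min_fraction of its length, else False."""
    flagged = []
    for exon in predictions:
        length = exon[1] - exon[0]
        best = max((overlap(exon, hit) for hit in evidence), default=0)
        flagged.append((exon, length > 0 and best / length >= min_fraction))
    return flagged

# Hypothetical predicted exons and expressed-sequence alignments.
predicted_exons = [(100, 200), (500, 600), (900, 950)]
est_alignments = [(120, 210), (590, 700)]

flagged = flag_supported(predicted_exons, est_alignments)
for exon, ok in flagged:
    print(exon, "supported" if ok else "unsupported")
```

Real pipelines weigh several evidence types and splice-site consistency; a single overlap threshold is only the simplest possible stand-in.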
The principal aim of protein-level annotation is to assign function to the products of the genome. Databases of protein sequences, functional domains and motifs are powerful resources for this type of annotation. Nevertheless, about half of the predicted proteins in a newly sequenced genome tend to have no obvious function. The goal of process-level annotation is to understand the function of genes and their products in the context of cellular and organismal physiology.
One of the obstacles to process-level annotation has been the inconsistent terminology used across different model systems. The Gene Ontology Consortium is helping to solve this problem by developing a shared, controlled vocabulary.
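To make the terminology problem concrete, the sketch below maps free-text labels from different databases onto shared term identifiers so that equivalent annotations can be grouped. The labels, gene names and `EX:` identifiers are hypothetical placeholders, not real Gene Ontology terms, and the lookup table stands in for a curated vocabulary.

```python
# Minimal sketch: normalize inconsistent annotation labels to shared term IDs.
# The synonym table and the "EX:" identifiers are invented for illustration.

SYNONYM_TO_TERM = {
    "dna replication": "EX:0001",
    "dna synthesis": "EX:0001",
    "translation": "EX:0002",
    "protein biosynthesis": "EX:0002",
}

def normalize(label):
    """Return the shared term ID for a free-text label, or None if unknown."""
    return SYNONYM_TO_TERM.get(label.strip().lower())

def group_by_term(annotations):
    """Group gene names by the shared term their label maps to."""
    grouped = {}
    for gene, label in annotations:
        grouped.setdefault(normalize(label), []).append(gene)
    return grouped

# Hypothetical annotations drawn from three different model-organism databases.
annotations = [
    ("yeast_geneA", "DNA replication"),
    ("fly_geneB", "DNA synthesis"),
    ("worm_geneC", "Translation"),
]

grouped = group_by_term(annotations)
print(grouped)
```

Once labels resolve to the same identifier, genes from different organisms can be compared directly, which is the practical benefit of a common vocabulary.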