Workshop on automatic gene annotation

Join us for a workshop titled “Introduction to automatic gene annotation”. This workshop, running 1-2 November at CSHL, is aimed at developers. Two Ensembl developers will present sessions on how to create your own core database, including loading a genome assembly into a database and running simple analyses using the Ensembl genebuild pipeline. The meeting will follow the same format as the 2007-2010 automatic gene annotation workshops.

Participants will be expected to have programming experience and a background in object-oriented programming. Good familiarity with Perl, a Unix/Linux environment, and MySQL is essential to follow the workshop and the programming examples. Knowledge of the Ensembl core API is also essential. We will be working from a virtual machine, and participants are expected to bring their own laptops (preferably Mac) to work from – more details will be provided on registration.

Topics to be presented:

  • Introduction to the GeneBuild pipeline, including data input types, generating protein-coding transcript models, and adding UTR to these models
  • An introduction to assembly structure (toplevel, contigs, scaffolds, chromosomes)
  • Overview of the different Ensembl APIs
  • Obtaining the Ensembl API (cvs checkout)
  • Core database schema
  • Tracking jobs in the pipeline
  • Runnable and RunnableDB modules

Practical sessions:

  • Creating a genebuild database
  • Loading an assembly into the database
  • Running algorithms first on the command line and then using the pipeline
  • Understanding how the pipeline code interacts with the algorithms and the database
  • Understanding the pipeline’s job tracking system
  • Visualisation of results with Apollo

Would you like to join us? Please contact Bert (bert@ebi.ac.uk) for more details or to register.