Eukaryotic Genome Sequence Assembly and Annotation

The Eukaryotic Genome Sequence Assembly and Annotation learning path is a structured training resource designed to support researchers in developing practical skills for assembling and annotating eukaryotic genomes. It provides a comprehensive learning pathway across three key domains: FAIR essentials, genome assembly, and genome annotation. Together, these domains introduce core concepts and methodologies, including sequencing technologies, assembly strategies, quality assessment, structural and functional annotation, and best practices for data management. The learning path also emphasises reproducibility, the application of community standards, and responsible data handling.

The development of the Eukaryotic Genome Sequence Assembly and Annotation learning path was a collaborative effort within the ELIXIR Biodiversity Community. It was created as part of an ELIXIR Implementation Study and refined through contributions from experts across the network, ensuring that the content reflects current practices and community needs.

ELIXIR Learning Paths

Eukaryotic Genome Sequence Assembly and Annotation is an ELIXIR Learning Path. A distinctive feature of ELIXIR learning paths is that they are composed of training materials developed by multiple institutions, showcasing resources produced by diverse groups across Europe. Each topic in this learning path brings together materials from different providers, so some overlap between resources is to be expected.

Note: If you have training materials you would like to see included in this learning path, please contact the authors of this learning path.

Licence: Creative Commons Attribution 4.0 International

Keywords: Data management, Sequence assembly, Genomics, FAIR

Authors: Valeria Di Cola, Anne-Françoise Adam-Blondon

Contributors: Robert Waterhouse, Anthony Bretaudeau, Valeria Di Cola, Anne-Françoise Adam-Blondon, Patricia Palagi

Scientific topics: Genomics, Data management, Biodiversity, Sequence assembly, Sequence analysis, FAIR data

Status: Active

Target audience: Bioinformatician, Biologist, Data Scientist, Genomics Researcher

Learning objectives:

At the end of the learning path, researchers or data managers specialized in biodiversity genomics will be able to:
- retrieve metadata for the biological materials of interest
- retrieve unique identifiers for their samples from BioSamples
- retrieve metadata on omics experiments
- publish metadata on omics experiments in the recommended databases
- use tools to version their metadata curation before publication
- submit raw data and annotated genomes to ENA, and samples to BioSamples, with the correct metadata
- analyze genome data in a reproducible way and manage the data in a FAIR way

Learners should also be able to assemble a genome themselves, starting from raw data, and to perform the functional annotation of the resulting assembly.

1. Data Management: Basics of the genome sequencing (Beginner, 12 materials)
3. Eukaryotic Genome Assembly (Advanced, 23 materials)
4. Eukaryotic Genome Annotation (Advanced, 11 materials)
