Coordinate Systems for Pangenome Graphs based on the Level Function and Minimum Path Covers

Thomas Büchler, Caroline Räther, Pascal Weber, Enno Ohlebusch

Publications: Contribution to book, Contribution to proceedings, Peer Reviewed

Abstract

The Computational Pan-Genomics Consortium (Consortium, 2016) described the role of coordinate systems in genomics as follows: “A pan-genome defines the space in which (pan-)genomic analyses take place. It should provide a ‘coordinate system’ to unambiguously identify genetic loci and (potentially nested) genetic variants.” The most natural representations of pangenomes are graphs. The Computational Pan-Genomics Consortium identified desirable properties of the linear reference genome model that graphical frameworks should attempt to preserve: spatiality, monotonicity, and readability. In this paper, we introduce a coordinate system for DAGs that has these properties. It is based on the level function and a minimum path cover of the graph. Moreover, we describe a new method for finding a minimum path cover in a DAG, which works very well in practice.
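
The two ingredients named in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the level of a node is taken to be the length of the longest path reaching it from any source, and the path cover is the vertex-disjoint variant, computed with the classical reduction to bipartite matching (augmenting paths). This is not the new method the abstract announces. The function names `levels` and `min_disjoint_path_cover` are hypothetical.

```python
from collections import deque


def levels(n, edges):
    """Assumed level function: sources get 0, any other node gets
    1 + the maximum level among its predecessors (longest-path layering)."""
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    level = [0] * n
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:  # Kahn-style topological traversal
        u = queue.popleft()
        for v in succ[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return level


def min_disjoint_path_cover(n, edges):
    """Minimum vertex-disjoint path cover of a DAG via bipartite matching.
    Each matched edge u -> v links u to its successor on a path, so the
    number of paths equals n minus the size of a maximum matching."""
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    match_to = [-1] * n  # match_to[v] = u: edge u -> v is on a path
    next_of = [-1] * n   # next_of[u] = v: successor of u on its path

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_to[v] == -1 or try_augment(match_to[v], seen):
                match_to[v] = u
                next_of[u] = v
                return True
        return False

    for u in range(n):
        try_augment(u, set())

    paths = []
    for start in range(n):
        if match_to[start] == -1:  # no predecessor on a path: path start
            path = [start]
            while next_of[path[-1]] != -1:
                path.append(next_of[path[-1]])
            paths.append(path)
    return paths


# A small "bubble": node 0 branches to 1 and 2, which rejoin at 3.
bubble = [(0, 1), (0, 2), (1, 3), (2, 3)]
node_levels = levels(4, bubble)
cover = min_disjoint_path_cover(4, bubble)
```

On this bubble, the two branch nodes share a level, and no single path can visit both, so any cover needs at least two paths. Pangenome applications often allow cover paths to share nodes; under that definition the same matching idea would be applied to the transitive closure of the graph instead.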
Original language: English
Title of host publication: Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS
Editors: R Lorenz, A Fred, H Gamboa
Pages: 21-29
Number of pages: 9
Volume: 3
DOIs
Publication status: Published - 11 Feb 2021
Externally published: Yes

Austrian Fields of Science 2012

  • 102033 Data mining

Keywords

  • Pangenome
  • Coordinate System
  • Directed Acyclic Graph
  • Level Function
  • Minimum Path Cover
