Abstract
The Computational Pan-Genomics Consortium (2016) described the role of coordinate systems in genomics as follows: “A pan-genome defines the space in which (pan-)genomic analyses take place. It should provide a ‘coordinate system’ to unambiguously identify genetic loci and (potentially nested) genetic variants.” The most natural representations of pangenomes are graphs. The Consortium also identified desirable properties of the linear reference genome model that graph-based frameworks should attempt to preserve: spatiality, monotonicity, and readability. In this paper, we introduce a coordinate system for directed acyclic graphs (DAGs) that has these three properties. It is based on the level function and a minimum path cover of the graph. Moreover, we describe a new method for finding a minimum path cover in a DAG, which works very well in practice.
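The two ingredients named in the abstract can be illustrated with a small sketch. It rests on assumptions that the abstract does not confirm: the level of a vertex is taken to be the length of a longest path from any source to it, the path cover is the vertex-disjoint variant computed with the textbook reduction to bipartite matching (not the new method the paper describes), and the coordinate of a vertex is taken to be the pair (path index, level). The function names `level_function`, `min_path_cover`, and `coordinates` are hypothetical.

```python
from collections import deque


def level_function(n, edges):
    """Return level[v] = length of a longest path from a source to v.

    Assumption: this is one common definition of a DAG level function;
    the paper may define it differently.
    """
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    level = [0] * n
    queue = deque(v for v in range(n) if indeg[v] == 0)
    seen = 0
    while queue:  # Kahn's topological traversal
        u = queue.popleft()
        seen += 1
        for v in adj[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if seen != n:
        raise ValueError("graph contains a cycle")
    return level


def min_path_cover(n, edges):
    """Vertex-disjoint minimum path cover via bipartite matching.

    Textbook reduction (augmenting paths), not the paper's method.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match_in = [-1] * n  # match_in[v] = u if edge u -> v is in the cover

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            if match_in[v] == -1 or try_augment(match_in[v], visited):
                match_in[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())

    succ = [-1] * n  # successor of each vertex on its path
    for v, u in enumerate(match_in):
        if u != -1:
            succ[u] = v
    paths = []
    for start in range(n):
        if match_in[start] == -1:  # no predecessor: a path begins here
            path = [start]
            while succ[path[-1]] != -1:
                path.append(succ[path[-1]])
            paths.append(path)
    return paths


def coordinates(n, edges):
    """Map each vertex to a hypothetical (path index, level) coordinate."""
    level = level_function(n, edges)
    paths = min_path_cover(n, edges)
    return {v: (i, level[v]) for i, path in enumerate(paths) for v in path}


# A single "bubble": two alternative routes from vertex 0 to vertex 3.
bubble = [(0, 1), (0, 2), (1, 3), (2, 3)]
coords = coordinates(4, bubble)
```

Because every edge strictly increases the level, levels grow along each path, so under these assumptions the pair (path index, level) identifies a vertex uniquely and positions on the same path are ordered as on a linear reference.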
Field | Value |
---|---|
Original language | English |
Title of host publication | Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS |
Editors | R Lorenz, A Fred, H Gamboa |
Pages | 21-29 |
Number of pages | 9 |
Volume | 3 |
Publication status | Published - 11 Feb 2021 |
Externally published | Yes |
Austrian Fields of Science 2012
- 102033 Data mining
Keywords
- Pangenome
- Coordinate System
- Directed Acyclic Graph
- Level Function
- Minimum Path Cover