Pangenome

Concept

A Gegenees pangenome is a representation of all sequence material in a group of genomes. The pangenome is built gradually by fragmenting each genome and comparing every fragment to the pangenome as it stands. If a fragment is deemed unique it is added to the pangenome; otherwise it is discarded. The result is a unit that contains the sequence material of all the genomes with low redundancy.
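
The build loop described above can be sketched in a few lines of Python. This is only an illustration, not the Gegenees implementation: the k-mer overlap test is a stand-in for the real similarity comparison, and the names `build_pangenome`, `fragment_size`, `threshold` and `k` are hypothetical.

```python
def kmers(seq, k=4):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fragments(seq, size):
    """Cut seq into consecutive, non-overlapping pieces of at most `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def build_pangenome(genomes, fragment_size=8, threshold=0.5, k=4):
    """Greedily collect fragments that are not yet represented.

    Assumption: a fragment counts as unique when the share of its k-mers
    already present in the pangenome is below `threshold`. This replaces
    the real similarity test purely for illustration.
    """
    pangenome = []  # list of (genome_name, fragment) in the order added
    seen = set()    # k-mers of every fragment added so far
    for name, seq in genomes.items():
        for frag in fragments(seq, fragment_size):
            frag_kmers = kmers(frag, k)
            if not frag_kmers:
                continue  # fragment shorter than k; ignored in this sketch
            shared = len(frag_kmers & seen) / len(frag_kmers)
            if shared < threshold:
                pangenome.append((name, frag))
                seen |= frag_kmers
    return pangenome


toy_genomes = {"A": "ACGTACGTTTTTGGGG", "B": "ACGTACGTCCCCAAAA"}
for origin, frag in build_pangenome(toy_genomes):
    print(origin, frag)
```

In this toy input the two genomes share their first eight bases, so that fragment is kept once (from genome A) and discarded when it is seen again in genome B.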

 

Creating a pangenome

In the menu bar, click "New" -> "New pangenome..." and a dialog will appear. Enter a name for your pangenome and add genomes from the database to the list of included genomes by selecting them and clicking the right arrow (">"). Then click "Finish" followed by "Start".


 

Viewing the pangenome

The tab Overview

In short, the overview visualizes how similar the fragments of the pangenome are to each genome in the analysis. You can also see which genome each fragment of the pangenome originates from by hovering over that part of the pangenome.

Overview of the pangenome

At the top of the view there is a horizontal line crossed by some vertical lines. The horizontal line represents the pangenome and the vertical lines separate the subsequences of the pangenome. Below the pangenome are more horizontal lines, one for each genome in the analysis. If a fragment of the pangenome is similar enough to a genome (i.e. its BLAST score is over a threshold called the draw parameter), Gegenees draws a dot over that genome. Several dots in a row become a peak. The color of the peak indicates the mean distance above the draw parameter for all fragments represented by that peak.
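
The way dots are grouped into peaks can be illustrated with a small Python sketch. It is not taken from Gegenees: the function name `find_peaks` and the input layout (one score per pangenome fragment, for a single genome) are assumptions, and a lone dot is treated here as a peak of length one.

```python
def find_peaks(scores, draw):
    """Group consecutive fragments scoring above `draw` into peaks.

    Returns a list of (first_index, last_index, mean_excess) tuples, where
    mean_excess is the average of (score - draw) over the fragments in the
    peak, i.e. the value that would decide the peak's color.
    """
    peaks = []
    run = []  # indices of the current run of dots
    for i, score in enumerate(scores):
        if score > draw:
            run.append(i)           # this fragment gets a dot
        elif run:
            peaks.append(_summarize(run, scores, draw))
            run = []
    if run:                          # close a run that reaches the end
        peaks.append(_summarize(run, scores, draw))
    return peaks


def _summarize(run, scores, draw):
    """Describe one run of dots as (first_index, last_index, mean_excess)."""
    excess = [scores[i] - draw for i in run]
    return (run[0], run[-1], sum(excess) / len(excess))


# Hypothetical scores of five pangenome fragments against one genome.
print(find_peaks([20, 80, 90, 10, 70], draw=50))
```

With a draw parameter of 50, fragments 1 and 2 form one peak and fragment 4 forms another; fragments 0 and 3 fall below the threshold and get no dot.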

Note that the overview does not show where in a genome a similarity to the pangenome is located.

 

The tab Fragment/Content distribution

The Fragment/Content distribution tab shows how many genomes each fragment in the pangenome covers. If some of the genomes in the pangenome are annotated, you can also search for annotation features and locate their fragments in the distribution. The search syntax is a list of case-sensitive words separated by commas. The distribution of fragments from individual genomes can be visualized in the same way.

Annotation content

This image shows the distribution of all fragments from the pangenome of 57 E. coli strains. To the left is the core genome, which is made up of fragments that exist in all 57 genomes in the analysis, and to the right are fragments that exist in only one genome, i.e. completely unique fragments. On top of this, the fragments are colored according to the search terms, in this case "phage" and "hypothetical". This means that if a fragment originates from an annotated part of a genome and that annotation includes the term "phage" or "hypothetical", it will be colored blue or green respectively.

Genome content

This image shows how the fragments of individual genomes are distributed.
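
As a rough illustration of what the distribution and the search terms mean, the sketch below counts fragments by the number of genomes they cover and matches case-sensitive, comma-separated terms against annotations. The data layout and the function names are hypothetical, and stripping whitespace around each term is an assumption rather than documented behavior.

```python
from collections import Counter


def parse_terms(query):
    """Split a comma-separated query into case-sensitive search terms."""
    return [term.strip() for term in query.split(",") if term.strip()]


def distribution(fragments):
    """Map 'number of genomes covered' to 'number of fragments'."""
    return Counter(len(frag["genomes"]) for frag in fragments)


def matching_term(annotation, terms):
    """Return the first term contained in the annotation, or None."""
    for term in terms:
        if term in annotation:  # plain substring test, case sensitive
            return term
    return None


# Hypothetical fragments: the genomes each one covers and its annotation.
toy_fragments = [
    {"genomes": {"g1", "g2", "g3"}, "annotation": "DNA polymerase"},
    {"genomes": {"g1"}, "annotation": "phage tail protein"},
    {"genomes": {"g2"}, "annotation": "hypothetical protein"},
]
terms = parse_terms("phage, hypothetical")
print(distribution(toy_fragments))
for frag in toy_fragments:
    print(frag["annotation"], "->", matching_term(frag["annotation"], terms))
```

Because matching is case sensitive, an annotation such as "Phage tail protein" would not match the term "phage" in this sketch.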

 

The tab Content table

The tab Content table shows the fragments and annotations from one column in the fragment distribution, that is, all fragments and annotations covering a certain number of genomes in the pangenome analysis. A column can be selected either by double-clicking on it in the Fragment/Content distribution tab, or by choosing it in the combo box at the top of the Content table tab. It is also possible to search for annotations in the table by using the text field at the top of the tab.

Content table

 

The tab Genome coverage

The tab Genome coverage shows how many fragments cover a certain number of genomes. The x-axis represents the number of fragments and the y-axis the fraction of genomes covered. The draw parameter is a threshold that a fragment has to exceed in order to be included in the calculation.
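
One possible reading of this calculation is sketched below in Python. It is an interpretation, not the Gegenees code: the function name `coverage_curve`, the input layout (one list of scores per fragment, with one score per genome) and the descending sort are all assumptions.

```python
def coverage_curve(scores, draw):
    """Return, for each fragment, the fraction of genomes it covers.

    scores: one list per fragment, holding that fragment's score against
            each genome. A genome counts as covered when the score
            exceeds `draw` (assumed meaning of the draw parameter).
    The fractions are sorted from highest to lowest, so position n on the
    x-axis gives the coverage of the n-th best fragment.
    """
    curve = []
    for per_genome in scores:
        covered = sum(1 for score in per_genome if score > draw)
        curve.append(covered / len(per_genome))
    return sorted(curve, reverse=True)


# Three hypothetical fragments scored against four genomes.
toy_scores = [
    [90, 80, 10, 20],
    [90, 90, 90, 90],
    [10, 10, 10, 60],
]
print(coverage_curve(toy_scores, draw=50))
```

Raising the draw parameter in this sketch can only lower the fractions, since fewer scores exceed the threshold.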
