
Chapter 5  Reconciliation Mode

In Reconciliation mode, Notung compares a gene tree with a species tree to infer gene duplications and losses. Notung will display a reconciled tree in the tree panel with the inferred duplications and losses indicated on the tree. The D/L Score of a reconciled tree will be displayed in the lower left corner of the screen (see Figure 5.1(b)).

Figure 5.1: A binary gene tree before and after reconciliation with the species tree in Figure 3.1b.
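
The kind of inference shown in Figure 5.1 can be illustrated with the standard LCA-mapping rule for binary trees: each gene tree node is mapped to the smallest species tree clade containing all of its species, and a node is a duplication when it maps to the same clade as one of its children. The following is a minimal sketch of that textbook rule, not Notung's own implementation. The nested-tuple tree encoding, the function names, and the "gene_species" label convention are assumptions made for illustration, and losses are not counted.

```python
def leaf_set(tree):
    """Set of leaf names below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """Leaf sets of every node in the species tree (root first)."""
    result = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            result.extend(clades(child))
    return result


def lca_clade(species, species_clades):
    """Smallest species tree clade that contains every species in `species`."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree, species_clades, species_of):
    """Return (clade this node maps to, duplications at or below this node)."""
    if isinstance(gene_tree, str):
        return lca_clade(frozenset([species_of(gene_tree)]), species_clades), 0
    left, right = gene_tree  # binary gene tree assumed
    left_map, left_dups = count_duplications(left, species_clades, species_of)
    right_map, right_dups = count_duplications(right, species_clades, species_of)
    here = lca_clade(left_map | right_map, species_clades)
    # A node mapping to the same clade as a child is inferred to be a duplication.
    is_duplication = here == left_map or here == right_map
    return here, left_dups + right_dups + int(is_duplication)


# Hypothetical example trees (not the trees shown in the figures).
species_tree = (("human", "mouse"), "cow")
gene_tree = ((("gA_human", "gA_mouse"), ("gB_human", "gB_mouse")), "g_cow")
postfix = lambda gene: gene.rsplit("_", 1)[1]  # species tag after the last "_"

_, duplications = count_duplications(gene_tree, clades(species_tree), postfix)
```

In this hypothetical example, the node joining the gA and gB subtrees maps to the same human/mouse clade as both of its children, so it is the only node counted as a duplication.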

Notung requires that gene and species trees have compatible labels, so that the species from which each gene originated can be identified. An error message will appear if one or more gene labels cannot be matched to a label in the species tree. See Appendix A.4 - Specifying the Species Associated with Each Gene for further information on gene labels.
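
The label check described above can be sketched as follows. This is an illustration only: the helper names and the underscore separator are assumptions, not part of Notung, and the actual matching rules are those described in Appendix A.4.

```python
def species_of(gene_label, convention="postfix", sep="_"):
    """Return the species tag embedded in a gene label, or None if absent.

    Hypothetical helper: assumes the tag is separated from the gene name
    by `sep`, e.g. "gB_human" (postfix) or "human_gB" (prefix).
    """
    if sep not in gene_label:
        return None
    if convention == "postfix":
        return gene_label.rsplit(sep, 1)[1]
    if convention == "prefix":
        return gene_label.split(sep, 1)[0]
    raise ValueError(f"unknown convention: {convention}")


def unmatched_genes(gene_labels, species_labels, convention="postfix"):
    """List the gene labels whose species tag is not a species tree label."""
    known = set(species_labels)
    return [g for g in gene_labels if species_of(g, convention) not in known]


# Hypothetical labels: "gX_cow" has no match in this species list.
missing = unmatched_genes(["gA_mouse", "gB_human", "gX_cow"], ["human", "mouse"])
```

A non-empty result corresponds to the situation in which an error message would be reported.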

All species represented in the gene tree must be present in the species tree, but the species tree may include additional species. During reconciliation, Notung automatically identifies the species in the species tree that are not present in the gene tree, and generates a pruned species tree with those species removed. The pruned species tree is stored in Notung’s internal data structures. This tree is not shown or saved unless the user does so explicitly.
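
Pruning of this kind can be sketched as a recursive filter over the species tree. The nested-tuple encoding and the function name below are assumptions made for illustration; they do not reflect Notung's internal data structures.

```python
def prune(tree, keep):
    """Return `tree` restricted to the leaves in `keep`, or None if none remain.

    A leaf is a species name (str); an internal node is a tuple of children.
    Internal nodes left with a single child are collapsed into that child.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


# Hypothetical species tree; gorilla and opossum are absent from the gene tree.
species_tree = ((("human", "gorilla"), "mouse"), ("cow", "opossum"))
pruned = prune(species_tree, {"human", "mouse", "cow"})
```

Here `pruned` is `(("human", "mouse"), "cow")`: the two absent species are removed and their parent nodes are collapsed.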

Once a gene tree has been reconciled, Notung can infer orthologous and paralogous relationships, described in Section 5.3. Notung can also determine lower and upper bounds on the time of each duplication and conditional duplication, where bounds are represented in terms of internal nodes in the species tree; i.e., relative to speciation events. The upper bound on the time of duplication is the most recent species in which the duplication was not present. The lower bound is the oldest species in which the duplication must have been present. This information, along with statistics on losses, can be viewed in a pop-up window by selecting “Duplication Bounds and Loss Counts” from the “About This Tree” menu. Duplications and bounds in this window are identified by internal node names. For losses, each node in the species tree is listed, followed by the number of losses associated with that taxon.

5.1  Reconciling Non-Binary Trees

Notung can reconcile binary gene trees with non-binary species trees, as well as non-binary gene trees with binary species trees. The differences between these functions and traditional reconciliation of binary gene trees with binary species trees are summarized briefly here. For a more detailed discussion of reconciliation with non-binary trees, see Chapter 4 - Non-Binary Trees. Note that orthologs and paralogs can only be inferred on binary gene trees reconciled with binary species trees.

Reconciling a binary gene tree with a non-binary species tree results in a binary gene tree with duplications and losses added. Notung distinguishes between cases in which disagreement can only be explained by a gene duplication (required duplications) and cases in which it is not possible to determine whether the disagreement is due to deep coalescence or gene duplication (conditional duplications). When reconciling a gene tree with a non-binary species tree, duplications appear in the tree as small red squares with red D’s, while conditional duplications are small pink squares with pink cD’s (see Figure 5.2).

Figure 5.2: A binary gene tree reconciled with the non-binary species tree in Figure 4.1. Conditional duplications are marked by pink cD’s, while required duplications are indicated with red D’s. Polytomy losses are labeled with the name of the associated polytomy, as well as the information about the species from which they are absent.

If two or more orthologous genes are missing from species that are children of the same polytomy, then it is more parsimonious to infer a loss of the common ancestor of those genes. We refer to such losses as polytomy losses. For example, in Figure 5.2, members of the hypothetical Y gene family are missing from two species, bandicoot and opossum. These species are children of the same polytomy in the species tree in Figure 4.1. Notung infers a single loss, labeled with the names of species from which the gene is absent, as well as the label of the corresponding polytomy in the species tree. By default, polytomy losses are labeled with the species that lack the gene. However, if a polytomy loss is associated with many sibling species, the default display can produce very long labels. Users can instead opt to label polytomy losses with the number of species in which the loss occurred, as well as the label and the total number of children of the polytomy, illustrated in Figure 5.2(b).

Reconciling a non-binary gene tree with a binary species tree results in a non-binary, reconciled gene tree. A reconciled, binary gene tree can be obtained by using the Resolve function (see Chapter 8 - Resolve Mode).

Reconciliation of a non-binary gene tree with a binary species tree differs from binary reconciliation in two important ways. First, a polytomy in a non-binary gene tree may be annotated with more than one duplication. For example, the reconciled non-binary gene tree in Figure 5.3(a) has a polytomy annotated with two duplications and a loss.

Figure 5.3: Reconciliation of a non-binary gene tree with the binary species tree in (b). More than one duplication may be inferred at polytomies in the gene tree. In addition, it is possible to have more than one optimal event history, as seen in the lower left-hand corner of the reconciliation panel in (a).

Recall that a gene tree polytomy is an indication that although its children evolved by successive binary divergences, the order in which the taxa diverged is unknown. Since this binary branching pattern is unknown, the relative order of duplications and losses with respect to those divergences cannot be determined, either. The polytomy in Figure 5.3(a) communicates that at least two duplications and one loss occurred in the subtree descending from the polytomy, but the exact timing of those events is unknown. See Chapter 4 - Non-Binary Trees for a detailed explanation of duplications and losses in reconciled non-binary gene trees.

Second, there may be several alternate hypotheses for the reconciliation of a non-binary gene tree. Since the true binary branching pattern of a polytomy is unknown, Notung infers duplications and losses for all binary resolutions with minimal D/L Score. If there is more than one optimal binary resolution, multiple reconciliations will result. Notung addresses this issue by presenting all alternate event histories to the user. Each event history represents a different combination of duplications and losses that could result in the same minimal D/L Score. Initially, Notung arbitrarily selects one event history to present in the tree panel. The other optimal histories may be viewed using the drop-down menu labeled “Select an optimal event history,” as shown in Figure 5.3. This menu gives a list of up to 50 optimal event histories. If there are more than 50 optimal event histories, they can be generated using the Command Line Interface (see Chapter 12 - Command Line Options and Batch Processing). For a more detailed discussion of alternate event histories, see Chapter 7 - Rearrange Mode.

5.2  Using Reconciliation Commands

To reconcile a gene tree with a species tree:

  1. Click the Reconciliation tab to enter Reconciliation mode.
  2. Click the “Reconcile/Rereconcile” button. A dialog box appears.
  3. In the dialog box, select the correct species tree in the drop-down menu.
  4. Check that Notung correctly identified the species naming convention used in the gene tree. The available settings are:
    If the convention selected by Notung is not the naming convention used in the gene tree, change it by selecting the appropriate radio button. See Appendix A.4 - Specifying the Species Associated with Each Gene for details about species tag specifications.
    NOTE: The Prefix and Postfix formats require species names to be embedded in the gene names. NHX Species Tag format embeds the species information in a Newick comment field. When this format is used, the information will not appear on the screen unless the “Display Leaf Node Species Names” option in the Display Options menu is selected (see Chapter 11.1 - Display Options).
  5. In the dialog box, click “Reconcile.”

The reconciled tree appears in the tree panel (see Figure 5.1(b)). Duplication nodes are indicated by a square and the letter “D”, shown in red. In non-binary gene trees, the number of duplications associated with a polytomy will also be shown with a red D (e.g., Figure 5.3(a)). Loss nodes appear in light gray type and state the species in which the loss occurred. A message at the bottom of the program window reminds you which species tree was used in reconciliation (e.g., “Reconciled with: <speciestreeName>”; see Figure 5.2).

To hide loss nodes/duplications:

The duplication marks or loss nodes can be hidden to avoid a cluttered image.

Options that are not currently available are displayed in gray type to indicate that they are disabled. In particular, the above options will be grayed out if no reconciliation has been performed. The “Display Conditional Duplications” option will also be displayed in gray if the gene tree was reconciled with a binary species tree.

To view alternate optimal event histories:

If the gene tree is non-binary, there may be more than one reconciliation. If more than one optimal event history exists for the reconciled tree, the drop-down menu, “Select an optimal event history,” will be enabled.

If there is only one optimal history or if the tree has not been reconciled, the drop-down menu will be grayed out. Recall that in Reconciliation mode multiple optimal histories are only possible when the gene tree is non-binary.

To undo the reconciliation:

To display a pruned species tree:

  1. Click the “Show pruned species tree” button. A dialog box appears.
  2. Enter a title in the text field and click “OK.”

This option is grayed out if the gene tree has not been reconciled.

To show time bounds and information on losses:

This option is grayed out if the gene tree has not been reconciled.

To display the number of species in polytomy losses:

By default, polytomy losses are labeled with the names of the species from which they are absent.

  1. Go to the “Display Options” menu.
  2. Click the “Use Species Names in Polytomy Losses” box.
This causes polytomy losses to be labeled with the number of children of the polytomy lost, the total number of children of the polytomy, and the name of the polytomy in which these losses occurred.

5.3  Inferring Orthologs and Paralogs

Notung can infer orthologous and paralogous relationships between genes in binary gene trees reconciled with binary species trees. Recall that two genes are orthologous if they diverged from a common ancestor via speciation. If they diverged by duplication, they are paralogous [7, 6]. Notung infers orthology by finding the least common ancestor of two genes in a gene tree. If that least common ancestor is a duplication node, then the two genes are paralogous. Otherwise, the two genes are orthologous.
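
The rule above can be sketched in a few lines. The `Node` class, the function names, and the example tree are hypothetical and serve only to illustrate the least-common-ancestor test; the duplication flags are assumed to have been set by a prior reconciliation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    is_duplication: bool = False  # assumed to be set during reconciliation


def path_to(node, leaf_name, path=()):
    """Return the tuple of nodes from `node` down to the named leaf, or None."""
    path = path + (node,)
    if not node.children:
        return path if node.name == leaf_name else None
    for child in node.children:
        found = path_to(child, leaf_name, path)
        if found is not None:
            return found
    return None


def least_common_ancestor(root, gene_a, gene_b):
    """Deepest node that lies on the root-to-leaf paths of both genes."""
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise KeyError("gene not found in tree")
    lca = None
    for x, y in zip(path_a, path_b):
        if x is not y:
            break
        lca = x
    return lca


def relation(root, gene_a, gene_b):
    """'P' if the genes diverged at a duplication node, otherwise 'O'."""
    return "P" if least_common_ancestor(root, gene_a, gene_b).is_duplication else "O"


# Hypothetical reconciled gene tree: a duplication followed by two speciations.
tree = Node("n1", is_duplication=True, children=[
    Node("n2", children=[Node("gA_human"), Node("gA_mouse")]),
    Node("n3", children=[Node("gB_human"), Node("gB_mouse")]),
])
```

In this example, gA_human and gA_mouse are orthologous because their least common ancestor, n2, is a speciation node, while gA_human and gB_mouse are paralogous because their least common ancestor, n1, is a duplication node.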

Notung can output a matrix of pairwise orthologous and paralogous relationships in several table formats. In addition, the Notung GUI includes an interactive Ortholog/Paralog feature in the Reconciliation task panel that allows the user to investigate these relationships through a point-and-click interface.

Ortholog/Paralog Tables

Orthologs and paralogs can be reported in comma-separated (CSV), tab-separated, or HTML-formatted tables. For each of these options, genes in the gene tree are listed in both column and row headers. Orthologous genes are indicated by an “O” in the table, while paralogous genes are indicated by a “P.” An example table, showing orthologs and paralogs from genetree_SMALL, is shown in Table 5.1. In HTML tables, CSS is used to color cells representing orthologs with a blue background, and cells representing paralogs with a pink background.


Homolog Table for: genetree_SMALL
P == Paralogous
O == Orthologous
. == Genes on X and Y axis are the same.

           gB_human  gA_human  gA_mouse  g_gorilla  gB_mouse  gY_cow  gX_cow
gB_human   .         P         P         P          P         O       O
gA_human   P         .         P         P          P         O       O
gA_mouse   P         P         .         O          P         O       O
g_gorilla  P         P         O         .          P         O       O
gB_mouse   P         P         P         P          .         O       O
gY_cow     O         O         O         O          O         .       P
gX_cow     O         O         O         O          O         P       .

Table 5.1: An example Ortholog/Paralog table, showing orthologs and paralogs from genetree_SMALL, reconciled with speciestree_SMALL. Orthologous genes are labeled with ’O’; paralogous genes are labeled with ’P’. Notice that this table is symmetric. Cells at the intersection of the column and row representing the same gene are labeled with ’.’.
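
A table with this layout can be produced from any pairwise relation function, as in the sketch below. The function name and the toy relation are assumptions made for illustration; the code does not reproduce Notung's exact output, which also includes the header and legend lines.

```python
import csv
import io


def homolog_table_csv(genes, relation):
    """Build a CSV matrix with genes as row and column headers.

    `relation(a, b)` is any callable returning "O" or "P" for two
    different genes; the diagonal is filled with ".".
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + list(genes))
    for row_gene in genes:
        cells = ["." if row_gene == col_gene else relation(row_gene, col_gene)
                 for col_gene in genes]
        writer.writerow([row_gene] + cells)
    return buffer.getvalue()


# Toy relation covering three genes of Table 5.1: the two cow genes are
# paralogous to each other and orthologous to gB_human.
paralog_pairs = {frozenset(("gY_cow", "gX_cow"))}


def toy_relation(a, b):
    return "P" if frozenset((a, b)) in paralog_pairs else "O"


table = homolog_table_csv(["gB_human", "gY_cow", "gX_cow"], toy_relation)
```

Because the toy relation does not depend on argument order, the resulting matrix is symmetric, as noted for Table 5.1.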

To view an Ortholog/Paralog table:

  1. Go to the “About This Tree” menu.
  2. Click the “Ortholog/Paralog Table” option with the desired format (CSV, Tab delimited, or HTML).
    NOTE: The selected table will be displayed in a pop-up dialog box. To copy the table, click “Copy to clipboard”. Tab-delimited tables can usually be pasted directly into spreadsheet applications like Excel. CSV-formatted tables can be opened by most spreadsheet programs via the File menu. HTML-format tables can be pasted directly into web pages.

Interactive Ortholog/Paralog Mode

To enter the interactive Ortholog/Paralog mode, click on the “Orthologs/Paralogs” button in the Reconciliation task panel. A legend will appear in the tree panel. Mousing over or clicking on a gene will highlight it in light blue. Orthologs of this gene are highlighted in darker blue, and paralogs are highlighted in pink. The legend can be minimized by clicking on “hide”, in the legend. Click on the minimized legend to show the full legend again. The legend can be dismissed entirely by clicking “close”. The next time you enter Ortholog/Paralog mode, the legend will be visible again.

NOTE: If you use “File → Save Current View as Image (PNG)”, the image will contain the Ortholog/Paralog legend and, if a gene is currently selected, the highlighted orthologs and paralogs of that gene. Currently, “File → Save Whole Tree as Image (PNG)” will not show orthologs and paralogs.

