PharaohFUN: Phylogenomic Analysis for Plant Protein History and Function Elucidation

PharaohFUN allows researchers to explore orthologous proteins across evolutionarily
distant species given a target protein. The tool focuses on photosynthetic eukaryotes: brown, red and green microalgae,
as well as land plants. A tutorial can be found here. It offers flexibility in terms of the available search modalities,
allowing the use of up to 5 different ones:
Gene ID: Gene ID from one of the listed organisms
Sequence: Protein sequence from one of the listed organisms
Orthogroup ID: Orthogroup ID (from STRING result)
Batch mode: Set of sequences from one of the listed organisms
New organism: Protein sequence from any organism
- Gene ID-based search. When the identifier of a gene in one of the species available in the database is known, this modality allows a direct search for that gene. Only one gene per query is accepted, and it must belong to a species supported by the tool.
- Sequence-based search. If an identifier is not known or differs from the nomenclature used by the tool, this modality maps an amino acid sequence onto the proteome of the species, identifying the protein and eliminating possible ambiguities. Only one sequence per query is accepted, and it must belong to a species supported by the tool.
- Orthogroup ID-based search. When the identifier of an orthogroup is known (typically from STRING results of previous searches), it is possible to explore it directly without searching for individual genes. Only one orthogroup per query is accepted.
- Batch mode search. It allows searching for multiple (up to 300) IDs or sequences automatically, creating a folder with the sorted results. All genes must belong to a single species supported by the tool.
- New organism search. Unlike the previous cases, if you want to study a protein that does not come from a species supported by the tool, this search method allows the inclusion of the sequence in the gene tree and in subsequent analyses. Only one sequence per query is accepted, from any organism.
The exploration of orthology is based on the construction of orthogroups, sets of genes that descend from a single gene in the common ancestor of the species under study. In this way, it is possible to trace the evolutionary history of these genes and to analyze the changes that the orthogroup has undergone from its appearance to its current situation in the extant species.
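To make the orthogroup concept concrete, the sketch below looks up the orthogroup of a query gene in an OrthoFinder-style table (tab-separated, one row per orthogroup, one column per species, genes in a cell separated by commas). The species names and gene identifiers are invented, and this layout is only an assumption for illustration; it is not necessarily how PharaohFUN stores its data.

```python
import csv
import io

# Hypothetical OrthoFinder-style table; all names are invented for illustration.
ORTHOGROUPS_TSV = (
    "Orthogroup\tspecies_A\tspecies_B\tspecies_C\n"
    "OG0000001\tA_g1, A_g2\tB_g1\t\n"
    "OG0000002\tA_g3\tB_g2, B_g3\tC_g1\n"
)


def load_orthogroups(text):
    """Return {orthogroup_id: {species: [gene, ...]}} from a TSV string."""
    table = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        og = row.pop("Orthogroup")
        table[og] = {
            species: [g.strip() for g in cell.split(",") if g.strip()]
            for species, cell in row.items()
        }
    return table


def find_orthogroup(table, gene_id):
    """Return the ID of the orthogroup containing gene_id, or None."""
    for og, members in table.items():
        if any(gene_id in genes for genes in members.values()):
            return og
    return None


orthogroups = load_orthogroups(ORTHOGROUPS_TSV)
print(find_orthogroup(orthogroups, "B_g3"))
```

An empty cell, as for species_C in the first row, simply means that the species has no gene in that orthogroup.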
Regarding the species supported by the tool, two different models are offered. The default model focuses on 36 species of the Viridiplantae clade, with representatives of the Chlorophyta and Streptophyta clades chosen to span the evolutionary diversity of these groups.
Furthermore, an additional Global model includes species from the whole Archaeplastida clade, as well as examples of Stramenopiles and Cryptophytes. Note that although both models contain the green lineage, their results may differ because the common ancestors of the two species sets are very distant in time. The Viridiplantae model is recommended for a more accurate analysis, while the Global model should yield more information for a more general one.
After the construction of the tree, PharaohFUN allows the exploration of the proteins encoded by the orthogroup genes, implementing modules for interactive tree viewing, PFAM domain determination, multiple sequence alignment, Gene Ontology term annotation, KEGG pathway annotation, exploration of physical interactions between proteins and scientific literature annotation.

Gene ID-based search
Single gene, available organism
Organism selection
Please select the desired organisms from the following list for performing the analysis.
Make sure to select the organism to which the input gene ID belongs. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Tree building method
Insert gene ID
Viridiplantae
The first step in any study using PharaohFUN is the construction of the
gene tree corresponding to the selected orthogroup. Four standard methods
are offered for this; select one before continuing. The default is FastTree, run within the OrthoFinder 2
pipeline, which builds trees by approximate maximum likelihood and then reconciles
gene trees with the species tree shown in the Home tab. For the other three methods,
no reconciliation is performed, but support for each branch is shown as bootstrap
values.
Below you can write the ID associated with the protein whose evolutionary history you
wish to analyze. Then, select a model: Global or Viridiplantae (default
is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this is the selected model.
Global
Results
Results for the query gene and species are displayed below, arranged in
different tabs. Each analysis is launched from the start
button within its tab, where specific instructions are given. First, go to the Gene Tree tab,
which shows a gene tree containing the genes that belong to the same orthogroup as the query in the
selected species. The rest of the tabs allow for a deeper exploration of these genes. Do not start any
analysis before creating the gene tree.
This tab shows three different results: first, a complete list of the genes
that are assigned to the same orthogroup as the query; second, the proportion of
genes from each species; and finally, a gene tree showing the evolutionary relationships
between those genes. All results can be downloaded using the buttons at the bottom
of the page.
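The per-species proportion reported in this tab can be understood with a short sketch. The member list below is hypothetical, and the code only illustrates the calculation rather than reproducing the tool's own implementation.

```python
from collections import Counter

# Hypothetical orthogroup members as (species, gene ID) pairs.
members = [
    ("species_A", "A_g1"),
    ("species_A", "A_g2"),
    ("species_B", "B_g1"),
    ("species_C", "C_g1"),
]


def species_proportions(members):
    """Return {species: fraction of the orthogroup's genes from that species}."""
    counts = Counter(species for species, _ in members)
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


proportions = species_proportions(members)
for species, fraction in sorted(proportions.items()):
    print(f"{species}\t{fraction:.2f}")
```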
Please be patient, building tree ...
This tab allows an interactive visualization of the previous tree.
When gene duplications have given rise to several clades and you want
to reduce the tree by clade rather than by species, this visualization allows
you to collapse subtrees and easily explore the areas of interest.
Press Show Collapsable Tree to show the tree.
Click the Show Evolutionary History button to calculate the species tree showing
the reconstruction of ancestral states of the orthogroup, i.e. the number of genes
calculated for each common ancestor of each species. In case there has been no significant
variation in the size of the orthogroup or the variability is so high between species that
the sizes cannot be reliably calculated, the tree will not be plotted. For orthogroups that
have undergone significant expansions or contractions along any branch, these are marked
in red and blue on the tree, respectively. In turn, the loss or non-existence of orthogroup
genes in the different clades is represented in gray.
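The colour coding described above can be illustrated with a deliberately simplified sketch. PharaohFUN marks only statistically significant changes; the code below instead compares reconstructed gene counts between each node and its parent using a fixed cutoff (`min_change`), which is an assumption made purely for illustration. The tree and the counts are hypothetical.

```python
# Hypothetical species tree given as child -> parent, with reconstructed
# orthogroup sizes for every node (extant species and ancestors).
parent_of = {"anc1": "root", "sp_A": "anc1", "sp_B": "anc1", "sp_C": "root"}
gene_count = {"root": 2, "anc1": 2, "sp_A": 6, "sp_B": 2, "sp_C": 0}


def classify_branches(parent_of, gene_count, min_change=2):
    """Label each branch by comparing the child's count with its parent's.

    min_change is an illustrative cutoff, not a statistical test.
    """
    labels = {}
    for child, parent in parent_of.items():
        delta = gene_count[child] - gene_count[parent]
        if gene_count[child] == 0:
            labels[child] = "absent (gray)"
        elif delta >= min_change:
            labels[child] = "expansion (red)"
        elif delta <= -min_change:
            labels[child] = "contraction (blue)"
        else:
            labels[child] = "no marked change"
    return labels


labels = classify_branches(parent_of, gene_count)
for node, label in labels.items():
    print(f"{node}\t{label}")
```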
Please be patient, reconstructing expansion/contraction events ...
Next, click on the Show Gene Selection for Pfam button, select the desired proteins
from the tree and click on the Show Pfam Domains button to determine their PFAM domains.
A table with the domains of each protein and their positions will be displayed, as well
as a plot showing the same information. Links to each domain's data are provided.
Warning: this process may take a long time in the case of selecting many proteins.
Please be patient, identifying domains ...
Press the Show Gene Selection for MSA button to enable the selection of genes
from the tree to be aligned. Two alignment methods are supported. The first, ClustalOmega,
performs a de novo multiple sequence alignment of the selected proteins. The
second filters the precalculated MSA of the entire orthogroup (computed using
MAFFT) to retain
only the selected sequences, removing columns that contain only gaps in the chosen subset. Thus,
the second option retains the evolutionary framework of the entire orthogroup,
while the first aligns only the chosen sequences. The resulting alignment can be explored interactively,
including searching for patterns using Selection -> Find Motifs, and different color schemes can be applied
based on different properties of amino acids. A consensus sequence is shown to summarize the alignment. For
a graphical representation with the amino acids colored according to their chemical nature, click
on the Download Colored MSA button. The aligned sequences can also be downloaded as a standard FASTA file.
Warning: this process (especially the colored download) may take a long time when many proteins are selected.
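The second alignment option, filtering a precalculated MSA, can be sketched as follows: keep the selected sequences and drop every column that contains only gaps within that subset. The sequences and identifiers are invented, and the function names are hypothetical; this is a minimal illustration, not the tool's own code.

```python
def subset_alignment(alignment, selected_ids, gap="-"):
    """Keep the selected sequences and drop columns that are all gaps in them."""
    subset = {sid: alignment[sid] for sid in selected_ids}
    length = len(next(iter(subset.values())))
    keep = [
        i for i in range(length)
        if any(seq[i] != gap for seq in subset.values())
    ]
    return {sid: "".join(seq[i] for i in keep) for sid, seq in subset.items()}


def to_fasta(alignment):
    """Serialize an alignment as FASTA text."""
    return "".join(f">{sid}\n{seq}\n" for sid, seq in alignment.items())


# Hypothetical precalculated alignment of a whole orthogroup.
full_msa = {
    "gene1": "MK-AL-T",
    "gene2": "MK-A--T",
    "gene3": "MKRALQT",
}

filtered = subset_alignment(full_msa, ["gene1", "gene2"])
print(to_fasta(filtered))
```

Columns 3 and 6 are gaps in both selected sequences, so they are removed, while the gap in column 5 of gene2 is kept because gene1 has a residue there. The remaining columns still reflect the alignment of the whole orthogroup.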
Please be patient, aligning sequences ...
Click on the button to select the genes of interest from the tree and select
the ontology type to display the GO terms associated with those genes. After
pressing the Show GO terms button, the results are displayed in tabular form
and are accompanied by a GO association plot (each node is a GO term and an edge
is drawn between two nodes if a gene from the set shares both terms) and a treeplot
showing the hierarchy of the identified terms and their summary. Links to each
individual GO term are available from the table, and all results can be downloaded as
standard PNG and TSV files.
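The rule behind the GO association plot (an edge between two terms whenever at least one selected gene carries both) can be sketched as below. The gene names are hypothetical and the GO identifiers are placeholders; weighting edges by the number of shared genes is an added assumption.

```python
from itertools import combinations

# Hypothetical gene -> GO term annotations; identifiers are placeholders.
gene_to_go = {
    "gene1": {"GO:0000001", "GO:0000002"},
    "gene2": {"GO:0000002", "GO:0000003"},
    "gene3": {"GO:0000001", "GO:0000002"},
}


def go_association_edges(gene_to_go):
    """Return {(term_a, term_b): number of genes annotated with both terms}."""
    edges = {}
    for terms in gene_to_go.values():
        for pair in combinations(sorted(terms), 2):
            edges[pair] = edges.get(pair, 0) + 1
    return edges


edges = go_association_edges(gene_to_go)
for (term_a, term_b), weight in sorted(edges.items()):
    print(f"{term_a}\t{term_b}\t{weight}")
```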
Please be patient, preparing results ...
Click on the button to select the genes of interest from the tree and press
the Show KEGG information button to show the results. These include a table
showing the KEGG Orthology IDs, indicating how many and which of the selected
proteins correspond to each ID. In addition, the application performs an enrichment
in KEGG pathways from the identified KOs and allows for the plotting of these
pathways with the genes mapped onto them, using a selector to choose the pathway
in case several enriched ones exist. Warning: this process may take a long time
when many proteins are selected. Attempts to plot large pathways will
produce an error; only more specific ones are supported, e.g. Carbon metabolism
will return an error while Biotin metabolism will not.
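The KEGG Orthology table described above (how many and which of the selected proteins correspond to each ID) amounts to grouping proteins by their KO. A minimal sketch, using hypothetical protein names and placeholder KO identifiers:

```python
from collections import defaultdict

# Hypothetical protein -> KO assignments; None means no KO was assigned.
protein_to_ko = {
    "prot1": "K00001",
    "prot2": "K00001",
    "prot3": "K00002",
    "prot4": None,
}


def ko_table(protein_to_ko):
    """Return rows of (KO ID, number of proteins, sorted protein list)."""
    grouped = defaultdict(list)
    for protein, ko in protein_to_ko.items():
        if ko is not None:
            grouped[ko].append(protein)
    return [
        (ko, len(proteins), sorted(proteins))
        for ko, proteins in sorted(grouped.items())
    ]


rows = ko_table(protein_to_ko)
for ko, count, proteins in rows:
    print(f"{ko}\t{count}\t{', '.join(proteins)}")
```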
Please be patient, exploring pathways ...
This tab presents the results of interactions of the different proteins
for the species supported by the STRING database. After clicking the button,
the proteins of the species included in the database are displayed. Once the
desired selection has been made, press the Show String Interactions
button and a table will be displayed
indicating in each row an interaction and the type of evidence for its existence,
either experimental (direct) or ortholog-based (interolog). In addition, a count
of the orthogroups to which the target proteins belong is computed, presented in
tabular form and by means of a pie chart, to assess preferential interaction of
the orthogroup under study with other specific orthogroups. Finally, for
specific proteins, an
interaction network can be generated with the proteins that have the most reliable
interactions with the selected protein. If more than one protein is selected,
all of them will appear in the network, together with their interactions, to
help determine pathways between them. The network is shown as an image, and a button
is provided that opens the network on the STRING page, where an interactive
network with additional protein structure information and SMART domains appears.
Species currently supported by STRING: Aegilops, Arabidopsis, Bathycoccus,
Chara, Chlamydomonas, Coccomyxa, Cyanidioschyzon, Galdieria, Gracilariopsis,
Guillardia, Klebsormidium, Micromonas, Oryza, Ostreococcus, Phaeodactylum,
Physcomitrium, Raphidocelis, Scenedesmus, Selaginella, Solanum, Sorghum,
Triticum and Volvox. Warning: the selection of many proteins in the first
step can lead to particularly high waiting times.
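The orthogroup count of interaction partners can be illustrated with the sketch below. The interaction rows and the protein-to-orthogroup mapping are hypothetical, and counting one entry per interaction row (rather than per unique partner) is an assumption.

```python
from collections import Counter

# Hypothetical interaction rows: (query protein, interactor, evidence type).
interactions = [
    ("q1", "p1", "direct"),
    ("q1", "p2", "interolog"),
    ("q2", "p3", "direct"),
]

# Hypothetical mapping from interactor to its orthogroup.
protein_to_orthogroup = {"p1": "OG0000010", "p2": "OG0000010", "p3": "OG0000042"}


def interactor_orthogroup_counts(interactions, protein_to_orthogroup):
    """Count how often each orthogroup appears among the interaction partners."""
    return Counter(
        protein_to_orthogroup[target]
        for _, target, _ in interactions
        if target in protein_to_orthogroup
    )


counts = interactor_orthogroup_counts(interactions, protein_to_orthogroup)
for orthogroup, n in counts.most_common():
    print(f"{orthogroup}\t{n}")
```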
Please be patient, retrieving interactions ...
In the text box, type the term you want to search for in the literature,
e.g., CCA1. Then select one of the 4 search modes: Normal returns the entities
containing the term, Exact returns those that are identical, Alias returns all
aliases associated with the term and Substring returns all those containing a
given string. Associations found in the literature are returned in tabular form,
with links to the papers from which the information was extracted. This
functionality is based on PlantConnectome; to extend this analysis, it is recommended
to go to https://connectome.plant.tools/.
Please be patient, browsing literature ...
Sequence-based search
Single gene, available organism
Organism selection
Please select the desired organisms from the following list for performing the analysis.
Make sure to select the organism to which the input sequence belongs. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Tree building method
Insert a protein chain
Viridiplantae
The first step in any study using PharaohFUN is the construction of the
gene tree corresponding to the selected orthogroup. Four standard methods
are offered for this; select one before continuing. The default is FastTree, run within the OrthoFinder 2
pipeline, which builds trees by approximate maximum likelihood and then reconciles
gene trees with the species tree shown in the Home tab. For the other three methods,
no reconciliation is performed, but support for each branch is shown as bootstrap
values.
Below you can paste the sequence of the protein whose evolutionary history you
wish to analyze. Then use the selection bar to choose the organism to which the pasted sequence
belongs. After that, select a model: Global or Viridiplantae (default
is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this is the selected model.
Global
Results
Results for the query gene and species are displayed below, arranged in
different tabs. Each analysis is launched from the start
button within its tab, where specific instructions are given. First, go to the Gene Tree tab,
which shows the best matches for the query sequence and a tree containing the genes that belong to the same
orthogroup as the query in the
selected species. The rest of the tabs allow for a deeper exploration of these genes. Do not start any
analysis before creating the gene tree.
This tab shows four different results. First, a table with up to 5 best
matches for the query sequence in the chosen proteome, in order of decreasing confidence. The first row
corresponds to the best match, which is used to perform subsequent analyses. If you are interested
in another one, please copy its ID and paste it in the Gene ID-based search tab. Second, a complete list of the genes
that are assigned to the same orthogroup as the query. Next, the proportion of genes from each species.
Finally, a gene tree showing the evolutionary relationships between those genes. All results can be downloaded
using the buttons at the bottom of the page.
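The selection of the best matches can be pictured with the following sketch. The search program and the confidence metric used by PharaohFUN are not stated here, so the code assumes hypothetical hits with a score for which higher is better.

```python
# Hypothetical hits: (gene ID, alignment score). Higher is assumed to be better.
hits = [
    ("gene_c", 210.0),
    ("gene_a", 540.5),
    ("gene_f", 35.2),
    ("gene_b", 498.0),
    ("gene_e", 88.1),
    ("gene_d", 150.7),
]


def best_matches(hits, limit=5):
    """Return up to `limit` hits, ordered from most to least confident."""
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:limit]


top = best_matches(hits)
query_gene = top[0][0]  # the first row drives the downstream analyses
print(query_gene)
for gene_id, score in top:
    print(f"{gene_id}\t{score}")
```

Any other row of the table can be followed up by passing its gene ID to the Gene ID-based search.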
Please be patient, building tree ...
This tab allows an interactive visualization of the previous tree.
When gene duplications have given rise to several clades and you want
to reduce the tree by clade rather than by species, this visualization allows
you to collapse subtrees and easily explore the areas of interest.
Press Show Collapsable Tree to show the tree.
Click the Show Evolutionary History button to calculate the species tree showing
the reconstruction of ancestral states of the orthogroup, i.e. the number of genes
calculated for each common ancestor of each species. In case there has been no significant
variation in the size of the orthogroup or the variability is so high between species that
the sizes cannot be reliably calculated, the tree will not be plotted. For orthogroups that
have undergone significant expansions or contractions along any branch, these are marked
in red and blue on the tree, respectively. In turn, the loss or non-existence of orthogroup
genes in the different clades is represented in gray.
Please be patient, reconstructing expansion/contraction events ...
Next, click on the Show Gene Selection for Pfam button, select the desired proteins
from the tree and click on the Show Pfam Domains button to determine their PFAM domains.
A table with the domains of each protein and their positions will be displayed, as well
as a plot showing the same information. Links to each domain's data are provided.
Warning: this process may take a long time in the case of selecting many proteins.
Please be patient, identifying domains ...
Press the Show Gene Selection for MSA button to enable the selection of genes
from the tree to be aligned. Two alignment methods are supported. The first, ClustalOmega,
performs a de novo multiple sequence alignment of the selected proteins. The
second filters the precalculated MSA of the entire orthogroup (computed using
MAFFT) to retain
only the selected sequences, removing columns that contain only gaps in the chosen subset. Thus,
the second option retains the evolutionary framework of the entire orthogroup,
while the first aligns only the chosen sequences. The resulting alignment can be explored interactively,
including searching for patterns using Selection -> Find Motifs, and different color schemes can be applied
based on different properties of amino acids. A consensus sequence is shown to summarize the alignment. For
a graphical representation with the amino acids colored according to their chemical nature, click
on the Download Colored MSA button. The aligned sequences can also be downloaded as a standard FASTA file.
Warning: this process (especially the colored download) may take a long time when many proteins are selected.
Please be patient, aligning sequences ...
Click on the button to select the genes of interest from the tree and select
the ontology type to display the GO terms associated with those genes. After
pressing the Show GO terms button, the results are displayed in tabular form
and are accompanied by a GO association plot (each node is a GO term and an edge
is drawn between two nodes if a gene from the set shares both terms) and a treeplot
showing the hierarchy of the identified terms and their summary. Links to each
individual GO term are available from the table, and all results can be downloaded as
standard PNG and TSV files.
Please be patient, preparing results ...
Click on the button to select the genes of interest from the tree and press
the Show KEGG information button to show the results. These include a table
showing the KEGG Orthology IDs, indicating how many and which of the selected
proteins correspond to each ID. In addition, the application performs an enrichment
in KEGG pathways from the identified KOs and allows for the plotting of these
pathways with the genes mapped onto them, using a selector to choose the pathway
in case several enriched ones exist. Warning: this process may take a long time
when many proteins are selected. Attempts to plot large pathways will
produce an error; only more specific ones are supported, e.g. Carbon metabolism
will return an error while Biotin metabolism will not.
Please be patient, exploring pathways ...
This tab presents the results of interactions of the different proteins
for the species supported by the STRING database. After clicking the button,
the proteins of the species included in the database are displayed. Once the
desired selection has been made, press the Show String Interactions
button and a table will be displayed
indicating in each row an interaction and the type of evidence for its existence,
either experimental (direct) or ortholog-based (interolog). In addition, a count
of the orthogroups to which the target proteins belong is computed, presented in
tabular form and by means of a pie chart, to assess preferential interaction of
the orthogroup under study with other specific orthogroups. Finally, for
specific proteins, an
interaction network can be generated with the proteins that have the most reliable
interactions with the selected protein. If more than one protein is selected,
all of them will appear in the network, together with their interactions, to
help determine pathways between them. The network is shown as an image, and a button
is provided that opens the network on the STRING page, where an interactive
network with additional protein structure information and SMART domains appears.
Species currently supported by STRING: Aegilops, Arabidopsis, Bathycoccus,
Chara, Chlamydomonas, Coccomyxa, Cyanidioschyzon, Galdieria, Gracilariopsis,
Guillardia, Klebsormidium, Micromonas, Oryza, Ostreococcus, Phaeodactylum,
Physcomitrium, Raphidocelis, Scenedesmus, Selaginella, Solanum, Sorghum,
Triticum and Volvox. Warning: the selection of many proteins in the first
step can lead to particularly high waiting times.
Please be patient, retrieving interactions ...
In the text box, type the term you want to search for in the literature,
e.g., CCA1. Then select one of the 4 search modes: Normal returns the entities
containing the term, Exact returns those that are identical, Alias returns all
aliases associated with the term and Substring returns all those containing a
given string. Associations found in the literature are returned in tabular form,
with links to the papers from which the information was extracted. This
functionality is based on PlantConnectome; to extend this analysis, it is recommended
to go to https://connectome.plant.tools/.
Please be patient, browsing literature ...
Orthogroup ID-based search
Single orthogroup, available organism
Organism selection
Please select the desired organisms from the following list for performing the analysis.
Please take care of selecting organisms that are present in the selected orthogroup. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Tree building method
Insert Orthogroup ID
Viridiplantae
The first step in any study using PharaohFUN is the construction of the
gene tree corresponding to the selected orthogroup. Four standard methods
are offered for this; select one before continuing. The default is FastTree, run within the OrthoFinder 2
pipeline, which builds trees by approximate maximum likelihood and then reconciles
gene trees with the species tree shown in the Home tab. For the other three methods,
no reconciliation is performed, but support for each branch is shown as bootstrap
values.
Below you can write the ID corresponding to the desired orthogroup. This mode is especially intended
for studying STRING results from previous queries. Then, select a model: Global
or Viridiplantae (default is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this is the selected model.
Global
Results
Results for the query orthogroup and species are displayed below, arranged in
different tabs. Each analysis is launched from the start
button within its tab, where specific instructions are given. First, go to the Gene Tree tab,
which shows a tree containing the genes that belong to the orthogroup in the selected species.
The rest of the tabs allow for a deeper exploration of these genes. Do not start any analysis before
creating the gene tree.
This tab shows three different results: first, a complete list of the genes
that are assigned to the query orthogroup; second, the proportion of
genes from each species; and finally, a gene tree showing the evolutionary relationships
between those genes. All results can be downloaded using the buttons at the bottom
of the page.
Please be patient, building tree ...
This tab allows an interactive visualization of the previous tree.
When gene duplications have given rise to several clades and you want
to reduce the tree by clade rather than by species, this visualization allows
you to collapse subtrees and easily explore the areas of interest.
Press Show Collapsable Tree to show the tree.
Click the Show Evolutionary History button to calculate the species tree showing
the reconstruction of ancestral states of the orthogroup, i.e. the number of genes
calculated for each common ancestor of each species. In case there has been no significant
variation in the size of the orthogroup or the variability is so high between species that
the sizes cannot be reliably calculated, the tree will not be plotted. For orthogroups that
have undergone significant expansions or contractions along any branch, these are marked
in red and blue on the tree, respectively. In turn, the loss or non-existence of orthogroup
genes in the different clades is represented in gray.
Please be patient, reconstructing expansion/contraction events ...
Next, click on the Show Gene Selection for Pfam button, select the desired proteins
from the tree and click on the Show Pfam Domains button to determine their PFAM domains.
A table with the domains of each protein and their positions will be displayed, as well
as a plot showing the same information. Links to each domain's data are provided.
Warning: this process may take a long time in the case of selecting many proteins.
Please be patient, identifying domains ...
Press the Show Gene Selection for MSA button to enable the selection of genes
from the tree to be aligned. Two alignment methods are supported. The first, ClustalOmega,
performs a de novo multiple sequence alignment of the selected proteins. The
second filters the precalculated MSA of the entire orthogroup (computed using
MAFFT) to retain
only the selected sequences, removing columns that contain only gaps in the chosen subset. Thus,
the second option retains the evolutionary framework of the entire orthogroup,
while the first aligns only the chosen sequences. The resulting alignment can be explored interactively,
including searching for patterns using Selection -> Find Motifs, and different color schemes can be applied
based on different properties of amino acids. A consensus sequence is shown to summarize the alignment. For
a graphical representation with the amino acids colored according to their chemical nature, click
on the Download Colored MSA button. The aligned sequences can also be downloaded as a standard FASTA file.
Warning: this process (especially the colored download) may take a long time when many proteins are selected.
Please be patient, aligning sequences ...
Click on the button to select the genes of interest from the tree and select
the ontology type to display the GO terms associated with those genes. After
pressing the Show GO terms button, the results are displayed in tabular form
and are accompanied by a GO association plot (each node is a GO term and an edge
is drawn between two nodes if a gene from the set shares both terms) and a treeplot
showing the hierarchy of the identified terms and their summary. Links to each
individual GO term are available from the table, and all results can be downloaded as
standard PNG and TSV files.
Please be patient, preparing results ...
Click on the button to select the genes of interest from the tree and press
the Show KEGG information button to show the results. These include a table
showing the KEGG Orthology IDs, indicating how many and which of the selected
proteins correspond to each ID. In addition, the application performs an enrichment
in KEGG pathways from the identified KOs and allows for the plotting of these
pathways with the genes mapped onto them, using a selector to choose the pathway
in case several enriched ones exist. Warning: this process may take a long time
when many proteins are selected. Attempts to plot large pathways will
produce an error; only more specific ones are supported, e.g. Carbon metabolism
will return an error while Biotin metabolism will not.
Please be patient, exploring pathways ...
This tab presents the results of interactions of the different proteins
for the species supported by the STRING database. After clicking the button,
the proteins of the species included in the database are displayed. Once the
desired selection has been made, press the Show String Interactions
button and a table will be displayed
indicating in each row an interaction and the type of evidence for its existence,
either experimental (direct) or ortholog-based (interolog). In addition, a count
of the orthogroups to which the target proteins belong is computed, presented in
tabular form and by means of a pie chart, to assess preferential interaction of
the orthogroup under study with other specific orthogroups. Finally, for
specific proteins, an
interaction network can be generated with the proteins that have the most reliable
interactions with the selected protein. If more than one protein is selected,
all of them will appear in the network, together with their interactions, to
help determine pathways between them. The network is shown as an image, and a button
is provided that opens the network on the STRING page, where an interactive
network with additional protein structure information and SMART domains appears.
Species currently supported by STRING: Aegilops, Arabidopsis, Bathycoccus,
Chara, Chlamydomonas, Coccomyxa, Cyanidioschyzon, Galdieria, Gracilariopsis,
Guillardia, Klebsormidium, Micromonas, Oryza, Ostreococcus, Phaeodactylum,
Physcomitrium, Raphidocelis, Scenedesmus, Selaginella, Solanum, Sorghum,
Triticum and Volvox. Warning: the selection of many proteins in the first
step can lead to particularly high waiting times.
Please be patient, retrieving interactions ...
In the text box, type the term you want to search for in the literature,
e.g., CCA1. Then select one of the 4 search modes: Normal returns the entities
containing the term, Exact returns those that are identical, Alias returns all
aliases associated with the term and Substring returns all those containing a
given string. Associations found in the literature are returned in tabular form,
with links to the papers from which the information was extracted. This
functionality is based on PlantConnectome; to extend this analysis, it is recommended
to go to https://connectome.plant.tools/.
Please be patient, browsing literature ...
Batch mode search
Set of genes, available organism
Organism selection
Please select the desired organisms from the following list for performing the analysis.
Please take care of selecting the organism whose sequences or IDs are being used as input. The
selected organisms will be the only ones that appear in heatmap and gene tables. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Choose Sequence File to Upload
Viridiplantae
Sequences
Below you can upload a file with a list of gene IDs (with a single ID per line) or a FASTA file
containing custom identifiers for each sequence. These IDs or sequences must correspond to one
of the available organisms, and the correct species must be selected prior to analysis on the
selection bar to the right of the Run button. Then, select a model: Global
or Viridiplantae (default is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this is the selected model. Finally, choose whether the uploaded file
contains IDs or Sequences and click Run to begin computations.
Global
IDs
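Before uploading, a batch file can be checked locally against the constraints described above: one gene ID per line or a FASTA file, with at most 300 entries. The sketch below is a hypothetical pre-flight check; the function names are invented and the tool's own validation may differ.

```python
MAX_QUERIES = 300  # batch mode limit stated in the tool description


def read_ids(text):
    """Read one gene ID per line, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_fasta(text):
    """Return {identifier: sequence} from FASTA-formatted text."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("empty FASTA header")
            current = fields[0]
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[current] += line
    return records


def check_batch(entries):
    """Raise ValueError if the batch is empty or exceeds the limit."""
    if not entries:
        raise ValueError("no entries found")
    if len(entries) > MAX_QUERIES:
        raise ValueError(f"{len(entries)} entries exceed the limit of {MAX_QUERIES}")
    return len(entries)


example = ">seq1 custom description\nMKA\nLT\n>seq2\nMM\n"
records = read_fasta(example)
print(check_batch(records), "sequences ready for upload")
```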
Results
Results for the query sequences or IDs and species are displayed below. Using the download button, user can
access a compressed file with results arranged in different subfolders. First, a table in TSV format indicates
the mapping of each sequence against the proteome and its associated metrics (in case sequence sarch mode is
selected). Each of the subfolders corresponds to a query sequence/ID and contains the gene tree of its
orthogroup (without filtering by species), the MSA of the complete orthogroup, the expansion/contraction
results, the KO and GO annotations for the genes from the selected species and a table with the genes
forming the orthogroup in the selected species. Additionally, a visual summary of the number of genes
within each orthogroup relative to each of the sequences/IDs is presented in the form of a heatmap,
which makes it possible to determine whether some orthogroups have undergone expansions in any of the
selected species or whether some orthogroups are not present in all species.
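The count matrix behind such a heatmap can be sketched as follows. This is an illustration under assumptions, not the tool's code: the input structure (each query mapped to the `(species, gene)` pairs of its orthogroup) and both function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def orthogroup_counts(
    members: Dict[str, List[Tuple[str, str]]], species: List[str]
) -> Dict[str, List[int]]:
    """One row per query; one column per selected species, in the given order."""
    matrix = {}
    for query, genes in members.items():
        counts = Counter(sp for sp, _gene in genes)
        matrix[query] = [counts.get(sp, 0) for sp in species]
    return matrix


def missing_species(
    matrix: Dict[str, List[int]], species: List[str]
) -> Dict[str, List[str]]:
    """For each query, the selected species with no gene in its orthogroup."""
    return {
        query: [sp for sp, n in zip(species, row) if n == 0]
        for query, row in matrix.items()
    }
```

Rows with large values in a single species point to a possible expansion there, while zeros mark species lacking the orthogroup.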
New organism sequence search
Single gene, custom organism
Organism selection
Please select the desired organisms from the following list for performing the analysis.
It is recommended to select organisms that span the evolutionary placement of the custom
organism to allow for better resolution of the created tree. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Insert a protein chain
Below you can paste the sequence of the protein whose evolutionary history you
wish to analyze. After pasting it, select a model: Global or Viridiplantae (default
is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this model is selected.
Results
Results for the query sequence and species are displayed below, arranged in
different tabs. The execution of each analysis is initiated from the start
button within each of the tabs, with specific instructions for each analysis. First, go to the Gene Tree tab,
which shows a tree containing the genes corresponding to the same orthogroup as the query in the
selected species. The rest of the tabs allow for a deeper exploration of these genes. Do not start any
analysis before creating the gene tree.
This tab shows three different results. First of all, a complete list of the genes
that are assigned to the same orthogroup as the query. Secondly, the proportion of
genes of each species. Finally, a gene tree shows the evolutionary relationships
between those genes. Query sequence placement is highlighted in red. All results
can be downloaded using the buttons at the bottom of the page.
This tab allows an interactive visualization of the previous tree.
In particular, in situations where gene duplications have given rise
to several clades and you want to reduce the tree not in relation to
the species that appear, but to these clades, this visualization allows
the collapse of subtrees and the simple exploration of the areas of interest.
Press Show Collapsable Tree to show the tree.
Click the Show Evolutionary History button to calculate the species tree showing
the reconstruction of ancestral states of the orthogroup, i.e. the number of genes
calculated for each common ancestor of each species. In case there has been no significant
variation in the size of the orthogroup or the variability is so high between species that
the sizes cannot be reliably calculated, the tree will not be plotted. For orthogroups that
have undergone significant expansions or contractions along any branch, these are marked
in red and blue on the tree, respectively. In turn, the loss or non-existence of orthogroup
genes in the different clades is represented in gray.
Next, click on the Show Gene Selection for Pfam button, select the desired proteins
from the tree and click on the Show Pfam Domains button to determine their Pfam domains.
A table with the domains of each protein and their positions will be displayed, as well
as a plot showing the same information. Links to each domain's data are provided.
Warning: this process may take a long time in the case of selecting many proteins.
Press the Show Gene Selection for MSA button to enable the selection of genes
from the tree to be aligned. Two alignment methods are supported: ClustalOmega uses this
algorithm to perform a de novo multiple sequence alignment of the selected proteins, while
the second option filters the precalculated MSA of the entire orthogroup (computed using the
software MAFFT) to retain
only the selected sequences, removing columns containing only gaps in the chosen subset. Thus,
this second option retains the evolutionary framework to which the entire orthogroup is ascribed,
while the first aligns only the chosen sequences. The resulting alignment can be explored interactively,
including searching for patterns using Selection -> Find Motifs, and different color schemes can be applied
based on different properties of amino acids. A consensus sequence is shown to summarize the alignment. For
a graphical representation with the different amino acids colored according to their chemical nature, click
on the Download Colored MSA button. The aligned sequences can also be downloaded as a standard FASTA file.
Warning: This process (especially the colored download) may take a long time in the case of selecting many proteins.
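The second alignment option (filtering the precalculated MSA) can be illustrated with a minimal sketch. It assumes the alignment is held as a mapping from sequence name to aligned string with `-` as the gap character; the function name `subset_alignment` is hypothetical and this is not the application's actual implementation.

```python
from typing import Dict, Iterable


def subset_alignment(
    msa: Dict[str, str], selected: Iterable[str], gap: str = "-"
) -> Dict[str, str]:
    """Keep only the selected rows, then drop columns that are all gaps."""
    rows = {name: msa[name] for name in selected}
    if not rows:
        return {}
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    length = lengths.pop()
    # A column survives if at least one selected sequence has a residue there.
    keep = [i for i in range(length) if any(seq[i] != gap for seq in rows.values())]
    return {name: "".join(seq[i] for i in keep) for name, seq in rows.items()}
```

Because the remaining columns are untouched, residues stay aligned exactly as they were in the full orthogroup alignment, which is what distinguishes this option from a de novo alignment of the subset.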
Click on the button to select the genes of interest from the tree and select
the ontology type to display the GO terms associated with those genes. After
pressing the Show GO terms button, the results are displayed in tabular form
and are accompanied by a GO association plot (each node is a GO term and an edge
is drawn between two nodes if a gene from the set shares both terms) and a treeplot
showing the hierarchy of the identified terms and their summary. Links to each
individual GO term are available from the table and all results can be downloaded in
standard PNG and TSV files.
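The edge rule of the GO association plot can be sketched as follows, assuming a mapping from each selected gene to its set of GO term IDs. The function name and the example data are hypothetical, and the sketch covers only the edge rule, not the plotting.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def go_cooccurrence_edges(gene_to_terms: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Return one undirected edge per pair of GO terms shared by some gene."""
    edges = set()
    for terms in gene_to_terms.values():
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(terms), 2):
            edges.add((a, b))
    return edges
```

Two terms annotated only on different genes are never connected, so clusters in the plot reflect terms that co-occur on the same proteins.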
Click on the button to select the genes of interest from the tree and press
the Show KEGG information button to show the results. These include a table
showing the KEGG Orthology IDs, indicating how many and which of the selected
proteins correspond to each ID. In addition, the application performs an enrichment
in KEGG pathways from the identified KOs and allows for the plotting of these
pathways with the genes mapped onto them, using a selector to choose the pathway
in case several enriched ones exist. Warning: this process may take a long time
in the case of selecting many proteins. Attempts to plot large pathways will
produce an error; only more specific ones are supported, e.g. Carbon metabolism
will return an error while Biotin metabolism will not.
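The KO table described above groups the selected proteins by KEGG Orthology ID. A minimal sketch of that grouping, assuming a mapping from protein to its KO IDs (the function name and the sort order are assumptions, not the tool's behavior):

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def ko_table(protein_to_kos: Dict[str, List[str]]) -> List[Tuple[str, int, List[str]]]:
    """Rows of (KO ID, number of proteins, protein names), largest groups first."""
    grouped = defaultdict(list)
    for protein, kos in protein_to_kos.items():
        for ko in kos:
            grouped[ko].append(protein)
    rows = [(ko, len(proteins), sorted(proteins)) for ko, proteins in grouped.items()]
    # Sort by descending count, then by KO ID for a stable order.
    return sorted(rows, key=lambda row: (-row[1], row[0]))
```

Proteins without any KO annotation simply do not appear in the table.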
This tab presents the results of interactions of the different proteins
for the species supported by the STRING database. After clicking the button,
the proteins of the species included in the database are displayed. Once the
desired selection has been made, press the Show String Interactions
button and a table will be displayed
indicating in each row an interaction and the type of evidence for its existence,
either experimental (direct) or ortholog-based (interolog). In addition, a count
of the orthogroups to which the target proteins belong is computed, presented in
tabular form and by means of a pie chart, to assess preferential interaction of
the orthogroup under study with other specific orthogroups. Finally, for
specific proteins, an
interaction network can be generated with the proteins that have the most reliable
interactions with the selected protein. In case more than one protein is selected,
all of them will appear in the network, which will contain their interactions to
determine pathways between them. The network is shown as an image, and a button
is provided that creates the network on the STRING page, where an interactive
network with additional protein structure information and SMART domains appears.
Species currently supported by STRING: Aegilops, Arabidopsis, Bathycoccus,
Chara, Chlamydomonas, Coccomyxa, Cyanidioschyzon, Galdieria, Gracilariopsis,
Guillardia, Klebsormidium, Micromonas, Oryza, Ostreococcus, Phaeodactylum,
Physcomitrium, Raphidocelis, Scenedesmus, Selaginella, Solanum, Sorghum,
Triticum and Volvox. Warning: the selection of many proteins in the first
step can lead to particularly high waiting times.
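The orthogroup count used for the table and pie chart can be sketched as below. The inputs (a list of interacting target proteins and a protein-to-orthogroup mapping), the function name and the "unassigned" label are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def interactor_orthogroup_shares(
    interactors: List[str], protein_to_orthogroup: Dict[str, str]
) -> List[Tuple[str, int, float]]:
    """Rows of (orthogroup, count, fraction of all interactors), most common first."""
    counts = Counter(
        protein_to_orthogroup.get(protein, "unassigned") for protein in interactors
    )
    total = sum(counts.values())
    return [(og, n, n / total) for og, n in counts.most_common()]
```

A single orthogroup with a large fraction suggests a preferential interaction between it and the orthogroup under study.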
In the text box, type the search term to query your bibliographic information,
e.g., CCA1. Then select one of the 4 search modes: Normal returns the entities
containing the term, Exact returns those that are identical, Alias returns all
aliases associated with the term and Substring returns all those containing a
given string. Associations found in the literature are returned in tabular form,
with links to the papers from which the information was extracted. This
functionality is based on PlantConnectome; to extend this analysis, visit
https://connectome.plant.tools/.
Whole Datasets
Complete data, available organisms
Organism selection
Please select the desired organisms from the following list for filtering the complete dataset. The
selected organisms will be the only ones that appear in the final table. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Choose Model Data to Download
Below you must select the model data to download: Global
or Viridiplantae (default is Viridiplantae). Note that the groups TSAR
and Cryptophytes, Rhodophytes and Glaucophytes do not belong to Viridiplantae, so they
will be ignored if this model is selected. Then, click Run to create the table.
Results
Use the button to download the filtered dataset.
Download genomes
Complete genome, available organism
Organism selection
Please select the desired organisms from the following list for filtering the complete dataset. The
selected organisms will be the only ones that appear in the final table. Organisms in
green belong to Viridiplantae, while other colors indicate groups outside this clade.
Contact and Info
Acknowledgments and other information
Authors: Marcos Ramos González, Víctor
Ramos González, Emma Serrano Pérez, Christina Arvanitidou, Jorge Hernández García, Mercedes García González and Francisco José Romero Campero.
We are strongly committed to open access software and open science. PharaohFUN's source code is available
at GitHub following the lateral panel link and is released under a GNU General Public License v3.0. If you
experience any problem using PharaohFUN, please create an issue in GitHub and we will address it. For other
inquiries, send an email to mramos5@us.es.
All organism images were acquired from PhyloPic. The authors of each image are listed below:
- Phaeodactylum by Jonathan Wells.
- Porphyra by Guillaume Dera.
- Cyanophora by Guillaume Dera.
- Ostreococcus by Guillaume Dera.
- Scenedesmus by Sergio A. Muñoz-Gómez.
- Klebsormidium by Matt Crook.
- Marchantia by Guillaume Dera.
- Selaginella by Mason McNair.
- Arabidopsis by Mason McNair.
- Chlamydomonas by Sergio A. Muñoz-Gómez.
- Haematococcus by Matthew Crook.
- Volvox by Matthew Crook.
- Zygnema by Matthew Crook.
- Marchantia by T. Michael Keesey.
- Araucaria by T. Michael Keesey.
- Gnetum by T. Michael Keesey.
Some of these images are licensed under an Attribution-NonCommercial-ShareAlike 3.0 Unported
or Attribution-ShareAlike 3.0 Unported license.