Natural products are a major source of novel drugs, and with the rise of antibiotic resistance, there is an urgent need to discover new compounds. Genome mining enables the rapid identification of biosynthetic gene clusters (BGCs) responsible for natural product biosynthesis. Predicting the structures of the synthesized compounds is key to guiding their targeted discovery, but this requires detailed knowledge of the enzymatic reactions at each step of the biosynthetic pathway. While the core scaffold of many natural products can often be predicted from the initial biosynthetic steps, later tailoring modifications remain difficult to model due to the limited characterization of the enzymes involved. Reliable reaction prediction requires first gathering functional annotations and performing evolutionary analysis of these enzymes—only then can accurate computational predictions be made. In the absence of automated tools, this process becomes time-consuming. Although phylogenetic trees with functional annotations are occasionally published, reusing them directly is labor-intensive and technically challenging.
PhyloNaP addresses this gap by providing a centralized collection of annotated phylogenetic trees for enzymes involved in natural product biosynthesis.
New Feature: Personal Tree Visualization
Users can now explore and analyze their own annotated phylogenetic trees directly in the PhyloNaP interface. This new functionality allows researchers to:
When a MIBIG or MITE column with valid IDs is present, the corresponding molecular structures can also be displayed.
Access this feature through the View page to start analyzing your phylogenetic data with PhyloNaP's advanced visualization tools.
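Before uploading an annotation table, it can help to check that the MIBIG or MITE column holds well-formed IDs. The sketch below is only an illustration: it assumes MIBiG accessions look like `BGC0000001` and MITE accessions like `MITE0000001`, and the function name and row layout are hypothetical, not part of PhyloNaP.

```python
import re

# Assumed accession formats (not taken from the PhyloNaP documentation).
ID_PATTERNS = {
    "MIBIG": re.compile(r"BGC\d{7}"),
    "MITE": re.compile(r"MITE\d{7}"),
}

def invalid_ids(rows, column):
    """Return the non-empty values in `column` that do not match the assumed format."""
    pattern = ID_PATTERNS[column.upper()]
    bad = []
    for row in rows:
        value = (row.get(column) or "").strip()
        if value and not pattern.fullmatch(value):
            bad.append(value)
    return bad

# Hypothetical annotation rows, one per tree leaf.
rows = [
    {"leaf": "protA", "MIBIG": "BGC0000001"},
    {"leaf": "protB", "MIBIG": "bgc-1"},
    {"leaf": "protC", "MIBIG": ""},
]
print(invalid_ids(rows, "MIBIG"))
```

Empty cells are skipped rather than reported, on the assumption that leaves without a known cluster or enzyme entry are allowed.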
PhyloNaP's database contains comprehensive phylogenetic datasets for protein families involved in natural product biosynthesis. Each dataset includes:
High-quality, manually reviewed datasets from:
Computationally generated phylogenetic trees covering:
Protein sequences and annotations gathered from multiple high-quality databases:
Collected data includes taxonomic information, BGC types, functional annotations, and structural data (SMILES) when available.
Sequences are clustered using mmseqs easy-linclust with default parameters (E-value: 1.000E-03) to group related proteins.
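For readers who want to reproduce the clustering step locally, the following sketch builds the corresponding command line. The file names are placeholders, and passing `-e` explicitly is only there to make the stated E-value visible; this is not the project's actual script.

```python
import shlex

def linclust_command(fasta, out_prefix, tmp_dir, e_value=1e-3):
    """Build the argument list for `mmseqs easy-linclust` (placeholder paths)."""
    return [
        "mmseqs", "easy-linclust",
        str(fasta), str(out_prefix), str(tmp_dir),
        "-e", f"{e_value:.3E}",  # formats 1e-3 as 1.000E-03
    ]

cmd = linclust_command("proteins.fasta", "clusters", "tmp")
print(shlex.join(cmd))

# To actually run it (requires MMseqs2 on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

MMseqs2 writes its results next to the chosen prefix, including a `clusters_cluster.tsv` table that maps each representative sequence to its cluster members.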
Clusters are filtered to remove:
Alignment and tree construction: MAFFT (auto), TrimAl (auto), FastTree, MAD-root.
All sequences are classified into superfamilies using HMMER with superfamilies_1.75 profiles for consistent functional annotation.
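The MAFFT, TrimAl, FastTree, and MAD-root steps named above can be chained roughly as in the sketch below. Several details are assumptions: that "auto" corresponds to `mafft --auto` and `trimal -automated1`, that the MAD rooting script is installed as `mad`, and all file names. Treat it as an outline of the workflow rather than the pipeline PhyloNaP actually runs.

```python
import subprocess
from pathlib import Path

def tree_building_commands(cluster_fasta):
    """Return (argv, stdout_file) pairs for one cluster; stdout_file may be None."""
    base = Path(cluster_fasta).with_suffix("")
    aln = f"{base}.aln"
    trimmed = f"{base}.trim.aln"
    tree = f"{base}.nwk"
    return [
        # MAFFT prints the alignment to stdout.
        (["mafft", "--auto", str(cluster_fasta)], aln),
        # Assumed mapping of "auto" to trimAl's -automated1 heuristic.
        (["trimal", "-in", aln, "-out", trimmed, "-automated1"], None),
        # FastTree prints the Newick tree to stdout.
        (["FastTree", trimmed], tree),
        # Assumed executable name for the MAD rooting tool.
        (["mad", tree], None),
    ]

def run_steps(steps):
    """Run each step in order, redirecting stdout to a file where requested."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as handle:
                subprocess.run(argv, check=True, stdout=handle)

steps = tree_building_commands("cluster_0001.fasta")
for argv, stdout_path in steps:
    print(" ".join(argv), f"> {stdout_path}" if stdout_path else "")

# run_steps(steps)  # requires MAFFT, trimAl, FastTree and MAD to be installed
```

Keeping the commands as data makes it easy to log or inspect them before anything is executed.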
Each step includes quality control measures to ensure reliable phylogenetic reconstructions and meaningful functional predictions.
Users can efficiently explore the database using multiple filtering and sorting options:
PhyloNaP provides powerful interactive tools for exploring phylogenetic relationships and functional annotations. Each tree page combines evolutionary context with biochemical information to facilitate enzyme function prediction.
When you click on a node and select "Get the summary of the clade", a metadata summary is displayed. This summary shows all features associated with the selected node, and the descendant branches of that node will be highlighted in color.
By default, the features are sorted from those with the most identical values to those with the greatest diversity. However, users can adjust the sorting order manually using the arrow buttons.
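To illustrate the default ordering, here is a small sketch of how a clade summary could be computed. The exact measure PhyloNaP uses is not stated here, so the sketch assumes one plausible choice: the share of leaves carrying the most common value of each feature. The function name and metadata fields are hypothetical.

```python
from collections import Counter

def summarize_clade(leaf_metadata):
    """Summarize features for the leaves under a node, most uniform feature first."""
    features = sorted({key for row in leaf_metadata for key in row})
    summary = []
    for feature in features:
        # Ignore leaves with no value for this feature.
        values = [row[feature] for row in leaf_metadata
                  if row.get(feature) not in (None, "")]
        if not values:
            continue
        counts = Counter(values)
        top_value, top_count = counts.most_common(1)[0]
        summary.append({
            "feature": feature,
            "top_value": top_value,
            "agreement": top_count / len(values),  # 1.0 means all leaves agree
            "distinct": len(counts),
        })
    # Highest agreement first; ties broken by fewer distinct values, then name.
    summary.sort(key=lambda s: (-s["agreement"], s["distinct"], s["feature"]))
    return summary

# Hypothetical metadata for three leaves in a selected clade.
clade = [
    {"superfamily": "P450", "bgc_type": "NRPS", "genus": "Streptomyces"},
    {"superfamily": "P450", "bgc_type": "NRPS", "genus": "Amycolatopsis"},
    {"superfamily": "P450", "bgc_type": "PKS", "genus": "Salinispora"},
]
for item in summarize_clade(clade):
    print(item["feature"], item["top_value"], round(item["agreement"], 2))
```

In this example the superfamily is shared by every leaf and is listed first, while the genus differs for every leaf and is listed last.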
PhyloNaP enables users to classify their protein sequences by placing them onto curated phylogenetic trees using a robust, multi-step computational pipeline.
Each placement has a pendant length: the length of the branch connecting the query to the placement node or leaf.
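As an illustration of where the pendant length comes from, the sketch below reads it from a placement result. It assumes the results are available in the jplace format commonly written by phylogenetic placement tools; the format PhyloNaP uses internally is not stated here, and the example data are invented.

```python
import json

def best_pendant_lengths(jplace_text):
    """Map each query name to the pendant length of its most likely placement."""
    data = json.loads(jplace_text)
    fields = data["fields"]
    pendant_col = fields.index("pendant_length")
    lwr_col = fields.index("like_weight_ratio")
    result = {}
    for placement in data["placements"]:
        # Each row in "p" is one candidate position on the reference tree.
        best = max(placement["p"], key=lambda row: row[lwr_col])
        # "n" lists query names; "nm" lists [name, multiplicity] pairs.
        names = placement.get("n") or [pair[0] for pair in placement.get("nm", [])]
        for name in names:
            result[name] = best[pendant_col]
    return result

# Invented example with one query and two candidate placements.
example = json.dumps({
    "tree": "((A:0.1{0},B:0.2{1}):0.05{2},C:0.3{3}){4};",
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {"p": [[1, -1234.5, 0.8, 0.05, 0.02],
               [3, -1236.0, 0.2, 0.10, 0.40]],
         "n": ["query1"]},
    ],
    "version": 3,
    "metadata": {},
})
print(best_pendant_lengths(example))
```

A short pendant length means the query sits close to the reference tree at that position, while a long one means the query is distant from everything in the tree.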
Mail: For any questions or feedback, please contact us at aleksandra.korenskaia@uni-tuebingen.de
Ziemert Lab, University of Tübingen, Germany
License:
PhyloNaP is released under the MIT License. The software is free to use for academic and commercial purposes.
Data Privacy:
Cookies:
This site uses minimal functional cookies to maintain session state. No tracking or analytics cookies are used.