Mining genomic islands for novel secondary metabolite gene clusters

Natural products (NPs), also called secondary metabolites, are small compounds produced by living organisms that possess pharmacological or other biological activities. NPs are a major source of human medicines; they serve as biological probes that provide insight into cellular functions and inspire synthetic and analytical chemists, and NP biosynthetic enzymes find applications in green chemistry. Microorganisms are major NP producers, and recent advances in microbial genomics have revealed that many more NPs remain to be discovered. Indeed, microbial genome sequences have shown that a single strain known to produce only a few NPs can have the capacity to synthesize 20-30 different molecules. This has prompted new efforts to discover the ∼90% of NPs that remain “cryptic”, i.e. not produced under standard laboratory culture conditions. Most of these approaches rely on sequencing genomes and mining them with homology-based methods for readily identifiable NP gene clusters. Although these methods have proven effective for the discovery of new metabolites, published data and our preliminary results indicate that the synthesis of some NPs is directed by clusters that homology-based genome mining fails to detect.

We propose to develop a method to identify NP biosynthetic gene clusters that are not detected by classical mining approaches. Our method relies on the detection of genomic islands (GIs) by comparing the genomes of closely related species. Indeed, species- or strain-specific GIs have been observed to be enriched in genes of secondary metabolism. Once GIs have been detected, the next step is to establish a link between them and the NPs revealed by an OSMAC (one strain, many compounds) approach combined with metabolomics (LC-MS analyses) or detection of biological activity. This can be achieved by deleting GIs and determining the impact of these deletions on NP production. Newly discovered biosynthetic gene clusters can then be functionally characterized and the corresponding biosynthetic pathways elucidated.
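The GI detection step can be illustrated with a minimal sketch. It is not the project's actual pipeline: it assumes that ortholog calls against the related genomes have already been computed by some other means, and it simply reports runs of consecutive genes that lack an ortholog. The function name, the gene identifiers and the min_genes threshold are all hypothetical.

```python
from typing import List, Set, Tuple


def find_candidate_islands(
    gene_order: List[str],
    conserved: Set[str],
    min_genes: int = 5,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of candidate islands.

    gene_order: gene identifiers in chromosomal order (assumed input).
    conserved:  genes with an ortholog in the related genome(s), assumed
                to be precomputed elsewhere.
    min_genes:  hypothetical minimum run length to report.
    """
    islands: List[Tuple[int, int]] = []
    run_start = None  # index where the current strain-specific run began
    for i, gene in enumerate(gene_order):
        if gene not in conserved:
            if run_start is None:
                run_start = i
        else:
            # A conserved gene ends the current run, if there is one.
            if run_start is not None and i - run_start >= min_genes:
                islands.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the end of the gene list.
    if run_start is not None and len(gene_order) - run_start >= min_genes:
        islands.append((run_start, len(gene_order) - 1))
    return islands


# Toy example with made-up gene names g0..g11.
genes = [f"g{i}" for i in range(12)]
shared = {"g0", "g1", "g7", "g8", "g9"}
print(find_candidate_islands(genes, shared, min_genes=3))
```

In practice, a real comparison would also need to tolerate a few conserved genes inside an island and to handle draft assemblies split into contigs; this sketch ignores both.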

As a proof of concept, we plan to apply our method to three Streptomyces ambofaciens strains, ATCC23877, DSM40697 and M1013, for which complete (ATCC23877) or draft (DSM40697 and M1013) genome sequences are available in the laboratories involved in this project. Initially, the project will focus on the GIs of S. ambofaciens ATCC23877, for which preliminary data are available. First, partners 1 and 2, who have been studying the secondary metabolism of this strain for several years, have identified the gene clusters detectable by homology-based analyses and have already characterized four antibacterial compounds (spiramycin, congocidine, kanchanamycin and stambomycin) produced by the strain. A strain producing none of these molecules is available. Second, partners 1 and 2 have identified several compounds (antibacterial activities and HPLC peaks) produced by this strain that could not be linked to any of the secondary metabolite gene clusters detected by traditional in silico sequence similarity searches. Third, a comparison of the S. ambofaciens ATCC23877 genome with that of the closely related species Streptomyces coelicolor allowed the identification of some of the GIs of this strain. A link between a metabolite and one of these GIs has been established, demonstrating that a gene cluster not detected by existing mining techniques directs the synthesis of a secondary metabolite. These preliminary results give the project a strong chance of success.

Overall, this project should validate the genomic island mining strategy for the discovery of NPs and NP gene clusters, a strategy that can be transferred to the exploration of the genomes of other bacteria. It will lead to the characterization of new NPs and gene clusters, and possibly of new families of genes and enzymes.

Project ID: ANR-13-BSV6-0009