Surfacing the bacterial chemistry we haven't characterized yet
ALCHEMY scans bacterial genomes for biosynthetic gene clusters — the factories that make natural products — and ranks the ones with no match in MIBiG, the curated reference of experimentally studied clusters. The output is a shortlist of candidates worth investigating, not finished discoveries.
Both candidates below came back with no hit against MIBiG (dual ClusterBlast + ClusterCompare). Important caveat: MIBiG holds only 3,059 hand-curated, experimentally studied clusters, a tiny slice of the millions in nature, so "no MIBiG hit" means under-characterized, not proven novel. They haven't yet been checked against the full predicted universe (antiSMASH-DB, BiG-FAM). These are leads to investigate.
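A minimal sketch of how a dual check like this could be expressed. It assumes "dual" means a region counts as zero-hit only when both ClusterBlast and ClusterCompare return nothing; the `RegionHits` container and the function names are hypothetical, not antiSMASH types or ALCHEMY's actual code.

```python
from dataclasses import dataclass


@dataclass
class RegionHits:
    """MIBiG hits for one predicted region (hypothetical container)."""
    region_id: str
    clusterblast_hits: list[str]     # MIBiG entries reported by ClusterBlast
    clustercompare_hits: list[str]   # MIBiG entries reported by ClusterCompare


def is_zero_hit(region: RegionHits) -> bool:
    # Assumption: both comparisons must come back empty.
    return not region.clusterblast_hits and not region.clustercompare_hits


def zero_hit_regions(regions: list[RegionHits]) -> list[str]:
    """Return the ids of regions with no MIBiG hit in either comparison."""
    return [r.region_id for r in regions if is_zero_hit(r)]


example_regions = [
    RegionHits("GCF_042055075.2 region 3", [], []),
    RegionHits("GCF_055394375.1 region 4", [], []),
    # Made-up region with a placeholder hit, included only to show filtering.
    RegionHits("made-up region", ["PLACEHOLDER_MIBIG_ENTRY"], []),
]
zero_hits = zero_hit_regions(example_regions)
```

Requiring both comparisons to be empty is the stricter reading: a region that matches MIBiG in either tool is dropped from the shortlist.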
NO MIBiG HIT
"phycolactam"
GCF_042055075.2 (NCBI: Roseobacter phycocola) · region 3
antiSMASH classes this as a hybrid β-lactone + NRPS cluster (3 modules + a glycosyltransferase) with no MIBiG match — the top-ranked candidate to investigate. The code-name is a provisional label, not a structural claim.
Cluster size: 52,664 bp
Coding genes: 43 CDS
antiSMASH class: β-lactone / NRPS
vs MIBiG: No match
NO MIBiG HIT
"silazactam"
GCF_055394375.1 (NCBI: Marinobacter alkaliphilus) · region 4
A compact β-lactone cluster (propionyl-CoA synthetase + leuA signature) with no MIBiG match. Provisional code-name only — nothing here has been expressed, isolated or structurally confirmed.
Cluster size: 24,249 bp
Coding genes: 22 CDS
antiSMASH class: β-lactone
vs MIBiG: No match
// The method
A discovery engine, fully automated
Five scripts take a marine genome from a public database to a ranked, novelty-scored shortlist of candidate new chemistry, with no manual lab work and no local installs.
01 🧬 Pick genomes: Pull marine bacterial assemblies from NCBI by ecology + novelty potential.
02 🔬 antiSMASH: Submit to the antiSMASH web service to detect every biosynthetic gene cluster.
03 📊 MIBiG diff: Dual ClusterBlast + ClusterCompare against the global known-cluster catalogue.
04 🏆 Rank: Product-class-weighted scoring surfaces the strongest zero-hit candidates.
05 📄 Assemble: Auto-build a preprint-ready manuscript with gene tables and figures.
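The ranking step (04) can be illustrated with a short sketch. The weights below are hypothetical and were chosen only so the example runs; the actual weights and scoring formula are not given here. The `Candidate` class and the `score` and `rank` helpers are likewise illustrative names.

```python
from dataclasses import dataclass

# Hypothetical per-class weights; not the values ALCHEMY uses.
CLASS_WEIGHTS = {"NRPS": 3.0, "betalactone": 2.0}
DEFAULT_WEIGHT = 1.0  # fallback for classes without an explicit weight


@dataclass
class Candidate:
    label: str
    product_classes: tuple[str, ...]  # classes assigned by antiSMASH
    has_mibig_hit: bool


def score(candidate: Candidate) -> float:
    """Sum the class weights; clusters with a MIBiG hit score zero."""
    if candidate.has_mibig_hit:
        return 0.0
    return sum(CLASS_WEIGHTS.get(c, DEFAULT_WEIGHT)
               for c in candidate.product_classes)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Keep zero-hit candidates only, highest score first."""
    zero_hit = [c for c in candidates if not c.has_mibig_hit]
    return sorted(zero_hit, key=score, reverse=True)


candidates = [
    Candidate("GCF_055394375.1 region 4", ("betalactone",), False),
    Candidate("GCF_042055075.2 region 3", ("betalactone", "NRPS"), False),
    # Made-up entry with a MIBiG hit, included only to show filtering.
    Candidate("made-up region", ("NRPS",), True),
]
ranked_labels = [c.label for c in rank(candidates)]
```

With these illustrative weights a hybrid cluster outranks a single-class one simply because its class weights add up. Whether an additive score is the right choice is a design decision this sketch does not settle.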
// The scan
Genomes put through the pipeline
A pilot batch of five marine bacterial genomes was submitted to antiSMASH. The three completed runs alone surfaced six zero-hit candidate clusters.
Organism · Assembly · Status
Roseobacter phycocola · GCF_042055075.2 · headline find
Marinobacter alkaliphilus · GCF_055394375.1 · novel BGC
Pseudoalteromonas sp. TO-2024 · GCA_055398285.1 · complete
Salinispora sp. CH2A1_3 · GCF_056820995.1 · queued
Salinispora sp. CH2A1_6 · GCF_056820975.1 · queued
// What this is — and isn't
A candidate, not a discovery
Being honest about the science is the whole point. Here's exactly what a "no MIBiG hit" does and doesn't mean.
📉
Absence ≠ novelty
MIBiG curates only ~3,000 experimentally studied clusters. Nature holds millions. "No MIBiG hit" is the normal outcome for most clusters: it flags a cluster as under-characterized, not new.
🔬
Predicted, not observed
These are antiSMASH predictions from genome data. Nothing has been expressed, isolated or had its structure determined. You can't legitimately name a molecule that's never been seen.
🧭
A triage engine
The honest framing: ALCHEMY ranks under-characterized clusters for follow-up. The next step is a comparison against antiSMASH-DB / BiG-FAM, followed eventually by wet-lab work.
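To put the "Absence ≠ novelty" point in numbers, here is a small worked calculation. The MIBiG count of 3,059 is the figure quoted earlier on this page; the total of one million clusters is an assumption standing in for "millions", used here as a conservative floor, so the real coverage would be lower still.

```python
MIBIG_CLUSTERS = 3_059                # figure quoted on this page
ASSUMED_NATURAL_CLUSTERS = 1_000_000  # assumption: a floor for "millions"


def coverage_fraction(known: int, total: int) -> float:
    """Fraction of all clusters that the reference set covers."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


fraction = coverage_fraction(MIBIG_CLUSTERS, ASSUMED_NATURAL_CLUSTERS)
print(f"MIBiG coverage under this assumption: {fraction:.2%}")  # 0.31%
```

Under that assumption the reference covers well under 1% of clusters, which is why a miss against MIBiG is the expected result rather than a signal of novelty.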