Standardized and continuously updated global phylogeny and information resource for Treponema pallidum, built to track transmission trends and inform vaccine and diagnostic design. Powered by Nextstrain.

Overview

Syphilis is a resurgent global public health threat, and the lack of a common genomic resource has impeded progress toward vaccine development. We provide a custom Nextstrain build with curated data to enable robust global genomic epidemiology, resistance surveillance, and comparative analysis across the three pathogenic subspecies: pallidum, endemicum, and pertenue.

2,588 isolates integrated.
49 countries represented.
Standardized: reproducible data processing.
Continuous: ongoing dataset updates.

Features

  • Global comparative phylogenomics across subspecies.
  • Standardized lineage and cluster definitions, with curated metadata for consistent contextualization.
  • Macrolide resistance mutation tracking across lineages and geography.
  • Outer membrane protein variability display and sequence access to inform vaccine design.

Partners

This project is developed in close collaboration between the partner institutions listed below. We welcome collaboration in data contribution, comparative analysis, methodological refinement, and translational applications. By fostering an open and standardized framework, we aim to strengthen global coordination in syphilis genomic epidemiology.

  • Gates Foundation
  • University of Washington
  • University of Tübingen
  • Wellcome Sanger Institute

Disclaimer

The data is provided for research and surveillance purposes only. While all analyses are conducted using standardized and reproducible workflows, we do not guarantee completeness, accuracy, or real-time representativeness of the underlying genomic or metadata sources.

The resource is not intended for clinical decision-making or diagnostic use. Interpretations derived from the dataset remain the responsibility of the user.

We gratefully acknowledge the authors and the originating and submitting laboratories of the genetic sequences and metadata for sharing their work. Data producers should be named where possible, and cooperation should be sought in certain circumstances. Please avoid scooping the work of others, and reach out if uncertain. Genomic data originate from publicly available repositories and contributing partners; all primary data rights remain with the original data generators.

Resources

All data processing, curation, and analysis steps, along with the Nextstrain customizations, are implemented as version-controlled workflows in open-source repositories to ensure transparency and reproducibility.

Detailed metadata for all isolates is available in the Nextstrain dataset. The metadata and variant curation are overseen by Nicole Lieberman, UW Medicine Laboratory Medicine and Pathology.

Dataset version: 0.1 - Last updated: February 2026