Improving WGS-based FBO investigations with pgSNP, a bacterial DNA pangenomic workflow
Authors:
De Sousa Violante M, Michel V, Feurer C, Radomski N, Mistou MY, Mallet L
Salmonella is one of the most common bacterial pathogens in human and animal infections worldwide, causing 87,923 cases of human gastroenteritis in Europe in 2019 [1]. To provide new insights into Salmonella epidemic investigations, whole genome sequencing (WGS) methods have been developed, focusing especially on outbreak detection and estimation of genetic relationships between isolates [2,3]. However, delineating isolates within homogeneous samples can be difficult, especially in sanitary investigations, where attributing the source and identifying the origin of contamination are essential. This problem calls for more discriminating comparative genomics methods, with higher resolution than the methods commonly used (MLST, SNP calling) [4].
To integrate the whole genomic variation when discriminating isolates, we developed “pgSNP”, an innovative workflow based on the pangenome that includes variants from both the core and accessory genomes. We built a pangenome reference from the samples to represent all existing sequences, and pangenomic variants were detected with Snippy [5]. The variants from all samples were concatenated into several alignments, and a tree was inferred from each alignment. Finally, a supertree was inferred from all trees.
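The concatenation step can be illustrated with a minimal sketch. This is not the pgSNP implementation: the function name `build_variant_alignment`, the input layout (a dictionary of per-sample variants keyed by 0-based position on the pangenome reference), and the use of "-" for sites absent from a sample's accessory genome are all assumptions made for illustration. In the actual workflow, variant calling is done by Snippy.

```python
from typing import Dict


def build_variant_alignment(
    reference: str,
    sample_variants: Dict[str, Dict[int, str]],
) -> Dict[str, str]:
    """Concatenate variable sites into one aligned sequence per sample.

    reference       -- hypothetical pangenome reference sequence
    sample_variants -- per sample, {0-based position: observed base};
                       "-" marks a site absent from that sample (assumption)
    """
    # Keep every position where at least one sample differs from the reference.
    sites = sorted({pos for variants in sample_variants.values() for pos in variants})

    alignment = {}
    for sample, variants in sample_variants.items():
        # Samples without a variant at a site carry the reference base there.
        alignment[sample] = "".join(variants.get(pos, reference[pos]) for pos in sites)
    return alignment


# Toy data, invented for this example only.
reference = "ACGTACGTAC"
sample_variants = {
    "isolate_A": {1: "T", 7: "-"},
    "isolate_B": {1: "T"},
    "isolate_C": {4: "G"},
}
alignment = build_variant_alignment(reference, sample_variants)
for name, seq in alignment.items():
    print(f">{name}\n{seq}")
```

Each resulting sequence has one column per variable site, so the output can be written as a FASTA alignment and passed to a tree inference tool. Keeping only variable sites keeps the alignment small, at the cost of discarding invariant positions that some substitution models expect.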
pgSNP was validated on a previously published Salmonella outbreak dataset of 192 genomes and an Escherichia coli dataset of 250 genomes. It identified variants from the core and accessory genomes and reconstructed a phylogenomic tree based on the whole pangenome DNA, without the need for a reference genome. Our analysis of these datasets showed that pgSNP took into account two times more informative genetic information, underscoring the importance of developing new methods that include the information associated with the dispensable genome. Phylogenomic trees inferred by pgSNP were consistent with the expected epidemiological clusters and provided additional phylogenetic signal and higher resolution for discriminating isolates based on accessory DNA. This work demonstrates the importance of pangenomic variant-based analysis, which is likely to bring more certainty to the conclusions of epidemiological field studies.
Technical details
Title:
Improving WGS-based FBO investigations with pgSNP, a bacterial DNA pangenomic workflow
Publication date:
2022
Reference:
I3S 2022, 20-22 June 2022, Saint-Malo