Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 14 Dec 2019 at 01:32

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
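The definitions above translate directly into set operations. The following is a minimal sketch, assuming each sequenced strain has already been reduced to a set of gene identifiers; the strain names, gene names, and the function name partition_pangenome are hypothetical and serve only as illustration.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (core, flex, pan) gene sets for the strains of one "species"."""
    if not genomes:
        return set(), set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # genes found in at least one strain
    core = set.intersection(*gene_sets)  # genes found in every strain
    flex = pan - core                    # genes found in some, but not all
    return core, flex, pan


# Hypothetical presence data for three strains (illustrative only).
strains = {
    "strain_A": {"gyrB", "recA", "gltB", "geneX"},
    "strain_B": {"gyrB", "recA", "gltB", "geneY"},
    "strain_C": {"gyrB", "recA", "geneX", "geneZ"},
}

core, flex, pan = partition_pangenome(strains)
print(sorted(core))  # ['gyrB', 'recA']
print(sorted(flex))  # ['geneX', 'geneY', 'geneZ', 'gltB']
print(len(pan))      # 6
```

Re-sequencing a further strain can only shrink or preserve the core set and grow or preserve the pan set, which is the behaviour the paragraph above describes.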

Created with PubMed® Query: pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion
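For readers who want to rerun a comparable search, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint using the query string shown above. It only constructs the URL and makes no network call. How ESP actually generates this bibliography is not stated here, so the use of ESearch, the paging values, and the function name build_esearch_url are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# NCBI E-utilities ESearch endpoint (public PubMed search API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied verbatim from the bibliography header.
QUERY = 'pangenome or "pan-genome" or "pan genome" NOT pmcbook NOT ispreviousversion'


def build_esearch_url(term: str, retstart: int = 0, retmax: int = 100) -> str:
    """Build an ESearch URL; retstart/retmax page through the result list."""
    params = {
        "db": "pubmed",
        "term": term,
        "retstart": retstart,  # index of the first record to return
        "retmax": retmax,      # number of records per request (illustrative)
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(QUERY)
print(url)
```

Fetching the URL returns a list of PubMed identifiers; retrieving titles and abstracts would need a further EFetch or ESummary request, which is left out of this sketch.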

Citations: The Papers (from PubMed®)

RevDate: 2019-12-11

Lee BH, Cole S, Badel-Berchoux S, et al (2019)

Biofilm Formation of Listeria monocytogenes Strains Under Food Processing Environments and Pan-Genome-Wide Association Study.

Frontiers in microbiology, 10:2698.

Concerns about food contamination by Listeria monocytogenes are on the rise with increasing consumption of ready-to-eat foods. Biofilm production of L. monocytogenes is presumed to be one of the ways that confer its increased resistance and persistence in the food chain. In this study, a collection of isolates from foods and food processing environments (FPEs) representing persistent, prevalent, and rarely detected genotypes was evaluated for biofilm forming capacities including adhesion and sessile biomass production under diverse environmental conditions. The quantity of sessile biomass varied according to growth conditions, lineage, serotype as well as genotype but association of clonal complex (CC) 26 genotype with biofilm production was evidenced under cold temperature. In general, relative biofilm productivity of each strain varied inconsistently across growth conditions. Under our experimental conditions, there were no clear associations between biofilm formation efficiency and persistent or prevalent genotypes. Distinct extrinsic factors affected specific steps of biofilm formation. Sudden nutrient deprivation enhanced cellular adhesion while a prolonged nutrient deficiency impeded biofilm maturation. Salt addition increased biofilm production, moreover, nutrient limitation supplemented by salt significantly stimulated biofilm formation. Pan-genome-wide association study (Pan-GWAS) assessed genetic composition with regard to biofilm phenotypes for the first time. The number of reported genes differed depending on the growth conditions and the number of common genes was low. However, a broad overview of the ontology contents revealed similar patterns regardless of the conditions. Functional analysis showed that functions related to transformation/competence and surface proteins including Internalins were highly enriched.

RevDate: 2019-12-09

Jandrasits C, Kröger S, Haas W, et al (2019)

Computational pan-genome mapping and pairwise SNP-distance improve detection of Mycobacterium tuberculosis transmission clusters.

PLoS computational biology, 15(12):e1007527 pii:PCOMPBIOL-D-19-00340 [Epub ahead of print].

Next-generation sequencing based base-by-base distance measures have become an integral complement to epidemiological investigation of infectious disease outbreaks. This study introduces PANPASCO, a computational pan-genome mapping based, pairwise distance method that is highly sensitive to differences between cases, even when located in regions of lineage specific reference genomes. We show that our approach is superior to previously published methods in several datasets and across different Mycobacterium tuberculosis lineages, as its characteristics allow the comparison of a high number of diverse samples in one analysis, a scenario that becomes more and more likely with the increased usage of whole-genome sequencing in transmission surveillance.
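As a rough illustration of what a pairwise SNP distance is, the sketch below counts differing positions between aligned sequences and skips any position that is uncalled in either sample of the pair. This is a generic sketch and not the PANPASCO implementation; the sample names, sequences, missing-data symbols, and the threshold of 1 are all assumptions chosen for the example.

```python
from itertools import combinations
from typing import Dict, List, Tuple

MISSING = {"N", "-"}  # assumed symbols for uncalled or gapped positions


def pairwise_snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions, ignoring sites missing in either sample."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x not in MISSING and y not in MISSING and x != y
    )


def distance_table(samples: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Distances for every unordered pair of samples."""
    return {
        (a, b): pairwise_snp_distance(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }


def linked_pairs(
    dist: Dict[Tuple[str, str], int], threshold: int
) -> List[Tuple[str, str]]:
    """Pairs at or below a SNP threshold (candidate transmission links)."""
    return sorted(pair for pair, d in dist.items() if d <= threshold)


# Hypothetical aligned variant sites for three isolates.
samples = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACNTTCGA"}
dist = distance_table(samples)
print(dist)
print(linked_pairs(dist, threshold=1))
```

Because missing sites are dropped per pair rather than across all samples, adding a low-coverage isolate does not discard informative positions for the other pairs.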

RevDate: 2019-12-05

Emery A, Marpaux N, Naegelen C, et al (2019)

Genotypic study of Citrobacter koseri, an emergent platelet contaminant since 2012 in France.

Transfusion [Epub ahead of print].

BACKGROUND: Transfusion-transmitted bacterial infection is a rare occurrence but the most feared complication in transfusion practices. Between 2012 and 2017, five cases of platelet concentrates (PCs) contaminated with the bacterial pathogen Citrobacter koseri (PC-Ck) have been reported in France, with two leading to the death of the recipients. We tested the possibilities of the emergence of a PC-specific clone of C. koseri (Ck) and of specific bacterial genes associated with PC contamination.

STUDY DESIGN AND METHODS: The phylogenetic network, based on a homemade Ck core genome scheme, inferred from the genomes of 20 worldwide Ck isolates unrelated to PC contamination taken as controls (U-Ck) and the genomes of the five PC-Ck, explored the clonal relationship between the genomes and evaluated the distribution of PC-Ck throughout the species. Along with this core genome multilocus sequence typing approach, a Ck pan genome has been used to seek genes specific to PC-Ck isolates.

RESULTS: Our genomic approach suggested that the population of C. koseri is nonclonal, although it also identified a cluster containing three PC-Ck and eight U-Ck. Indeed, the PC-Ck did not share any specific genes.

CONCLUSION: The elevated incidence of PCs contaminated by C. koseri in France between 2012 and 2017 was not due to the dissemination of a clone. The determinants of the recent outbreaks of PC contamination with C. koseri are still unknown.

RevDate: 2019-12-05

Li R, Fu W, Su R, et al (2019)

Towards the Complete Goat Pan-Genome by Recovering Missing Genomic Segments From the Reference Genome.

Frontiers in genetics, 10:1169.

It is broadly expected that next generation sequencing will ultimately generate a complete genome as is the latest goat reference genome (ARS1), which is considered to be one of the most continuous assemblies in livestock. However, the rich diversity of worldwide goat breeds indicates that a genome from one individual would be insufficient to represent the whole genomic contents of goats. By comparing nine de novo assemblies from seven sibling species of domestic goat with ARS1 and using resequencing and transcriptome data from goats for verification, we identified a total of 38.3 Mb sequences that were absent in ARS1. The pan-sequences contain genic fractions with considerable expression. Using the pan-genome (ARS1 together with the pan-sequences) as a reference genome, variation calling efficacy can be appreciably improved. A total of 56,657 spurious SNPs per individual were repressed and 24,414 novel SNPs per individual on average were recovered as a result of better reads mapping quality. The transcriptomic mapping rate was also increased by ∼1.15%. Our study demonstrated that comparing de novo assemblies from closely related species is an efficient and reliable strategy for finding missing sequences from the reference genome and could be applicable to other species. Pan-genome can serve as an improved reference genome in animals for a better exploration of the underlying genomic variations and could increase the probability of finding genotype-phenotype associations assessed by a comprehensive variation database containing much more differences between individuals. We have constructed a goat pan-genome web interface for data visualization (http://animal.nwsuaf.edu.cn/panGoat).

RevDate: 2019-12-04

Sutton D, Livingstone PG, Furness E, et al (2019)

Genome-Wide Identification of Myxobacterial Predation Genes and Demonstration of Formaldehyde Secretion as a Potentially Predation-Resistant Trait of Pseudomonas aeruginosa.

Frontiers in microbiology, 10:2650.

Despite widespread use in human biology, genome-wide association studies (GWAS) of bacteria are few and have, to date, focused primarily on pathogens. Myxobacteria are predatory microbes with large patchwork genomes, with individual strains secreting unique cocktails of predatory proteins and metabolites. We investigated whether a GWAS strategy could be applied to myxobacteria to identify genes associated with predation. Deduced proteomes from 29 myxobacterial genomes (including eight Myxococcus genomes sequenced for this study), were clustered into orthologous groups, and the presence/absence of orthologues assessed in superior and inferior predators of ten prey organisms. 139 'predation genes' were identified as being associated significantly with predation, including some whose annotation suggested a testable predatory mechanism. Formaldehyde dismutase (fdm) was associated with superior predation of Pseudomonas aeruginosa, and predatory activity of a strain lacking fdm could be increased by the exogenous addition of a formaldehyde detoxifying enzyme, suggesting that production of formaldehyde by P. aeruginosa acts as an anti-predation behaviour. This study establishes the utility of bacterial GWAS to investigate microbial processes beyond pathogenesis, giving plausible and verifiable associations between gene presence/absence and predatory phenotype. We propose that the slow growth rate of myxobacteria, coupled with their predatory mechanism of constitutive secretion, has rendered them relatively resistant to genome streamlining. The resultant genome expansion made possible their observed accumulation of prey-specific predatory genes, without requiring them to be selected for by frequent or recent predation on diverse prey, potentially explaining both the large pan-genome and broad prey range of myxobacteria.
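The presence/absence association described in this abstract can be pictured as a 2x2 contingency test per orthologous group. The sketch below uses a two-sided Fisher's exact test written with the standard library only. The abstract does not say which statistical test the authors used, so the choice of test, the function name, and the counts in the example are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: gene present / gene absent.
    Columns: superior predators / inferior predators.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: gene present in 8 superior and 1 inferior predators,
# absent in 2 superior and 9 inferior predators.
p_value = fisher_exact_two_sided(8, 1, 2, 9)
print(round(p_value, 4))  # about 0.0055
```

In a real analysis one such test would be run per orthologous group and per prey organism, so the resulting p-values would need a multiple-testing correction before any gene is called predation-associated.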

RevDate: 2019-12-04

Yuan J, Li YY, Xu Y, et al (2019)

Molecular Signatures Related to the Virulence of Bacillus cereus Sensu Lato, a Leading Cause of Devastating Endophthalmitis.

mSystems, 4(6): pii:4/6/e00745-19.

Bacillus endophthalmitis is a devastating eye infection that causes rapid blindness through extracellular tissue-destructive exotoxins. Despite its importance, knowledge of the phylogenetic relationships and population structure of intraocular Bacillus spp. is lacking. In this study, we sequenced the whole genomes of eight Bacillus intraocular pathogens independently isolated from 8/52 patients with posttraumatic Bacillus endophthalmitis infections in the Eye Hospital of Wenzhou Medical University between January 2010 and December 2018. Phylogenetic analysis revealed that the pathogenic intraocular isolates belonged to Bacillus cereus, Bacillus thuringiensis and Bacillus toyonensis. To determine the virulence of the ocular isolates, three representative strains were injected into mouse models, and severe endophthalmitis leading to blindness was observed. Through incorporating publicly available genomes for Bacillus spp., we found that the intraocular pathogens could be isolated independently but displayed a similar genetic context. In addition, our data provide genome-wide support for intraocular and gastrointestinal sources of Bacillus spp. belonging to different lineages. Importantly, we identified five molecular signatures of virulence and motility genes associated with intraocular infection, namely, plcA-2, InhA-3, InhA-4, hblA-5, and fliD using pangenome-wide association studies. The characterization of overrepresented genes in the intraocular isolates holds value to predict bacterial evolution and for the design of future intervention strategies in patients with endophthalmitis.

IMPORTANCE: In this study, we provided a detailed and comprehensive clinicopathological and pathogenic report of Bacillus endophthalmitis over the 8 years of the study period. We first reported the whole-genome sequence of Bacillus spp. causing devastating endophthalmitis and found that Bacillus toyonensis is able to cause endophthalmitis. Finally, we revealed significant endophthalmitis-associated virulence genes involved in hemolysis, immunity inhibition, and pathogenesis. Overall, as more sequencing data sets become available, these data will facilitate comparative research and will reveal the emergence of pathogenic "ocular bacteria."

RevDate: 2019-12-02

Khan AW, Garg V, Roorkiwal M, et al (2019)

Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement.

Trends in plant science pii:S1360-1385(19)30281-X [Epub ahead of print].

The pangenome provides genomic variations in the cultivated gene pool for a given species. However, as the crop's gene pool comprises many species, especially wild relatives with diverse genetic stock, here we suggest using accessions from all available species of a given genus for the development of a more comprehensive and complete pangenome, which we refer to as a super-pangenome. The super-pangenome provides a complete genomic variation repertoire of a genus and offers unprecedented opportunities for crop improvement. This opinion article focuses on recent developments in crop pangenomics, the need for a super-pangenome that should include wild species, and its application for crop improvement.

RevDate: 2019-11-30

Chaudhry V, PB Patil (2019)

Evolutionary insights into adaptation of Staphylococcus haemolyticus to human and non-human niches.

Genomics pii:S0888-7543(19)30804-3 [Epub ahead of print].

Staphylococcus haemolyticus is a well-known member of human skin microbiome and an emerging opportunistic human pathogen. Presently, evolutionary studies are limited to human isolates even though it is reported from plants with beneficial properties and in environmental settings. In the present study, we report isolation of novel S. haemolyticus strains from surface sterilized rice seeds and compare their genome to other isolates from diverse niches available in public domain. The study showed expanding nature of pan-genome and revealed set of genes with putative functions related to its adaptability. This is seen by presence of type II lanthipeptide cluster in rice isolates, metal homeostasis genes in an isolate from copper coin and gene encoding methicillin resistance in human isolates. The present study on differential genome dynamics and role of horizontal gene transfers has provided novel insights into capability for ecological diversification of a bacterium of significance to human health.

RevDate: 2019-11-29

Peeters C, De Canck E, Cnockaert M, et al (2019)

Comparative Genomics of Pandoraea, a Genus Enriched in Xenobiotic Biodegradation and Metabolism.

Frontiers in microbiology, 10:2556.

Comparative analysis of partial gyrB, recA, and gltB gene sequences of 84 Pandoraea reference strains and field isolates revealed several clusters that included no taxonomic reference strains. The gyrB, recA, and gltB phylogenetic trees were used to select 27 strains for whole-genome sequence analysis and for a comparative genomics study that also included 41 publicly available Pandoraea genome sequences. The phylogenomic analyses included a Genome BLAST Distance Phylogeny approach to calculate pairwise digital DNA-DNA hybridization values and their confidence intervals, average nucleotide identity analyses using the OrthoANIu algorithm, and a whole-genome phylogeny reconstruction based on 107 single-copy core genes using bcgTree. These analyses, along with subsequent chemotaxonomic and traditional phenotypic analyses, revealed the presence of 17 novel Pandoraea species among the strains analyzed, and allowed the identification of several unclassified Pandoraea strains reported in the literature. The genus Pandoraea has an open pan genome that includes many orthogroups in the 'Xenobiotics biodegradation and metabolism' KEGG pathway, which likely explains the enrichment of these species in polluted soils and participation in the biodegradation of complex organic substances. We propose to formally classify the 17 novel Pandoraea species as P. anapnoica sp. nov. (type strain LMG 31117T = CCUG 73385T), P. anhela sp. nov. (type strain LMG 31108T = CCUG 73386T), P. aquatica sp. nov. (type strain LMG 31011T = CCUG 73384T), P. bronchicola sp. nov. (type strain LMG 20603T = ATCC BAA-110T), P. capi sp. nov. (type strain LMG 20602T = ATCC BAA-109T), P. captiosa sp. nov. (type strain LMG 31118T = CCUG 73387T), P. cepalis sp. nov. (type strain LMG 31106T = CCUG 39680T), P. commovens sp. nov. (type strain LMG 31010T = CCUG 73378T), P. communis sp. nov. (type strain LMG 31110T = CCUG 73383T), P. eparura sp. nov. (type strain LMG 31012T = CCUG 73380T), P. horticolens sp. nov. (type strain LMG 31112T = CCUG 73379T), P. iniqua sp. nov. (type strain LMG 31009T = CCUG 73377T), P. morbifera sp. nov. (type strain LMG 31116T = CCUG 73389T), P. nosoerga sp. nov. (type strain LMG 31109T = CCUG 73390T), P. pneumonica sp. nov. (type strain LMG 31114T = CCUG 73388T), P. soli sp. nov. (type strain LMG 31014T = CCUG 73382T), and P. terrigena sp. nov. (type strain LMG 31013T = CCUG 73381T).

RevDate: 2019-11-28

Lupolova N, Lycett SJ, DL Gally (2019)

A guide to machine learning for bacterial host attribution using genome sequence data.

Microbial genomics [Epub ahead of print].

With the ever-expanding number of available sequences from bacterial genomes, and the expectation that this data type will be the primary one generated from both diagnostic and research laboratories for the foreseeable future, then there is both an opportunity and a need to evaluate how effectively computational approaches can be used within bacterial genomics to predict and understand complex phenotypes, such as pathogenic potential and host source. This article applied various quantitative methods such as diversity indexes, pangenome-wide association studies (GWAS) and dimensionality reduction techniques to better understand the data and then compared how well unsupervised and supervised machine learning (ML) methods could predict the source host of the isolates. The study uses the example of the pangenomes of 1203 Salmonella enterica serovar Typhimurium isolates in order to predict 'host of isolation' using these different methods. The article is aimed as a review of recent applications of ML in infection biology, but also, by working through this specific dataset, it allows discussion of the advantages and drawbacks of the different techniques. As with all such sub-population studies, the biological relevance will be dependent on the quality and diversity of the input data. Given this major caveat, we show that supervised ML has the potential to add real value to interpretation of bacterial genomic data, as it can provide probabilistic outcomes for important phenotypes, something that is very difficult to achieve with the other methods.

RevDate: 2019-11-28

Eggertsson HP, Kristmundsdottir S, Beyter D, et al (2019)

GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs.

Nature communications, 10(1):5402 pii:10.1038/s41467-019-13341-9.

Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short-reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole-genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high-confidence.

RevDate: 2019-11-27

Chernysheva N, Bystritskaya E, Stenkova A, et al (2019)

Comparative Genomics and CAZyme Genome Repertoires of Marine Zobellia amurskyensis KMM 3526T and Zobellia laminariae KMM 3676T.

Marine drugs, 17(12): pii:md17120661.

We obtained two novel draft genomes of type Zobellia strains with estimated genome sizes of 5.14 Mb for Z. amurskyensis KMM 3526T and 5.16 Mb for Z. laminariae KMM 3676T. Comparative genomic analysis has been carried out between obtained and known genomes of Zobellia representatives. The pan-genome of Zobellia genus is composed of 4853 orthologous clusters and the core genome was estimated at 2963 clusters. The genus CAZome was represented by 775 GHs classified into 62 families, 297 GTs of 16 families, 100 PLs of 13 families, 112 CEs of 13 families, 186 CBMs of 18 families and 42 AAs of six families. A closer inspection of the carbohydrate-active enzyme (CAZyme) genomic repertoires revealed members of new putative subfamilies of GH16 and GH117, which can be biotechnologically promising for production of oligosaccharides and rare monomers with different bioactivities. We analyzed AA3s, among them putative FAD-dependent glycoside oxidoreductases (FAD-GOs) being of particular interest as promising biocatalysts for glycoside deglycosylation in food and pharmaceutical industries.

RevDate: 2019-11-26

Cabrera-Contreras R, Santamaría RI, Bustos P, et al (2019)

Genomic diversity of prevalent Staphylococcus epidermidis multidrug-resistant strains isolated from a Children's Hospital in México City in an eight-years survey.

PeerJ, 7:e8068 pii:8068.

Staphylococcus epidermidis is a human commensal and pathogen distributed worldwide. In this work, we surveyed for multi-resistant S. epidermidis strains over eight years at a children's health-care unit in México City. Multidrug-resistant S. epidermidis were present in all years of the study, including resistance to methicillin, beta-lactams, fluoroquinolones, and macrolides. To understand the genetic basis of antibiotic resistance and its association with virulence and gene exchange, we sequenced the genomes of 17 S. epidermidis isolates. Whole-genome nucleotide identities between all the pairs of S. epidermidis strains were about 97% to 99%. We inferred a clonal structure and eight Multilocus Sequence Types (MLSTs) in the S. epidermidis sequenced collection. The profile of virulence includes genes involved in biofilm formation and phenol-soluble modulins (PSMs). Half of the S. epidermidis analyzed lacked the ica operon for biofilm formation. Likely, they are commensal S. epidermidis strains but multi-antibiotic resistant. Uneven distribution of insertion sequences, phages, and CRISPR-Cas immunity phage systems suggests frequent horizontal gene transfer. Rates of recombination between S. epidermidis strains were more prevalent than the mutation rate and affected the whole genome. Therefore, the multidrug resistance, independently of the pathogenic traits, might explain the persistence of specific highly adapted S. epidermidis clonal lineages in nosocomial settings.

RevDate: 2019-11-25

Sujitha S, Vishnu US, Karthikeyan R, et al (2019)

Genome Investigation of a Cariogenic Pathogen with Implications in Cardiovascular Diseases.

Indian journal of microbiology, 59(4):451-459.

The proportion of people suffering from cardiovascular diseases (CVDs) has risen by 34% in the last 15 years in India. Cardiomyopathy is among the many forms of CVD present. Infection of the heart muscle is its suspected cause. Oral pathogens gaining entry into the bloodstream are responsible for such infections. Streptococcus mutans is an oral pathogen with implications in cardiovascular diseases. Previous studies have shown certain strains of S. mutans are found predominantly within atherosclerotic plaques and extirpated valves. To decipher the genetic differences responsible for endothelial cell invasion, we have sequenced the genome of Streptococcus mutans B14. Pan-genome analysis, search for adhesion proteins through a special algorithm, and protein-protein interactions search through HPIDB have been done. Pan-genome analysis of 187 whole-genome assemblies revealed 6965 genes in total and 918 genes forming the core gene cluster. Adhesion to the endothelial cell is a critical virulence factor distinguishing virulent and non-virulent strains. Overall, 4% of the total proteins in S. mutans B14 were categorized as adhesion proteins. Protein-protein interaction between putative adhesion proteins and human extracellular matrix (ECM) components was predicted, revealing novel interactions. A conserved gene catalyzing the synthesis of branched-chain amino acids in S. mutans B14 shows possible interaction with isoforms of cathepsin protein of the ECM. This genome sequence analysis points towards other proteins in the S. mutans genome, which might have a specific role to play in host cell interaction.

RevDate: 2019-11-23

Decano AG, T Downing (2019)

An Escherichia coli ST131 pangenome atlas reveals population structure and evolution across 4,071 isolates.

Scientific reports, 9(1):17394 pii:10.1038/s41598-019-54004-5.

Escherichia coli ST131 is a major cause of infection with extensive antimicrobial resistance (AMR) facilitated by widespread beta-lactam antibiotic use. This drug pressure has driven extended-spectrum beta-lactamase (ESBL) gene acquisition and evolution in pathogens, so a clearer resolution of ST131's origin, adaptation and spread is essential. E. coli ST131's ESBL genes are typically embedded in mobile genetic elements (MGEs) that aid transfer to new plasmid or chromosomal locations, which are mobilised further by plasmid conjugation and recombination, resulting in a flexible ESBL, MGE and plasmid composition with a conserved core genome. We used population genomics to trace the evolution of AMR in ST131 more precisely by extracting all available high-quality Illumina HiSeq read libraries to investigate 4,071 globally-sourced genomes, the largest ST131 collection examined so far. We applied rigorous quality-control, genome de novo assembly and ESBL gene screening to resolve ST131's population structure across three genetically distinct Clades (A, B, C) and abundant subclades from the dominant Clade C. We reconstructed their evolutionary relationships across the core and accessory genomes using published reference genomes, long read assemblies and k-mer-based methods to contextualise pangenome diversity. The three main C subclades have co-circulated globally at relatively stable frequencies over time, suggesting attaining an equilibrium after their origin and initial rapid spread. This contrasted with their ESBL genes, which had stronger patterns across time, geography and subclade, and were located at distinct locations across the chromosomes and plasmids between isolates. Within the three C subclades, the core and accessory genome diversity levels were not correlated due to plasmid and MGE activity, unlike patterns between the three main clades, A, B and C. 
This population genomic study highlights the dynamic nature of the accessory genomes in ST131, suggesting that surveillance should anticipate genetically variable outbreaks with broader antibiotic resistance levels. Our findings emphasise the potential of evolutionary pangenomics to improve our understanding of AMR gene transfer, adaptation and transmission to discover accessory genome changes linked to novel subtypes.

RevDate: 2019-11-21

de Fátima Rauber Würfel S, Jorge S, de Oliveira NR, et al (2019)

Campylobacter jejuni isolated from poultry meat in Brazil: in silico analysis and genomic features of two strains with different phenotypes of antimicrobial susceptibility.

Molecular biology reports pii:10.1007/s11033-019-05174-y [Epub ahead of print].

Campylobacter jejuni is the most common bacterial cause of foodborne diarrheal disease worldwide and is among the antimicrobial resistant "priority pathogens" that pose the greatest threat to public health. The genomes of two C. jejuni isolated from poultry meat sold on the retail market in Southern Brazil phenotypically characterized as multidrug-resistant (CJ100) and susceptible (CJ104) were sequenced and analyzed by bioinformatic tools. The isolates CJ100 and CJ104 showed distinct multilocus sequence types (MLST). Comparative genomic analysis revealed a large number of single nucleotide polymorphisms, rearrangements, and inversions in both genomes, in addition to virulence factors, genomic islands, prophage sequences, and insertion sequences. A circular 103-kilobase megaplasmid carrying virulence factors was identified in the genome of CJ100, in addition to resistance mechanisms to aminoglycosides, beta-lactams, macrolides, quinolones, and tetracyclines. The molecular characterization of distinct phenotypes of foodborne C. jejuni and the discovery of a novel virulence megaplasmid provide useful data for pan-genome and large-scale studies; monitoring of virulent C. jejuni in poultry meat is warranted.

RevDate: 2019-11-20

Chapeton-Montes D, Plourde L, Bouchier C, et al (2019)

Author Correction: The population structure of Clostridium tetani deduced from its pan-genome.

Scientific reports, 9(1):17409 pii:10.1038/s41598-019-53688-z.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

RevDate: 2019-11-19

Lawson MAE, O'Neill IJ, Kujawska M, et al (2019)

Breast milk-derived human milk oligosaccharides promote Bifidobacterium interactions within a single ecosystem.

The ISME journal pii:10.1038/s41396-019-0553-2 [Epub ahead of print].

Diet-microbe interactions play an important role in modulating the early-life microbiota, with Bifidobacterium strains and species dominating the gut of breast-fed infants. Here, we sought to explore how infant diet drives distinct bifidobacterial community composition and dynamics within individual infant ecosystems. Genomic characterisation of 19 strains isolated from breast-fed infants revealed a diverse genomic architecture enriched in carbohydrate metabolism genes, which was distinct to each strain, but collectively formed a pangenome across infants. Presence of gene clusters implicated in digestion of human milk oligosaccharides (HMOs) varied between species, with growth studies indicating that within single infants there were differences in the ability to utilise 2'FL and LNnT HMOs between strains. Cross-feeding experiments were performed with HMO degraders and non-HMO users (using spent or 'conditioned' media and direct co-culture). Further 1H-NMR analysis identified fucose, galactose, acetate, and N-acetylglucosamine as key by-products of HMO metabolism, as demonstrated by modest growth of non-HMO users on spent media from HMO metabolism. These experiments indicate how HMO metabolism permits the sharing of resources to maximise nutrient consumption from the diet and highlight the cooperative nature of bifidobacterial strains and their role as 'foundation' species in the infant ecosystem. The intra- and inter-infant bifidobacterial community behaviour may contribute to the diversity and dominance of Bifidobacterium in early life and suggests avenues for future development of new diet and microbiota-based therapies to promote infant health.

RevDate: 2019-11-18

Robertson J, Lin J, Wren-Hedgus A, et al (2019)

Development of a multi-locus typing scheme for an Enterobacteriaceae linear plasmid that mediates inter-species transfer of flagella.

PloS one, 14(11):e0218638 pii:PONE-D-19-15528.

Due to the public health importance of flagellar genes for typing, it is important to understand mechanisms that could alter their expression or presence. Phenotypic novelty in flagellar genes arises predominantly through accumulation of mutations, but horizontal transfer is known to occur. A linear plasmid termed pBSSB1, previously identified in Salmonella Typhi, was found to encode a flagellar operon that can mediate phase variation, which results in the rare z66 flagella phenotype. The identification and tracking of homologs of pBSSB1 is limited because it falls outside the normal replicon typing schemes for plasmids. Here we report the generation of nine new pBSSB1-family sequences using Illumina and Nanopore sequence data. Homologs of pBSSB1 were identified in 154 genomes representing 25 distinct serotypes from 67,758 Salmonella public genomes. Pangenome analysis of pBSSB1-family contigs was performed using roary and we identified three core genes amenable to a minimal pMLST scheme. Population structure analysis based on the newly developed pMLST scheme identified three major lineages representing 35 sequence types, and the distribution of these sequence types was found to span multiple serovars across the globe. This in silico pMLST scheme has shown utility in tracking and subtyping pBSSB1-family plasmids and it has been incorporated into the plasmid MLST database under the name "pBSSB1-family".

RevDate: 2019-11-18

Suresh G, Lodha TD, Indu B, et al (2019)

Taxogenomics Resolves Conflict in the Genus Rhodobacter: A Two and Half Decades Pending Thought to Reclassify the Genus Rhodobacter.

Frontiers in microbiology, 10:2480.

The genus Rhodobacter is taxonomically well studied, and some members are model organisms. However, this genus is comprised of a heterogeneous group of members. 16S rRNA gene-based phylogeny of the genus Rhodobacter indicates a motley assemblage of anoxygenic phototrophic bacteria (genus Rhodobacter) with interspersing members of other genera (chemotrophs) making the genus polyphyletic. Taxogenomics was performed to resolve the taxonomic conflicts of the genus Rhodobacter using twelve type strains. The phylogenomic analysis showed that Rhodobacter spp. can be grouped into four monophyletic clusters with interspersing chemotrophs. Genomic indices (ANI and dDDH) confirmed that all the current species are well defined, except Rhodobacter megalophilus. The average amino acid identity values between the monophyletic clusters of Rhodobacter members, as well as with the chemotrophic genera, are less than 80% whereas the percentage of conserved proteins values were below 70%, which has been observed among several genera related to Rhodobacter. The pan-genome analysis has shown that there are only 1239 core genes shared between the 12 species of the genus Rhodobacter. The polyphasic taxonomic analysis supports the phylogenomic and genomic studies in distinguishing the four Rhodobacter clusters. Each cluster is comprised of one to seven species according to the current Rhodobacter taxonomy. Therefore, to address this taxonomic discrepancy we propose to reclassify the members of the genus Rhodobacter into three new genera, Luteovulum gen. nov., Phaeovulum gen. nov. and Fuscovulum gen. nov., and provide an emended description of the genus Rhodobacter sensu stricto. Also, we propose reclassification of Rhodobacter megalophilus as a sub-species of Rhodobacter sphaeroides.

RevDate: 2019-11-16

Ghosh S, Sarangi AN, Mukherjee M, et al (2019)

Reanalysis of Lactobacillus paracasei Lbs2 Strain and Large-Scale Comparative Genomics Places Many Strains into Their Correct Taxonomic Position.

Microorganisms, 7(11): pii:microorganisms7110487.

Lactobacillus paracasei are diverse Gram-positive bacteria that are very closely related to Lactobacillus casei, belonging to the Lactobacillus casei group. Due to extreme genome similarities between L. casei and L. paracasei, many strains have been cross placed in the other group. We had earlier sequenced and analyzed the genome of Lactobacillus paracasei Lbs2, but mistakenly identified it as L. casei. We re-analyzed Lbs2 reads into a 2.5 MB genome that is 91.28% complete with 0.8% contamination, which is now suitably placed under L. paracasei based on Average Nucleotide Identity and Average Amino Acid Identity. We took 74 sequenced genomes of L. paracasei from GenBank with assembly sizes ranging from 2.3 to 3.3 MB and genome completeness between 88% and 100% for comparison. The pan-genome of 75 L. paracasei strains holds 15,945 gene families (215,232 genes), while the core genome contained about 8.4% of the total genes (243 gene families with 18,225 genes) of the pan-genome. Phylogenomic analysis based on core gene families revealed that the Lbs2 strain has a closer relationship with L. paracasei subsp. tolerans DSM20258. Finally, the in-silico analysis of the L. paracasei Lbs2 genome revealed an important pathway that could underpin the production of thiamin, which may contribute to the host energy metabolism.

RevDate: 2019-11-15

Seribelli AA, Gonzales JC, de Almeida F, et al (2019)

Phylogenetic analysis revealed that Salmonella Typhimurium ST313 isolated from humans and food in Brazil presented a high genomic similarity.

Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology] pii:10.1007/s42770-019-00155-6 [Epub ahead of print].

Salmonella Typhimurium sequence type 313 (S. Typhimurium ST313) has caused invasive disease mainly in sub-Saharan Africa. In Brazil, ST313 strains have been recently described, and there is a lack of studies that have assessed the relationship of these strains by whole genome sequencing (WGS). The aims of this work were to study the phylogenetic relationship of 70 S. Typhimurium genomes comparing strains of ST313 (n = 9) isolated from humans and food in Brazil among themselves, with other STs isolated in this country (n = 31) and in other parts of the globe (n = 30) by 16S rRNA sequences, the Gegenees software, whole genome multilocus sequence typing (wgMLST), and average nucleotide identity (ANI) for the genomes of ST313. Additionally, pangenome analysis was performed to verify the heterogeneity of these genomes. The phylogenetic analyses showed that the ST313 genomes were very similar among themselves. However, the ST313 genomes were usually clustered more distantly from other STs of strains isolated in Brazil and in other parts of the world. By pangenome calculation, the core genome was 2,880 CDSs and 4,171 CDSs singletons for all the 70 S. Typhimurium genomes studied. Considering the 10 ST313 genomes analyzed, the core genome was 4,112 CDSs and 76 CDSs singletons. In conclusion, the ST313 genomes from Brazil showed a high similarity among themselves, information that might eventually help in the development of vaccines and antibiotics. The pangenome analysis showed that the S. Typhimurium genomes studied presented an open pangenome, but one tending to become closed for the ST313 strains specifically.

RevDate: 2019-11-13

Chhotaray C, Wang S, Tan Y, et al (2019)

Comparative Analysis of Whole-Genome and Methylome Profiles of a Smooth and a Rough Mycobacterium abscessus Clinical Strain.

G3 (Bethesda, Md.) pii:g3.119.400737 [Epub ahead of print].

Mycobacterium abscessus is a fast growing mycobacterium species mainly causing skin and respiratory infections in humans. M. abscessus is resistant to numerous drugs, which is a major challenge for treatment. In this study, we have sequenced the genomes of two clinical M. abscessus strains having rough and smooth morphology, using the single molecule real-time and Illumina HiSeq sequencing technology. In addition, we reported the first comparative methylome profiles of a rough and a smooth M. abscessus clinical strain. The numbers of N4-methylcytosine (4mC) and N6-methyladenine (6mA) modified bases obtained from the smooth phenotype were 2-fold and 1.6-fold higher, respectively, than those of the rough phenotype. We have also identified 4 distinct novel motifs in the two clinical strains, and genes encoding antibiotic-modifying/targeting enzymes and genes associated with intracellular survivability having different methylation patterns. To our knowledge, this is the first report about genome-wide methylation profiles of M. abscessus strains and identification of a natural linear plasmid (15 kb) in this critical pathogen harboring methylated bases. The pan-genome analysis of 25 M. abscessus strains including the two clinical strains revealed an open pan-genome comprising 7596 gene clusters. Likewise, structural variation analysis revealed that the genome of the rough phenotype strain contains more insertions and deletions than the smooth phenotype and that of the reference strain. A total of 391 single nucleotide variations responsible for non-synonymous mutations were detected in the clinical strains compared to the reference genome. The comparative genomic analysis elucidates the genome plasticity in this emerging pathogen. Furthermore, the detection of genome-wide methylation profiles of M. abscessus clinical strains may provide insight into the significant role of DNA methylation in pathogenicity and drug resistance in this opportunistic pathogen.

RevDate: 2019-11-09

Kim KH, Chun BH, Baek JH, et al (2020)

Genomic and metabolic features of Lactobacillus sakei as revealed by its pan-genome and the metatranscriptome of kimchi fermentation.

Food microbiology, 86:103341.

The genomic and metabolic features of Lactobacillus sakei were investigated using its pan-genome and by analyzing the metatranscriptome of kimchi fermentation. In the genome-based relatedness analysis, the strains were divided into the Lb. sakei ssp. sakei and Lb. sakei ssp. carnosus lineage groups. Genomic and metabolic pathway analysis revealed that all Lb. sakei strains have the capability of producing d/l-lactate, ethanol, acetate, CO2, formate, l-malate, diacetyl, acetoin, and 2,3-butanediol from d-glucose, d-fructose, d-galactose, sucrose, d-lactose, l-arabinose, cellobiose, d-mannose, d-gluconate, and d-ribose through homolactic and heterolactic fermentation, whereas their capability of d-maltose, d-xylose, l-xylulose, d-galacturonate, and d-glucuronate metabolism is strain-specific. All strains carry genes for the biosynthesis of folate and thiamine, whereas genes for biogenic amine and toxin production, hemolysis, and antibiotic resistance were not identified. The metatranscriptomic analysis showed that the expression of Lb. sakei transcripts involved in carbohydrate metabolism increased as kimchi fermentation progressed, suggesting that Lb. sakei is more competitive during the late fermentation stage. The homolactic fermentation pathway was highly expressed and generally constant during kimchi fermentation, whereas expression of the heterolactic fermentation pathway increased gradually as fermentation progressed. l-Lactate dehydrogenase was more highly expressed than d-lactate dehydrogenase, suggesting that l-lactate is the major lactate metabolized by Lb. sakei.

RevDate: 2019-11-07

Bernheim A, R Sorek (2019)

The pan-immune system of bacteria: antiviral defence as a community resource.

Nature reviews. Microbiology pii:10.1038/s41579-019-0278-2 [Epub ahead of print].

Viruses and their hosts are engaged in a constant arms race leading to the evolution of antiviral defence mechanisms. Recent studies have revealed that the immune arsenal of bacteria against bacteriophages is much more diverse than previously envisioned. These discoveries have led to seemingly contradictory observations: on one hand, individual microorganisms often encode multiple distinct defence systems, some of which are acquired by horizontal gene transfer, alluding to their fitness benefit. On the other hand, defence systems are frequently lost from prokaryotic genomes on short evolutionary time scales, suggesting that they impose a fitness cost. In this Perspective article, we present the 'pan-immune system' model in which we suggest that, although a single strain cannot carry all possible defence systems owing to their burden on fitness, it can employ horizontal gene transfer to access immune defence mechanisms encoded by closely related strains. Thus, the 'effective' immune system is not the one encoded by the genome of a single microorganism but rather by its pan-genome, comprising the sum of all immune systems available for a microorganism to horizontally acquire and use.

RevDate: 2019-11-07

Vila Nova M, Durimel K, La K, et al (2019)

Genetic and metabolic signatures of Salmonella enterica subsp. enterica associated with animal sources at the pangenomic scale.

BMC genomics, 20(1):814 pii:10.1186/s12864-019-6188-x.

BACKGROUND: Salmonella enterica subsp. enterica is a public health issue related to food safety, and its adaptation to animal sources remains poorly described at the pangenome scale. Firstly, serovars presenting potential mono- and multi-animal sources were selected from a curated and synthetized subset of Enterobase. The corresponding sequencing reads were downloaded from the European Nucleotide Archive (ENA) providing a balanced dataset of 440 Salmonella genomes in terms of serovars and sources (i). Secondly, the coregenome variants and accessory genes were detected (ii). Thirdly, single nucleotide polymorphisms and small insertions/deletions from the coregenome, as well as the accessory genes were associated to animal sources based on a microbial Genome Wide Association Study (GWAS) integrating an advanced correction of the population structure (iii). Lastly, a Gene Ontology Enrichment Analysis (GOEA) was applied to emphasize metabolic pathways mainly impacted by the pangenomic mutations associated to animal sources (iv).

RESULTS: Based on a genome dataset including Salmonella serovars from mono- and multi-animal sources (i), 19,130 accessory genes and 178,351 coregenome variants were identified (ii). Among these pangenomic mutations, 52 genomic signatures (iii) and 9 over-enriched metabolic signatures (iv) were associated to avian, bovine, swine and fish sources by GWAS and GOEA, respectively.

CONCLUSIONS: Our results suggest that the genetic and metabolic determinants of Salmonella adaptation to animal sources may have been driven by the natural feeding environment of the animal, distinct livestock diets modified by humans, environmental stimuli, physiological properties of the animal itself, and work habits for health protection of livestock.

RevDate: 2019-10-31

Aguirre de Cárcer D (2019)

A conceptual framework for the phylogenetically constrained assembly of microbial communities.

Microbiome, 7(1):142 pii:10.1186/s40168-019-0754-y.

Microbial communities play essential and preponderant roles in all ecosystems. Understanding the rules that govern microbial community assembly will have a major impact on our ability to manage microbial ecosystems, positively impacting, for instance, human health and agriculture. Here, I present a phylogenetically constrained community assembly principle grounded on the well-supported facts that deterministic processes have a significant impact on microbial community assembly, that microbial communities show significant phylogenetic signal, and that microbial traits and ecological coherence are, to some extent, phylogenetically conserved. From these facts, I derive a few predictions which form the basis of the framework. Chief among them is the existence, within most microbial ecosystems, of phylogenetic core groups (PCGs), defined as discrete portions of the phylogeny of varying depth present in all instances of the given ecosystem, and related to specific niches whose occupancy requires a specific phylogenetically conserved set of traits. The predictions are supported by the recent literature, as well as by dedicated analyses. Integrating the effect of ecosystem patchiness, microbial social interactions, and scale sampling pitfalls takes us to a comprehensive community assembly model that recapitulates the characteristics most commonly observed in microbial communities. PCGs' identification is relatively straightforward using high-throughput 16S amplicon sequencing, and subsequent bioinformatic analysis of their phylogeny, estimated core pan-genome, and intra-group co-occurrence should provide valuable information on their ecophysiology and niche characteristics. Such a priori information for a significant portion of the community could be used to prime complementing analyses, boosting their usefulness. Thus, the use of the proposed framework could represent a leap forward in our understanding of microbial community assembly and function.

RevDate: 2019-10-29

Alonge M, Soyk S, Ramakrishnan S, et al (2019)

RaGOO: fast and accurate reference-guided scaffolding of draft genomes.

Genome biology, 20(1):224 pii:10.1186/s13059-019-1829-6.

We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO .

RevDate: 2019-10-29

Oh YJ, Kim JY, Park HK, et al (2019)

Salicibibacter halophilus sp. nov., a moderately halophilic bacterium isolated from kimchi.

Journal of microbiology (Seoul, Korea), 57(11):997-1002.

A Gram-stain-positive, rod-shaped, alkalitolerant, and halophilic bacterium, designated as strain NKC3-5T, was isolated from kimchi that was collected from the Geumsan area in the Republic of Korea. Cells of isolated strain NKC3-5T were 0.5-0.7 μm wide and 1.4-2.8 μm long. The strain NKC3-5T could grow at up to 20.0% (w/v) NaCl (optimum 10%), pH 6.5-10.0 (optimum pH 9.0), and 25-40°C (optimum 35°C). The cells were able to reduce nitrate under aerobic conditions, which is the first report in the genus Salicibibacter. The genome size and genomic G + C content of strain NKC3-5T were 3,754,174 bp and 45.9 mol%, respectively; it contained 3,630 coding sequences, 16 rRNA genes (six 16S, five 5S, and five 23S), and 59 tRNA genes. Phylogenetic analysis based on 16S rRNA showed that strain NKC3-5T clustered with bacterium Salicibibacter kimchii NKC1-1T, with a similarity of 96.2-97.6%, but formed a distinct branch with other published species of the family Bacillaceae. In addition, the OrthoANI value between strain NKC3-5T and Salicibibacter kimchii NKC1-1T was far lower than the species demarcation threshold. Functional genome annotation showed that carbohydrate, amino acid, and vitamin metabolism related genes were highly distributed in the genome of strain NKC3-5T. Comparative genomic analysis revealed that strain NKC3-5T had 716 pan-genome orthologous groups (POGs), dominated by carbohydrate metabolism. Phylogenomic analysis based on the concatenated core POGs revealed that strain NKC3-5T was closely related to Salicibibacter kimchii. The predominant polar lipids were phosphatidylglycerol and two unidentified lipids. Anteiso-C15:0, iso-C17:0, anteiso-C17:0, and iso-C15:0 were the major cellular fatty acids, and menaquinone-7 was the major isoprenoid quinone present in strain NKC3-5T. Cell wall peptidoglycan analysis of strain NKC3-5T showed that meso-diaminopimelic acid was the diagnostic diamino acid. The phenotypic, genomic, phylogenetic, and chemotaxonomic properties reveal that the strain represents a novel species of the genus Salicibibacter, for which the name Salicibibacter halophilus sp. nov. is proposed, with the type strain NKC3-5T (= KACC 21230T = JCM 33437T).

RevDate: 2019-10-26

Zhu D, Yang Z, Xu J, et al (2019)

Pan-genome analysis of Riemerella anatipestifer reveals its genomic diversity and acquired antibiotic resistance associated with genomic islands.

Functional & integrative genomics pii:10.1007/s10142-019-00715-x [Epub ahead of print].

Riemerella anatipestifer is a gram-negative bacterium that leads to severe contagious septicemia in ducks, turkeys, chickens, and wild waterfowl. Here, a pan-genome with 32 R. anatipestifer genomes is re-established, and the mathematical model is calculated to evaluate the expansion of R. anatipestifer genomes, which were determined to be open. Average nucleotide identity (ANI) and phylogenetic analysis preliminarily clarify intraspecies variation and distance. Comparative genomic analysis of R. anatipestifer found that horizontal gene transfer events, which provide an expressway for the recruitment of novel functionalities and facilitate genetic diversity in microbial genomes, play a key role in the process of acquiring and transmitting antibiotic-resistance genes in R. anatipestifer. Furthermore, a new antibiotic-resistance gene cluster was identified in the same loci in 14 genomes. The uneven distribution of virulence factors was also confirmed by our results. Our study suggests that the ability to acquire foreign genes (such as antibiotic-resistance genes) increases the adaptability of R. anatipestifer, and the virulence genes with little mobility are highly conserved in R. anatipestifer.

RevDate: 2019-10-24

Vallenet D, Calteau A, Dubois M, et al (2019)

MicroScope: an integrated platform for the annotation and exploration of microbial gene functions through genomic, pangenomic and metabolic comparative analysis.

Nucleic acids research pii:5606622 [Epub ahead of print].

Large-scale genome sequencing and the increasingly massive use of high-throughput approaches produce a vast amount of new information that completely transforms our understanding of thousands of microbial species. However, despite the development of powerful bioinformatics approaches, full interpretation of the content of these genomes remains a difficult task. Launched in 2005, the MicroScope platform (https://www.genoscope.cns.fr/agc/microscope) has been under continuous development and provides analysis for prokaryotic genome projects together with metabolic network reconstruction and post-genomic experiments allowing users to improve the understanding of gene functions. Here we present new improvements of the MicroScope user interface for genome selection, navigation and expert gene annotation. Automatic functional annotation procedures of the platform have also been updated and we added several new tools for the functional annotation of genes and genomic regions. We finally focus on new tools and pipeline developed to perform comparative analyses on hundreds of genomes based on pangenome graphs. To date, MicroScope contains data for >11 800 microbial genomes, part of which are manually curated and maintained by microbiologists (>4500 personal accounts in September 2019). The platform enables collaborative work in a rich comparative genomic context and improves community-based curation efforts.

RevDate: 2019-10-24

Mende DR, Letunic I, Maistrenko OM, et al (2019)

proGenomes2: an improved database for accurate and consistent habitat, taxonomic and functional annotations of prokaryotic genomes.

Nucleic acids research pii:5606617 [Epub ahead of print].

Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/.

RevDate: 2019-10-24

Yin Z, Yuan C, Du Y, et al (2019)

Comparative genomic analysis of the Hafnia genus reveals an explicit evolutionary relationship between the species alvei and paralvei and provides insights into pathogenicity.

BMC genomics, 20(1):768 pii:10.1186/s12864-019-6123-1.

BACKGROUND: The Hafnia genus is an opportunistic pathogen that has been implicated in both nosocomial and community-acquired infections. Although Hafnia is fairly often isolated from clinical material, its taxonomy has remained an unsolved riddle, and the involvement and importance of Hafnia in human disease is also uncertain. Here, we used comparative genomic analysis to define the taxonomy of Hafnia, identify species-specific genes that may be the result of ecological and pathogenic specialization, and reveal virulence-related genetic profiles that may contribute to pathogenesis.

RESULTS: One complete genome sequence and 19 draft genome sequences for Hafnia strains were generated and combined with 27 publicly available genomes. We provided high-resolution typing methods by constructing phylogeny and population structure based on single-copy core genes in combination with whole genome average nucleotide identity to identify two distant Hafnia species (alvei and paralvei) and one mislabeled strain. The open pan-genome and the presence of numerous mobile genetic elements reveal that Hafnia has undergone massive gene rearrangements. Presence of species-specific core genomes associated with metabolism and transport suggests the putative niche differentiation between alvei and paralvei. We also identified possession of diverse virulence-related profiles in both Hafnia species, including the macromolecular secretion system, virulence, and antimicrobial resistance. In the macromolecular system, T1SS, Flagellum 1, Tad pilus and T6SS-1 were conserved in Hafnia, whereas T4SS, T5SS, and other T6SSs exhibited the evolution of diversity. The virulence factors in Hafnia are related to adherence, toxin, iron uptake, stress adaptation, and efflux pump. The identified resistance genes are associated with aminoglycoside, beta-lactam, bacitracin, cationic antimicrobial peptide, fluoroquinolone, and rifampin. These virulence-related profiles identified at the genomic level provide insights into Hafnia pathogenesis and the differentiation between alvei and paralvei.

CONCLUSIONS: Our research using core genome phylogeny and comparative genomics analysis of a larger collection of strains provides a comprehensive view of the taxonomy and species-specific traits between Hafnia species. Deciphering the genome of Hafnia strains possessing a reservoir of macromolecular secretion systems, virulence factors, and resistance genes related to pathogenicity may provide insights into addressing its numerous infections and devising strategies to combat the pathogen.

RevDate: 2019-10-23

Lu F, Wei Z, Luo Y, et al (2019)

SilkDB 3.0: visualizing and exploring multiple levels of data for silkworm.

Nucleic acids research pii:5603220 [Epub ahead of print].

SilkDB is an open-accessibility database and powerful platform that provides comprehensive information on the silkworm (Bombyx mori) genome. Since SilkDB 2.0 was released 10 years ago, vast quantities of data about multiple aspects of the silkworm have been generated, including genome, transcriptome, Hi-C and pangenome. To visualize data at these different biological levels, we present SilkDB 3.0 (https://silkdb.bioinfotoolkits.net), a visual analytic tool for exploring silkworm data through an interactive user interface. The database contains a high-quality chromosome-level assembly of the silkworm genome, and its coding sequences and gene sets are more accurate than those in the previous version. SilkDB 3.0 provides a view of the information for each gene at the levels of sequence, protein structure, gene family, orthology, synteny, genome organization and gives access to gene expression information, genetic variation and genome interaction map. A set of visualization tools are available to display the abundant information in the above datasets. With an improved interactive user interface for the integration of large data sets, the updated SilkDB 3.0 database will be a valuable resource for the silkworm and insect research community.

RevDate: 2019-10-23

Nanayakkara BS, O'Brien CL, DM Gordon (2019)

Phenotypic characteristics contributing to the enhanced growth of Escherichia coli bloom strains.

Environmental microbiology reports [Epub ahead of print].

During bloom events, Escherichia coli cell counts increase to between 10,000 - 100,000 cfu/100 ml of water. The strains responsible for bloom events belong to E. coli phylogenetic groups A and B1, and all have acquired a capsule from Klebsiella. A pan-genome comparison of phylogroup A E. coli revealed that the ferric citrate uptake system (fecIRABCDE) was over-represented in phylogroup A bloom strains compared to non-bloom E. coli. A series of experiments was carried out to investigate if the capsule together with ferric citrate uptake system could confer a growth rate advantage on E. coli. Capsulated strains had a growth rate advantage regardless of the media composition and the presence/absence of the fec operon, and they had a shorter lag phase compared to capsule-negative strains. The results suggest that the Klebsiella capsule may facilitate nutrient uptake or utilisation by a strain. This, together with the protective roles played by the capsule and the shorter lag phase of capsule-positive strains, may explain why it is only capsule-positive strains that produce elevated counts in response to nutrient influx.

RevDate: 2019-10-23

Zhong C, Han M, Yang P, et al (2019)

Comprehensive Analysis Reveals the Evolution and Pathogenicity of Aeromonas, Viewed from Both Single Isolated Species and Microbial Communities.

mSystems, 4(5): pii:4/5/e00252-19.

The genus Aeromonas is a common gastrointestinal pathogen associated with human and animal infections. Due to the high level of cross-species similarity, their evolutionary dynamics and genetic diversity are still fragmented. Hereby, we investigated the pan-genomes of 29 Aeromonas species, as well as Aeromonas species in microbial communities, to clarify their evolutionary dynamics and genetic diversity, with special focus on virulence factors and horizontal gene transfer events. Our study revealed an open pan-genome of Aeromonas containing 10,144 gene families. These Aeromonas species exhibited different functional constraints, with the single-copy core genes and most accessory genes experiencing purifying selection. The significant congruence between core genome and pan-genome trees revealed that core genes mainly affected evolutionary divergences of Aeromonas species. Gene gains and losses revealed a high level of genome plasticity, exhibited by hundreds of gene expansions and contractions, horizontally transferred genes, and mobile genetic elements. The selective constraints shaped virulence gene pools of these Aeromonas strains, where genes encoding hemolysin were ubiquitous. Of these strains, Aeromonas aquatica MX16A seemed to be more resistant, as it harbored most resistance genes. Finally, the virulence factors of Aeromonas in microbial communities were quite dynamic in response to environment changes. For example, the virulence diversity of Aeromonas in microbial communities could reach levels that match some of the most virulent Aeromonas species (such as A. hydrophila) in penetrated-air and modified-air packaging. 
Our work shed some light onto genetic diversity, evolutionary history, and functional features of Aeromonas, which could facilitate the detection and prevention of infections.

IMPORTANCE: Aeromonas has long been known as a gastrointestinal pathogen, yet it has many species whose evolutionary dynamics and genetic diversity had been unclear until now. We have conducted pan-genome analysis for 29 Aeromonas species and revealed a high level of genome plasticity exhibited by hundreds of gene expansions and contractions, horizontally transferred genes, and mobile genetic elements. These species also contained many virulence factors both identified from single isolated species and microbial community. This pan-genome study could elevate the level for detection and prevention of Aeromonas infections.

RevDate: 2019-10-22

Brockhurst MA, Harrison E, Hall JPJ, et al (2019)

The Ecology and Evolution of Pangenomes.

Current biology : CB, 29(20):R1094-R1103.

Since the first genome-scale comparisons, it has been evident that the genomes of many species are unbound by strict vertical descent: Large differences in gene content can occur among genomes belonging to the same prokaryotic species, with only a fraction of genes being universal to all genomes. These insights gave rise to the pangenome concept. The pangenome is defined as the set of all the genes present in a given species and can be subdivided into the accessory genome, present in only some of the genomes, and the core genome, present in all the genomes. Pangenomes arise due to gene gain by genomes from other species through horizontal gene transfer and differential gene loss among genomes, and have been described in both prokaryotes and eukaryotes. Our current view of pangenome variation is phenomenological and incomplete. In this review, we outline the mechanistic, ecological and evolutionary drivers of and barriers to horizontal gene transfer that are likely to structure pangenomes. We highlight the key role of conflict between the host chromosome(s) and the mobile genetic elements that mediate gene exchange. We identify shortcomings in our current models of pangenome evolution and suggest directions for future research to allow a more complete understanding of how and why pangenomes evolve.
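
The core/accessory split defined in this abstract can be illustrated with a minimal Python sketch. The genome names and gene labels below are hypothetical placeholders, not data from the review, and the sketch assumes gene families have already been clustered so that each genome reduces to a set of gene identifiers.

```python
def partition_pangenome(genomes):
    """Split a pangenome into core and accessory gene sets.

    genomes: dict mapping a genome name to the set of gene families it carries.
    Returns (pangenome, core, accessory) as sets.
    """
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set(), set()
    pangenome = set().union(*gene_sets)    # every gene seen in any genome
    core = set.intersection(*gene_sets)    # genes present in all genomes
    accessory = pangenome - core           # genes present in only some genomes
    return pangenome, core, accessory


# Hypothetical toy input: three genomes assigned to one species.
example = {
    "genome_A": {"dnaA", "gyrB", "recA", "fecA"},
    "genome_B": {"dnaA", "gyrB", "recA", "cas9"},
    "genome_C": {"dnaA", "gyrB", "recA", "fecA", "tetK"},
}

pan, core, accessory = partition_pangenome(example)
print(sorted(core))       # ['dnaA', 'gyrB', 'recA']
print(sorted(accessory))  # ['cas9', 'fecA', 'tetK']
```

Real analyses often relax the "present in all genomes" rule (for example, to tolerate incomplete assemblies); that choice is outside what the abstract specifies.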

RevDate: 2019-10-20

Zhao S, Ci J, Xue J, et al (2019)

Cutibacterium acnes Type II strains are associated with acne in Chinese patients.

Antonie van Leeuwenhoek pii:10.1007/s10482-019-01344-x [Epub ahead of print].

Acne is a common inflammatory skin disease, especially in adolescents. Certain Cutibacterium acnes subtypes are associated with acne, although more than one subtype of C. acnes strains may simultaneously reside on the surface of the skin of an individual. To better understand the relationship between the genomic characteristics of C. acnes subtypes and acne, we collected 50 C. acnes strains from the facial skin of 10 people (5 healthy individuals, 5 patients with acne) in Liaoning, China and performed whole genome sequencing of all strains. We demonstrated that the six potential pathogenic C. acnes strains were all Type II subtype, and discovered 90 unique genes of the six strains related to acne using pan-genome analysis. The distribution of 2 of the 90 genes was identified by PCR in bacterial cultures collected from the facial skin of 171 individuals (55 healthy individuals, 52 patients with mild acne and 64 patients with moderate to severe acne). Both genes were significantly associated with acne (chi-square test, P < 0.01). We conclude that Type II strains are associated with acne in Chinese patients.

RevDate: 2019-10-18

Mangas EL, Rubio A, Álvarez-Marín R, et al (2019)

Pangenome of Acinetobacter baumannii uncovers two groups of genomes, one of them with genes involved in CRISPR/Cas defence systems associated with the absence of plasmids and exclusive genes for biofilm formation.

Microbial genomics [Epub ahead of print].

Acinetobacter baumannii is an opportunistic bacterium that causes hospital-acquired infections with a high mortality and morbidity, since there are strains resistant to virtually any kind of antibiotic. The chase to find novel strategies to fight against this microbe can be favoured by knowledge of the complete catalogue of genes of the species, and their relationship with the specific characteristics of different isolates. In this work, we performed a genomics analysis of almost 2500 strains. Two different groups of genomes were found based on the number of shared genes. One of these groups rarely has plasmids, and bears clustered regularly interspaced short palindromic repeat (CRISPR) sequences, in addition to CRISPR-associated genes (cas genes) or restriction-modification system genes. This fact strongly supports the lack of plasmids. Furthermore, the scarce plasmids in this group also bear CRISPR sequences, and specifically contain genes involved in prokaryotic toxin-antitoxin systems that could either act as the still little known CRISPR type IV system or be the precursors of other novel CRISPR/Cas systems. In addition, a limited set of strains present a new cas9-like gene, which may complement the other cas genes in inhibiting the entrance of new plasmids into the bacteria. Finally, this group has exclusive genes involved in biofilm formation, which would connect CRISPR systems to the biogenesis of these bacterial resistance structures.

RevDate: 2019-10-18

Wan X (2019)

Comparative Genome Analyses Reveal the Genomic Traits and Host Plant Adaptations of Flavobacterium akiainvivens IK-1T.

International journal of molecular sciences, 20(19): pii:ijms20194910.

The genus Flavobacterium contains a large group of commensal bacteria identified in diverse terrestrial and aquatic habitats. We compared the genome of a new species Flavobacterium akiainvivens IK-1T to publicly available genomes of Flavobacterium species to reveal the genomic traits and ecological roles of IK-1T. Principal component analysis (PCA) of carbohydrate-active enzyme classes suggests that IK-1T belongs to a terrestrial clade of Flavobacterium. In addition, type 2 and type 9 secretion systems involved in bacteria-environment interactions were identified in the IK-1T genome. The IK-1T genome encodes eukaryotic-like domain containing proteins including ankyrin repeats, von Willebrand factor type A domain, and major royal jelly proteins, suggesting that IK-1T may alter plant host physiology by secreting eukaryotic-like proteins that mimic host proteins. A novel two-component system FaRpfC-FaYpdB was identified in the IK-1T genome, which may mediate quorum sensing to regulate global gene expression. Our comparative genome analyses of Flavobacterium spp. suggest that IK-1T has adapted to a terrestrial niche. Further functional characterization of IK-1T secreted proteins and their regulation systems will shed light on the molecular basis of bacteria-plant interactions in the environment.

RevDate: 2019-10-17

Zhang Y, Zhang Z, Zhang H, et al (2019)

PADS Arsenal: a database of prokaryotic defense systems related genes.

Nucleic acids research pii:5588688 [Epub ahead of print].

Defense systems are vital weapons for prokaryotes to resist heterologous DNA and survive the constant invasion of viruses, and they are widely used in biochemistry investigation and antimicrobial drug research. So far, numerous types of defense systems have been discovered, but there is no comprehensive defense systems database to organize prokaryotic defense gene datasets. To fill this gap, we unveil the prokaryotic antiviral defense system (PADS) Arsenal (https://bigd.big.ac.cn/padsarsenal), a public database dedicated to gathering, storing, analyzing and visualizing prokaryotic defense gene datasets. The initial version of PADS Arsenal integrates 18 distinctive categories of defense system with the annotation of 6,600,264 genes retrieved from 63,701 genomes across 33,390 species of archaea and bacteria. PADS Arsenal provides various ways to retrieve information on defense system-related genes and visualize it with multifarious function modes. Moreover, an online analysis pipeline is integrated into PADS Arsenal to facilitate annotation and evolutionary analysis of defense genes. PADS Arsenal can also visualize the dynamic variation information of defense genes from pan-genome analysis. Overall, PADS Arsenal is a state-of-the-art open comprehensive resource to accelerate the research of prokaryotic defense systems.

RevDate: 2019-10-17

Li R, Tian X, Yang P, et al (2019)

Recovery of non-reference sequences missing from the human reference genome.

BMC genomics, 20(1):746 pii:10.1186/s12864-019-6107-1.

BACKGROUND: The non-reference sequences (NRS) represent structural variations in the human genome with potential functional significance. However, besides the known insertions, it is currently unknown whether other types of structural variations with NRS exist.

RESULTS: Here, we compared 31 human de novo assemblies with the current reference genome to identify the NRS and their location. We resolved the precise location of 6113 NRS adding up to 12.8 Mb. Besides 1571 insertions, we detected 3041 alternate alleles, which were defined as having less than 90% (or none) identity with the reference alleles. These alternate alleles overlapped with 1143 protein-coding genes including a putative novel MHC haplotype. Further, we demonstrated that the alternate alleles and their flanking regions had high content of tandem repeats, indicating that their origin was associated with tandem repeats.

CONCLUSIONS: Our study detected a large number of NRS including many alternate alleles which were previously uncharacterized. We suggest that the origin of alternate alleles was associated with tandem repeats. Our results enrich the spectrum of genetic variations in the human genome.

RevDate: 2019-10-15

Hoarfrost A, Nayfach S, Ladau J, et al (2019)

Global ecotypes in the ubiquitous marine clade SAR86.

The ISME journal pii:10.1038/s41396-019-0516-7 [Epub ahead of print].

SAR86 is an abundant and ubiquitous heterotroph in the surface ocean that plays a central role in the function of marine ecosystems. We hypothesized that despite its ubiquity, different SAR86 subgroups may be endemic to specific ocean regions and functionally specialized for unique marine environments. However, the global biogeographical distributions of SAR86 genes, and the manner in which these distributions correlate with marine environments, have not been investigated. We quantified SAR86 gene content across globally distributed metagenomic samples and modeled these gene distributions as a function of 51 environmental variables. We identified five distinct clusters of genes within the SAR86 pangenome, each with a unique geographic distribution associated with specific environmental characteristics. Gene clusters are characterized by the strong taxonomic enrichment of distinct SAR86 genomes and partial assemblies, as well as differential enrichment of certain functional groups, suggesting differing functional and ecological roles of SAR86 ecotypes. We then leveraged our models and high-resolution, remote sensing-derived environmental data to predict the distributions of SAR86 gene clusters across the world's oceans, creating global maps of SAR86 ecotype distributions. Our results reveal that SAR86 exhibits previously unknown, complex biogeography, and provide a framework for exploring geographic distributions of genetic diversity from other microbial clades.

RevDate: 2019-10-14

Tralamazza SM, Rocha LO, Oggenfuss U, et al (2019)

Complex evolutionary origins of specialized metabolite gene cluster diversity among the plant pathogenic fungi of the Fusarium graminearum species complex.

Genome biology and evolution pii:5586976 [Epub ahead of print].

Fungal genomes encode highly organized gene clusters that underlie the production of specialized (or secondary) metabolites. Gene clusters encode key functions to exploit plant hosts or environmental niches. Promiscuous exchange among species and frequent reconfigurations make gene clusters some of the most dynamic elements of fungal genomes. Despite evidence for high diversity in gene cluster content among closely related strains, the microevolutionary processes driving gene cluster gain, loss and neofunctionalization are largely unknown. We analyzed the Fusarium graminearum species complex (FGSC) composed of plant pathogens producing potent mycotoxins and causing Fusarium head blight on cereals. We de novo assembled genomes of previously uncharacterized FGSC members (two strains of F. austroamericanum, F. cortaderiae and F. meridionale). Our analyses of eight species of the FGSC in addition to 15 other Fusarium species identified a pangenome of 54 gene clusters within FGSC. We found that multiple independent losses were a key factor generating extant cluster diversity within the FGSC and the Fusarium genus. We identified a modular gene cluster conserved among distantly related fungi, which was likely reconfigured to encode different functions. We also found strong evidence that a rare cluster in FGSC was gained through an ancient horizontal transfer between bacteria and fungi. Chromosomal rearrangements underlying cluster loss were often complex and were likely facilitated by an enrichment in specific transposable elements. Our findings identify important transitory stages in the birth and death process of specialized metabolism gene clusters among very closely related species.

RevDate: 2019-10-14

Tett A, Huang KD, Asnicar F, et al (2019)

The Prevotella copri Complex Comprises Four Distinct Clades Underrepresented in Westernized Populations.

Cell host & microbe pii:S1931-3128(19)30427-5 [Epub ahead of print].

Prevotella copri is a common human gut microbe that has been both positively and negatively associated with host health. In a cross-continent meta-analysis exploiting >6,500 metagenomes, we obtained >1,000 genomes and explored the genetic and population structure of P. copri. P. copri encompasses four distinct clades (>10% inter-clade genetic divergence) that we propose constitute the P. copri complex, and all clades were confirmed by isolate sequencing. These clades are nearly ubiquitous and co-present in non-Westernized populations. Genomic analysis showed substantial functional diversity in the complex with notable differences in carbohydrate metabolism, suggesting that multi-generational dietary modifications may be driving reduced prevalence in Westernized populations. Analysis of ancient metagenomes highlighted patterns of P. copri presence consistent with modern non-Westernized populations and a clade delineation time pre-dating human migratory waves out of Africa. These findings reveal that P. copri exhibits a high diversity that is underrepresented in Western-lifestyle populations.

RevDate: 2019-10-10

Chen S, Soehnlen M, Blom J, et al (2019)

Comparative genomic analyses reveal diverse virulence factors and antimicrobial resistance mechanisms in clinical Elizabethkingia meningoseptica strains.

PloS one, 14(10):e0222648 pii:PONE-D-19-17060.

Three human clinical isolates of bacteria (designated strains Em1, Em2 and Em3) had high average nucleotide identity (ANI) to Elizabethkingia meningoseptica. Their genome sizes (3.89, 4.04 and 4.04 Mb) were comparable to those of other Elizabethkingia species and strains, and exhibited open pan-genome characteristics, with two strains being nearly identical and the third divergent. These strains were susceptible only to trimethoprim/sulfamethoxazole and ciprofloxacin amongst 16 antibiotics in minimum inhibitory tests. The resistome exhibited a high diversity of resistance genes, including 5 different lactamase-encoding and 18 efflux protein-encoding genes. Forty-four genes encoding virulence factors were conserved among the strains. Sialic acid transporters and curli synthesis genes were well conserved in E. meningoseptica but absent in E. anophelis and E. miricola. E. meningoseptica carried several genes contributing to biofilm formation. 58 glycoside hydrolases (GH) and 25 putative polysaccharide utilization loci (PULs) were found. The strains carried numerous genes encoding two-component system proteins (56), transcription factor proteins (187 to 191), and DNA-binding proteins (6 to 7). Several prophages and CRISPR/Cas elements were uniquely present in the genomes.

RevDate: 2019-10-10

Bayliss SC, Thorpe HA, Coyle NM, et al (2019)

PIRATE: A fast and scalable pangenomics toolbox for clustering diverged orthologues in bacteria.

GigaScience, 8(10):.

BACKGROUND: Cataloguing the distribution of genes within natural bacterial populations is essential for understanding evolutionary processes and the genetic basis of adaptation. Advances in whole genome sequencing technologies have led to a vast expansion in the amount of bacterial genomes deposited in public databases. There is a pressing need for software solutions which are able to cluster, catalogue and characterise genes, or other features, in increasingly large genomic datasets.

RESULTS: Here we present a pangenomics toolbox, PIRATE (Pangenome Iterative Refinement and Threshold Evaluation), which identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds. PIRATE builds upon recent scalable software developments to allow for the rapid interrogation of thousands of isolates. PIRATE clusters genes (or other annotated features) over a wide range of amino acid or nucleotide identity thresholds and uses the clustering information to rapidly identify paralogous gene families and putative fission/fusion events. Furthermore, PIRATE orders the pangenome using a directed graph, provides a measure of allelic variation, and estimates sequence divergence for each gene family.

CONCLUSIONS: We demonstrate that PIRATE scales linearly with both number of samples and computation resources, allowing for analysis of large genomic datasets, and compares favorably to other popular tools. PIRATE provides a robust framework for analysing bacterial pangenomes, from largely clonal to panmictic species.
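
The idea of clustering genes at several identity thresholds can be sketched in a few lines of Python. This is not PIRATE's implementation: it is a toy single-linkage clustering over a precomputed pairwise identity table, and the gene names and identity values are hypothetical. It only illustrates how a family that holds together at a loose threshold splits as the threshold rises.

```python
def cluster_at_threshold(genes, identity, threshold):
    """Single-linkage clustering: link two genes if their identity >= threshold.

    genes: list of gene names.
    identity: dict mapping (gene_a, gene_b) to percent identity;
              pairs missing from the dict are treated as unlinked.
    Returns a sorted list of clusters, each a sorted list of gene names.
    """
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent pointers to the cluster representative (path halving).
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), ident in identity.items():
        if ident >= threshold:
            parent[find(a)] = find(b)   # merge the two clusters

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pairwise identities (percent) between four genes.
genes = ["g1", "g2", "g3", "g4"]
identity = {
    ("g1", "g2"): 98.0,
    ("g2", "g3"): 82.0,
    ("g1", "g3"): 80.0,
    ("g3", "g4"): 55.0,
}

for threshold in (50, 70, 90):
    print(threshold, cluster_at_threshold(genes, identity, threshold))
# 50 [['g1', 'g2', 'g3', 'g4']]
# 70 [['g1', 'g2', 'g3'], ['g4']]
# 90 [['g1', 'g2'], ['g3'], ['g4']]
```

Comparing the cluster memberships across thresholds shows which genes are close variants (g1 and g2 stay together at 90%) and which are only distantly related members of the same family.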

RevDate: 2019-10-07

John J, George S, Nori SRC, et al (2019)

Evolutionary route of resistant genes in Staphylococcus aureus.

Genome biology and evolution pii:5582665 [Epub ahead of print].

Multi-drug resistant S. aureus is a leading concern worldwide. Coagulase-Negative Staphylococci (CoNS) are claimed to be the reservoir and source of important resistance elements in S. aureus. However, the origin and evolutionary route of resistance genes in S. aureus remain unknown. Here, we performed a detailed phylogenomic analysis of 152 completely sequenced S. aureus strains in comparison with 7,529 non-Staphylococcus aureus reference bacterial genomes. Our results reveal that S. aureus has a large open pan-genome in which 97 (55%) of its known resistance-related genes belong to its accessory genome. Among these genes, 47 (27%) were located within the Staphylococcal Cassette Chromosome (SCCmec), a transposable element responsible for resistance against major classes of antibiotics including beta-lactams, macrolides and aminoglycosides. However, the physically linked mec-box genes (MecA-MecR-MecI) that are responsible for the maintenance of SCCmec elements are not unique to S. aureus; instead, they are widely distributed within the Staphylococcaceae family. The phyletic patterns of SCCmec-encoded resistance genes in Staphylococcus species are significantly different from those of its core genes, indicating frequent exchange of these genes between Staphylococcus species. Our in-depth analysis of SCCmec resistance gene phylogenies reveals that genes such as blaZ, ble, kmA and tetK that are responsible for beta-lactam, bleomycin, kanamycin and tetracycline resistance in S. aureus were laterally transferred from non-Staphylococcus sources. In addition, at least 11 non-SCCmec-encoded resistance genes in S. aureus, mostly present on plasmids, were laterally acquired from distantly related species. Our study shows that gene transfers played a crucial role in shaping the evolution of antibiotic resistance in S. aureus.

RevDate: 2019-10-04

Li G, Ji B, Nielsen J (2019)

The pan-genome of Saccharomyces cerevisiae.

FEMS yeast research pii:5581504 [Epub ahead of print].

Understanding genotype-phenotype relationships is fundamental in biology. With the benefit of next-generation sequencing and high-throughput phenotyping methodologies, a large amount of genome and phenome data has been generated for Saccharomyces cerevisiae. This makes it an excellent model system to understand the genotype-phenotype relationship. In this paper, we present the reconstruction and application of the yeast pan-genome in resolving genotype-phenotype relationships by a machine learning-assisted approach.

RevDate: 2019-10-04

Ferrés I, Fresia P, Iraola G (2019)

simurg: simulate bacterial pangenomes in R.

Bioinformatics (Oxford, England) pii:5581402 [Epub ahead of print].

MOTIVATION: The pangenome concept describes genetic variability as the union of genes shared in a set of genomes and constitutes the current paradigm for comparative analysis of bacterial populations. However, there is a lack of tools to simulate pangenome variability and structure using defined evolutionary models.

RESULTS: We developed simurg, an R package that allows the simulation of bacterial pangenomes using different combinations of evolutionary constraints such as gene gain, gene loss and mutation rates. Our tool allows the straightforward and reproducible simulation of bacterial pangenomes using real sequence data, providing a valuable tool for benchmarking pangenome software or comparing evolutionary hypotheses.

AVAILABILITY: The simurg package is released under the GPL-3 license, and is freely available for download from GitHub (https://github.com/iferres/simurg).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

RevDate: 2019-10-03

Sabbagh CRR, Carrere S, Lonjon F, et al (2019)

Pangenomic type III effector database of the plant pathogenic Ralstonia spp.

PeerJ, 7:e7346 pii:7346.

Background: The bacterial plant pathogenic Ralstonia species belong to the beta-proteobacteria class and are soil-borne pathogens causing vascular bacterial wilt disease, affecting a wide range of plant hosts. These bacteria form a heterogeneous group considered as a "species complex" gathering three newly defined species. Like many other Gram negative plant pathogens, Ralstonia pathogenicity relies on a type III secretion system, enabling bacteria to secrete/inject a large repertoire of type III effectors into their plant host cells. Type III-secreted effectors (T3Es) are thought to participate in generating a favorable environment for the pathogen (countering plant immunity and modifying the host metabolism and physiology).

Methods: Expert genome annotation, followed by specific type III-dependent secretion, allowed us to improve our Hidden-Markov-Model and Blast profiles for the prediction of type III effectors.

Results: We curated the T3E repertoires of 12 plant pathogenic Ralstonia strains spread over the different groups of the species complex. This generated a pangenome repertoire of 102 T3E genes and 16 hypothetical T3E genes. Using this database, we scanned for the presence of T3Es in the 155 available genomes representing 140 distinct plant pathogenic Ralstonia strains isolated from different host plants in different areas of the globe. All this information is presented in a searchable database. A presence/absence analysis, modulated by a strain sequence/gene annotation quality score, enabled us to redefine core and accessory T3E repertoires.

RevDate: 2019-10-01

Song B, Song Y, Fu Y, et al (2019)

Draft genome sequence of Solanum aethiopicum provides insights into disease resistance, drought tolerance, and the evolution of the genome.

GigaScience, 8(10):.

BACKGROUND: The African eggplant (Solanum aethiopicum) is a nutritious traditional vegetable used in many African countries, including Uganda and Nigeria. It is thought to have been domesticated in Africa from its wild relative, Solanum anguivi. S. aethiopicum has been routinely used as a source of disease resistance genes for several Solanaceae crops, including Solanum melongena. A lack of genomic resources has meant that breeding of S. aethiopicum has lagged behind other vegetable crops.

RESULTS: We assembled a 1.02-Gb draft genome of S. aethiopicum, which contained predominantly repetitive sequences (78.9%). We annotated 37,681 gene models, including 34,906 protein-coding genes. Expansion of disease resistance genes was observed via 2 rounds of amplification of long terminal repeat retrotransposons, which may have occurred ∼1.25 and 3.5 million years ago, respectively. By resequencing 65 S. aethiopicum and S. anguivi genotypes, 18,614,838 single-nucleotide polymorphisms were identified, of which 34,171 were located within disease resistance genes. Analysis of domestication and demographic history revealed active selection for genes involved in drought tolerance in both "Gilo" and "Shum" groups. A pan-genome of S. aethiopicum was assembled, containing 51,351 protein-coding genes; 7,069 of these genes were missing from the reference genome.

CONCLUSIONS: The genome sequence of S. aethiopicum enhances our understanding of its biotic and abiotic resistance. The single-nucleotide polymorphisms identified are immediately available for use by breeders. The information provided here will accelerate selection and breeding of the African eggplant, as well as other crops within the Solanaceae family.

RevDate: 2019-10-01

Wang D, Gao F (2019)

Comprehensive Analysis of Replication Origins in Saccharomyces cerevisiae Genomes.

Frontiers in microbiology, 10:2122.

DNA replication initiates from multiple replication origins (ORIs) in eukaryotes. Discovery and characterization of replication origins are essential for a better understanding of the molecular mechanism of DNA replication. In this study, the features of autonomously replicating sequences (ARSs) in Saccharomyces cerevisiae have been comprehensively analyzed as follows. Firstly, we carried out the analysis of the ARSs available in S. cerevisiae S288C. By evaluating the sequence similarity of experimentally established ARSs, we found that 94.32% of ARSs are unique across the whole genome of S. cerevisiae S288C and those with high sequence similarity are prone to locate in subtelomeres. Subsequently, we built a non-redundant dataset with a total of 520 ARSs, which are based on ARSs annotation of S. cerevisiae S288C from SGD and then supplemented with those from OriDB and DeOri databases. We conducted a large-scale comparison of ORIs among the diverse budding yeast strains from a population genomics perspective. We found that 82.7% of ARSs are not only conserved in genomic sequence but also relatively conserved in chromosomal position. The non-conserved ARSs tend to distribute in the subtelomeric regions. We also conducted a pan-genome analysis of ARSs among the S. cerevisiae strains, and a total of 183 core ARSs existing in all yeast strains were determined. We extracted the genes adjacent to replication origins among the 104 yeast strains to examine whether there are differences in their gene functions. The result showed that the genes involved in the initiation of DNA replication, such as orc3, mcm2, mcm4, mcm6, and cdc45, are conservatively located adjacent to the replication origins. 
Furthermore, we found that the genes adjacent to conserved ARSs are significantly enriched in DNA binding, enzyme activity, transportation, and energy, whereas the genes adjacent to non-conserved ARSs are significantly enriched in response to environmental stress, metabolite biosynthetic processes and biosynthesis of antibiotics. In general, we characterized the replication origins from the genome-wide and population genomics perspectives, which would provide new insights into the replication mechanism of S. cerevisiae and facilitate the design of algorithms to identify genome-wide replication origins in yeast.

RevDate: 2019-09-25

Dolatabadian A, Bayer PE, Tirnaz S, et al (2019)

Characterisation of disease resistance genes in the Brassica napus pangenome reveals significant structural variation.

Plant biotechnology journal [Epub ahead of print].

Methods based on single nucleotide polymorphism (SNP), copy number variation (CNV) and presence/absence variation (PAV) discovery provide a valuable resource to study gene structure and evolution. However, as a result of these structural variations, a single reference genome is unable to cover the entire gene content of a species. Therefore pangenomics analysis is needed to ensure that the genomic diversity within a species is fully represented. Brassica napus is one of the most important oilseed crops in the world and exhibits variability in its resistance genes across different cultivars. Here, we characterised resistance gene distribution across 50 B. napus lines. We identified a total of 1,749 resistance gene analogs (RGAs), of which 996 are core and 753 are variable, 368 of which are not present in the reference genome (cv. Darmor-bzh). In addition, a total of 15,318 SNPs were predicted within 1,030 of the RGAs. The results showed that core R-genes harbour more SNPs than variable genes. More nucleotide binding site leucine-rich repeat (NBS-LRR) genes were located in clusters than as singletons, with variable genes more likely to be found in clusters. We identified 106 RGA candidates linked to blackleg resistance quantitative trait locus (QTL). This study provides a better understanding of resistance genes to target for genomics-based improvement and improved disease resistance.

RevDate: 2019-09-25

Zhang W, Wang J, Zhang D, et al (2019)

Complete Genome Sequencing and Comparative Genome Characterization of Lactobacillus johnsonii ZLJ010, a Potential Probiotic With Health-Promoting Properties.

Frontiers in genetics, 10:812.

Lactobacillus johnsonii ZLJ010 is a probiotic strain isolated from the feces of a healthy sow and has putative health-promoting properties. To determine the molecular basis underlying the probiotic potential of ZLJ010 and the genes involved in the same, complete genome sequencing and comparative genome analysis with L. johnsonii ZLJ010 were performed. The ZLJ010 genome was found to contain a single circular chromosome of 1,999,879 bp with a guanine-cytosine (GC) content of 34.91% and encoded 18 ribosomal RNA (rRNA) genes and 77 transfer RNA (tRNA) genes. From among the 1,959 protein coding sequences (CDSs), genes known to confer probiotic properties were identified, including genes related to stress adaptation, biosynthesis, metabolism, transport of amino acid, secretion, and the defense machinery. ZLJ010 lacked complete or partial biosynthetic pathways for amino acids but was predicted to compensate for this with an enhanced transport system and some unique amino acid permeases and peptidases that allow it to acquire amino acids and other precursors exogenously. The comparative genomic analysis of L. johnsonii ZLP001 and seven other available L. johnsonii strains, including L. johnsonii NCC533, FI9785, DPC6026, N6.2, BS15, UMNLJ22, and PF01, revealed 2,732 pan-genome orthologous gene clusters and 1,324 core-genome orthologous gene clusters. Phylogenomic analysis based on 1,288 single copy genes showed that ZLJ010 had a closer relationship with the BS15 from yogurt and DPC6026 from the porcine intestinal tract but was located on a relatively standalone branch. The number of clusters of unique, strain-specific genes ranged from 42 to 185. A total of 219 unique genes present in the genome of L. johnsonii ZLJ010 primarily encoded proteins that are putatively involved in replication, recombination and repair, defense mechanisms, transcription, amino acid transport and metabolism, and carbohydrate transport and metabolism. Two unique prophages were predicted in the ZLJ010 genome. The present study helps us understand the ability of L. johnsonii ZLJ010 to better adapt to the gut environment and also its probiotic functionalities.

RevDate: 2019-09-25

Pain M, Hjerde E, Klingenberg C, et al (2019)

Comparative Genomic Analysis of Staphylococcus haemolyticus Reveals Key to Hospital Adaptation and Pathogenicity.

Frontiers in microbiology, 10:2096.

Staphylococcus haemolyticus is a skin commensal gaining increased attention as an emerging pathogen of nosocomial infections. However, knowledge about the transition from a commensal to an invasive lifestyle remains sparse and there is a paucity of studies comparing pathogenicity traits between commensal and clinical isolates. In this study, we used a pan-genomic approach to identify factors important for infection and hospital adaptation by exploring the genomic variability of 123 clinical isolates and 46 commensal S. haemolyticus isolates. Phylogenetic reconstruction grouped the 169 isolates into six clades with a distinct distribution of clinical and commensal isolates in the different clades. Phenotypically, multi-drug antibiotic resistance was detected in 108/123 (88%) of the clinical isolates and 5/46 (11%) of the commensal isolates (p < 0.05). In the clinical isolates, we commonly identified a homolog of the serine-rich repeat glycoproteins sraP. Additionally, three novel capsular polysaccharide operons were detected, with a potential role in S. haemolyticus virulence. Clinical S. haemolyticus isolates showed specific signatures associated with successful hospital adaptation. Biofilm forming S. haemolyticus isolates that are resistant to oxacillin (mecA) and aminoglycosides (aacA-aphD) are most likely invasive isolates whereas absence of these traits strongly indicates a commensal isolate. We conclude that our data show a clear segregation of isolates of commensal origin, and specific genetic signatures distinguishing the clinical isolates from the commensal isolates. The widespread use of antimicrobial agents has probably promoted the development of successful hospital-adapted clones of S. haemolyticus through acquisition of mobile genetic elements or beneficial point mutations and rearrangements in surface associated genes.

RevDate: 2019-09-23

Heo S, Lee J, Lee JH, et al (2019)

Genomic insight into the salt tolerance of Enterococcus faecium, Enterococcus faecalis and Tetragenococcus halophilus.

Journal of microbiology and biotechnology pii:10.4014/jmb.1908.08015 [Epub ahead of print].

To shed light on the genetic basis of salt tolerance in Enterococcus faecium, Enterococcus faecalis, and Tetragenococcus halophilus, we performed comparative genome analysis of 10 E. faecalis, 11 E. faecium, and three T. halophilus strains. Factors involved in salt tolerance that could be used to distinguish the species were identified. Overall, T. halophilus contained a greater number of potassium transport and osmoprotectant synthesis genes compared with the other two species. In particular, our findings suggested that T. halophilus may be the only one among the three species capable of synthesizing glycine betaine from choline, cardiolipin from glycerol and proline from citrate. These molecules are well-known osmoprotectants; thus, we propose that these genes confer the salt-tolerance of T. halophilus.

RevDate: 2019-09-23

Hatje K, Mühlhausen S, Simm D, et al (2019)

The Protein-Coding Human Genome: Annotating High-Hanging Fruits.

BioEssays : news and reviews in molecular, cellular and developmental biology [Epub ahead of print].

The major transcript variants of human protein-coding genes are annotated to a certain degree of accuracy combining manual curation, transcript data, and proteomics evidence. However, there is considerable disagreement on the annotation of about 2000 genes (they can be protein-coding, noncoding, or pseudogenes) and on the annotation of most of the predicted alternative transcripts. Pure transcriptome mapping approaches seem to be limited in discriminating functional expression from noise. These limitations have partially been overcome by dedicated algorithms to detect alternative spliced micro-exons and wobble splice variants. Recently, knowledge about splice mechanism and protein structure has been incorporated into an algorithm to predict neighboring homologous exons, often spliced in a mutually exclusive manner. Predicted exons are evaluated by transcript data, structural compatibility, and evolutionary conservation, revealing hundreds of novel coding exons and splice mechanism re-assignments. The emerging human pan-genome is necessitating distinctive annotations incorporating differences between individuals and between populations.

RevDate: 2019-09-18

Erwin DH (2019)

Tempos and modes of collectivity in the history of life.

Theory in biosciences = Theorie in den Biowissenschaften pii:10.1007/s12064-019-00303-4 [Epub ahead of print].

Collective integration and processing of information have increased through the history of life, through both the formation of aggregates in which the entities may have very different properties and which jointly coarse-grained environmental variables (ranging from widely varying metabolism in microbial consortia to the ecological diversity of species on reefs) and through collectives of similar entities (such as cells within an organism or social groups). Such increases have been implicated in significant transitions in the history of life, including aspects of the origin of life, the generation of pangenomes among microbes and microbial communities such as stromatolites, multicellularity and social insects. This contribution provides a preliminary overview of the dominant modes of collective information processing in the history of life, their phylogenetic distribution and extent of convergence, and the effects of new modes for integrating and acting upon information on the tempo of evolutionary change.

RevDate: 2019-09-12

Sigalova OM, Chaplin AV, Bochkareva OO, et al (2019)

Chlamydia pan-genomic analysis reveals balance between host adaptation and selective pressure to genome reduction.

BMC genomics, 20(1):710 pii:10.1186/s12864-019-6059-5.

BACKGROUND: Chlamydia are ancient intracellular pathogens with reduced, though strikingly conserved genome. Despite their parasitic lifestyle and isolated intracellular environment, these bacteria managed to avoid accumulation of deleterious mutations leading to subsequent genome degradation characteristic for many parasitic bacteria.

RESULTS: We report pan-genomic analysis of sixteen species from genus Chlamydia including identification and functional annotation of orthologous genes, and characterization of gene gains, losses, and rearrangements. We demonstrate the overall genome stability of these bacteria as indicated by a large fraction of common genes with conserved genomic locations. On the other hand, extreme evolvability is confined to several paralogous gene families such as polymorphic membrane proteins and phospholipase D, and likely is caused by the pressure from the host immune system.

CONCLUSIONS: This combination of a large, conserved core genome and a small, evolvable periphery likely reflects the balance between the selective pressure towards genome reduction and the need to adapt to escape from the host immunity.

RevDate: 2019-09-12

Ghaffaari A, T Marschall (2019)

Fully-sensitive seed finding in sequence graphs using a hybrid index.

Bioinformatics (Oxford, England), 35(14):i81-i89.

MOTIVATION: Sequence graphs are versatile data structures that are, for instance, able to represent the genetic variation found in a population and to facilitate genome assembly. Read mapping to sequence graphs constitutes an important step for many applications and is usually done by first finding exact seed matches, which are then extended by alignment. Existing methods for finding seed hits prune the graph in complex regions, leading to a loss of information especially in highly polymorphic regions of the genome. While such complex graph structures can indeed lead to a combinatorial explosion of possible alleles, the query set of reads from a diploid individual realizes only two alleles per locus, a property that is not exploited by extant methods.

RESULTS: We present the Pan-genome Seed Index (PSI), a fully-sensitive hybrid method for seed finding, which takes full advantage of this property by combining an index over selected paths in the graph with an index over the query reads. This enables PSI to find all seeds while eliminating the need to prune the graph. We demonstrate its performance with different parameter settings on both simulated data and on a whole human genome graph constructed from variants in the 1000 Genome Project dataset. On this graph, PSI outperforms GCSA2 in terms of index size, query time and sensitivity.

The C++ implementation is publicly available at: https://github.com/cartoonist/psi.

RevDate: 2019-09-11

Espadinha D, Sobral RG, Mendes CI, et al (2019)

Distinct Phenotypic and Genomic Signatures Underlie Contrasting Pathogenic Potential of Staphylococcus epidermidis Clonal Lineages.

Frontiers in microbiology, 10:1971.

Background: Staphylococcus epidermidis is a common skin commensal that has emerged as a pathogen in hospitals, mainly related to medical devices-associated infections. Noteworthy, infection rates by S. epidermidis have the tendency to rise steeply in next decades together with medical devices use and immunocompromised population growth. Staphylococcus epidermidis population structure includes two major clonal lineages (A/C and B) that present contrasting pathogenic potentials. To address this distinction and explore the basis of increased pathogenicity of A/C lineage, we performed a detailed comparative analysis using phylogenetic and integrated pangenome-wide-association study (panGWAS) approaches and compared the lineages' phenotypes in in vitro conditions mimicking carriage and infection. Results: Each S. epidermidis lineage had distinct phenotypic signatures in skin and infection conditions and differed in genomic content. Combination of phenotypic and genotypic data revealed that both lineages were well adapted to skin environmental cues. However, they appear to occupy different skin niches, perform distinct biological functions in the skin and use different mechanisms to complete the same function: lineage B strains showed evidence of specialization to survival in microaerobic and lipid rich environment, characteristic of hair follicle and sebaceous glands; lineage A/C strains showed evidence for adaptation to diverse osmotic and pH conditions, potentially allowing them to occupy a broader and more superficial skin niche. In infection conditions, A/C strains had an advantage, having the potential to bind blood-associated host matrix proteins, form biofilms at blood pH, resist antibiotics and macrophage acidity and to produce proteases. These features were observed to be rare in the lineage B strains. PanGWAS analysis produced a catalog of putative S. epidermidis virulence factors and identified an epidemiological molecular marker for the more pathogenic lineage. Conclusion: The prevalence of A/C lineage in infection is probably related to a higher metabolic and genomic versatility that allows rapid adaptation during transition from a commensal to a pathogenic lifestyle. The putative virulence and phenotypic factors associated to A/C lineage constitute a reliable framework for future studies on S. epidermidis pathogenesis and the finding of an epidemiological marker for the more pathogenic lineage is an asset for the management of S. epidermidis infections.

RevDate: 2019-09-11

Fariq A, Blazier JC, Yasmin A, et al (2019)

Whole genome sequence analysis reveals high genetic variation of newly isolated Acidithiobacillus ferrooxidans IO-2C.

Scientific reports, 9(1):13049 pii:10.1038/s41598-019-49213-x.

Acidithiobacillus ferrooxidans, a chemolithoautotrophic bacterium, is well known for its mineral oxidizing properties. The current study combines experimental and whole genome sequencing approaches to investigate an iron oxidizing, extreme acidophilic bacterium, A. ferrooxidans isolate (IO-2C) from an acid seep area near Carlos, TX, USA. Strain IO-2C was capable of oxidizing iron i.e. iron sulphate and iron ammonium sulphate yielding schwertmannite and jarosite minerals. Further, the bacterium's genome was sequenced, assembled and annotated to study its general features, structure and functions. To determine genetic heterogeneity, it was compared with the genomes of other published A. ferrooxidans strains. Pan-genome analysis displayed low gene conservation and significant genetic diversity in A. ferrooxidans species comprising 6926 protein coding sequences with 23.04% (1596) core genes, 46.13% (3195) unique and 30.82% (2135) accessory genes. Variant analysis showed >75,000 variants, 287 of them with a predicted high impact, in A. ferrooxidans IO-2C genome compared to the reference strain, resulting in abandonment of some important functional key genes. The genome contains numerous functional genes for iron and sulphur metabolism, nitrogen fixation, secondary metabolites, degradation of aromatic compounds, and multidrug and heavy metal resistance. This study demonstrated the bio-oxidation of iron by newly isolated A. ferrooxidans IO-2C under acidic conditions, which was further supported by genomic analysis. Genomic analysis of this strain provided valuable information about the complement of genes responsible for the utilization of iron and tolerance of other metals.
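
The core / accessory / unique split reported in the abstract above can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-cluster identifiers; the function names, strain labels, and toy clusters are hypothetical and are not taken from the paper, whose actual pipeline is not described here. The last lines only recompute two of the published percentages from the published counts.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene clusters into (core, accessory, unique) sets.

    genomes: dict mapping a genome name to an iterable of cluster IDs.
    core      = clusters present in every genome
    unique    = clusters present in exactly one genome
    accessory = everything else (present in 2 .. n-1 genomes)
    """
    n = len(genomes)
    # Count in how many genomes each cluster occurs (once per genome).
    counts = Counter(c for clusters in genomes.values() for c in set(clusters))
    core = {c for c, k in counts.items() if k == n}
    unique = {c for c, k in counts.items() if k == 1 and n > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


def percent(part, total):
    """Share of the pan-genome, as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Hypothetical toy data, for illustration only.
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g3", "g5"},
}
core, accessory, unique = partition_pangenome(toy)
print(sorted(core), sorted(accessory), sorted(unique))

# Counts quoted in the abstract: 1596 core, 3195 unique, 2135 accessory.
total = 1596 + 3195 + 2135
print(total, percent(1596, total), percent(3195, total))
```

The three counts in the abstract sum to the stated 6926 coding sequences, which is what makes the percentages directly comparable.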

RevDate: 2019-09-10

Kaminski MA, Sobczak A, Dziembowski A, et al (2019)

Genomic Analysis of γ-Hexachlorocyclohexane-Degrading Sphingopyxis lindanitolerans WS5A3p Strain in the Context of the Pangenome of Sphingopyxis.

Genes, 10(9): pii:genes10090688.

Sphingopyxis inhabit diverse environmental niches, including marine, freshwater, oceans, soil and anthropogenic sites. The genus includes 20 phylogenetically distinct, valid species, but only a few with a sequenced genome. In this work, we analyzed the nearly complete genome of the newly described species, Sphingopyxis lindanitolerans, and compared it to the other available Sphingopyxis genomes. The genome comprises 4.3 Mbp in total and consists of a circular chromosome and two putative plasmids. Among the identified set of lin genes responsible for γ-hexachlorocyclohexane pesticide degradation, we discovered a gene coding for a new isoform of the LinA protein. The significant potential of this species in the remediation of contaminated soil is also correlated with the fact that its genome encodes a higher number of enzymes potentially involved in aromatic compound degradation than for most other Sphingopyxis strains. Additional analysis of 44 Sphingopyxis representatives provides insights into the pangenome of Sphingopyxis and revealed a core of 734 protein clusters and between four and 1667 unique proteins per genome.

RevDate: 2019-09-10

Fenske GJ, Thachil A, McDonough PL, et al (2019)

Geography Shapes the Population Genomics of Salmonella enterica Dublin.

Genome biology and evolution, 11(8):2220-2231.

Salmonella enterica serotype Dublin (S. Dublin) is a bovine-adapted serotype that can cause serious systemic infections in humans. Despite the increasing prevalence of human infections and the negative impact on agricultural processes, little is known about the population structure of the serotype. To this end, we compiled a manually curated data set comprising 880 S. Dublin genomes. Core genome phylogeny and ancestral state reconstruction revealed that region-specific clades dominate the global population structure of S. Dublin. Strains of S. Dublin in the UK are genomically distinct from US, Brazilian, and African strains. The geographical partitioning impacts the composition of the core genome as well as the ancillary genome. Antibiotic resistance genes are almost exclusively found in US genomes and are mediated by an IncA/C2 plasmid. Phage content and the S. Dublin virulence plasmid were strongly conserved in the serotype. Comparison of S. Dublin to a closely related serotype, S. enterica serotype Enteritidis, revealed that S. Dublin contains 82 serotype specific genes that are not found in S. Enteritidis. Said genes encode metabolic functions involved in the uptake and catabolism of carbohydrates and virulence genes associated with type VI secretion systems and fimbria assembly respectively.

RevDate: 2019-09-05

Safari M, Yakhchali B, V Shariati J (2019)

Comprehensive genomic analysis of an indigenous Pseudomonas pseudoalcaligenes degrading phenolic compounds.

Scientific reports, 9(1):12736 pii:10.1038/s41598-019-49048-6.

Environmental contamination with aromatic compounds is a universal challenge. Aromatic-degrading microorganisms isolated from the same or similar polluted environments seem to be more suitable for bioremediation. Moreover, microorganisms adapted to contaminated environments are able to use toxic compounds as the sole sources of carbon and energy. An indigenous strain of Pseudomonas, isolated from the Mahshahr Petrochemical plant in the Khuzestan province, southwest of Iran, was studied genetically. It was characterized as a novel Gram-negative, aerobic, halotolerant, rod-shaped bacterium designated Pseudomonas YKJ, which was resistant to chloramphenicol and ampicillin. Genome of the strain was completely sequenced using Illumina technology to identify its genetic characteristics. MLST analysis revealed that the YKJ strain belongs to the genus Pseudomonas indicating the highest sequence similarity with Pseudomonas pseudoalcaligenes strain CECT 5344 (99% identity). Core- and pan-genome analysis indicated that P. pseudoalcaligenes contains 1,671 core and 3,935 unique genes for coding DNA sequences. The metabolic and degradation pathways for aromatic pollutants were investigated using the NCBI and KEGG databases. Genomic and experimental analyses showed that the YKJ strain is able to degrade certain aromatic compounds including bisphenol A, phenol, benzoate, styrene, xylene, benzene and chlorobenzene. Moreover, antibiotic resistance and chemotaxis properties of the YKJ strain were found to be controlled by two-component regulatory systems.

RevDate: 2019-09-04

Tidjani AR, Lorenzi JN, Toussaint M, et al (2019)

Massive Gene Flux Drives Genome Diversity between Sympatric Streptomyces Conspecifics.

mBio, 10(5): pii:mBio.01533-19.

In this work, by comparing genomes of closely related individuals of Streptomyces isolated at a spatial microscale (millimeters or centimeters), we investigated the extent and impact of horizontal gene transfer in the diversification of a natural Streptomyces population. We show that despite these conspecific strains sharing a recent common ancestor, all harbored significantly different gene contents, implying massive and rapid gene flux. The accessory genome of the strains was distributed across insertion/deletion events (indels) ranging from one to several hundreds of genes. Indels were preferentially located in the arms of the linear chromosomes (ca. 12 Mb) and appeared to form recombination hot spots. Some of them harbored biosynthetic gene clusters (BGCs) whose products confer an inhibitory capacity and may constitute public goods that can favor the cohesiveness of the bacterial population. Moreover, a significant proportion of these variable genes were either plasmid borne or harbored signatures of actinomycete integrative and conjugative elements (AICEs). We propose that conjugation is the main driver for the indel flux and diversity in Streptomyces populations.

IMPORTANCE: Horizontal gene transfer is a rapid and efficient way to diversify bacterial gene pools. Currently, little is known about this gene flux within natural soil populations. Using comparative genomics of Streptomyces strains belonging to the same species and isolated at microscale, we reveal frequent transfer of a significant fraction of the pangenome. We show that it occurs at a time scale enabling the population to diversify and to cope with its changing environment, notably, through the production of public goods.

RevDate: 2019-09-02

Fernie AR, A Aharoni (2019)

Pan-Genomic Illumination of Tomato Identifies Novel Gene-Trait Interactions.

Trends in plant science pii:S1360-1385(19)30210-9 [Epub ahead of print].

A recent study by Gao et al., (Nat. Genet., 2019) presents a tomato pan-genome that was constructed using genome sequences of 725 phylogenetically and geographically representative accessions. The study revealed 4873 genes that are absent from the reference genome, including important genes associated with both disease resistance and flavor, thereby providing an important breeding resource.

RevDate: 2019-08-29

Zeng C, Gilcrease EB, Hendrix RW, et al (2019)

DNA packaging and genomics of the Salmonella 9NA-like phages.

Journal of virology pii:JVI.00848-19 [Epub ahead of print].

We present the genome sequences of Salmonella enterica tailed phages Sasha, Sergei, and Solent. These phages, along with Salmonella phages 9NA, FSL_SP-062 and FSL_SP-069 and the more distantly-related Proteus phage PmiS-Isfahan have similar sized genomes between 52 and 57 kbp in length that are largely syntenic. Their genomes also show substantial genome mosaicism relative to one another, which is common within tailed phage clusters. Their gene content ranges from 80 to 99 predicted genes, of which 40 are common to all seven and form the core genome which includes all identifiable virion assembly and DNA replication genes. The total number of gene types (pangenome) in the seven phages is 176, and 59 of these are unique to individual phages. Their core genomes are much more closely related to one another than to any other known phage, and they comprise a well-defined cluster within the family Siphoviridae. To begin to characterize this group of phages in more experimental detail, we identified the genes that encode the major virion proteins and examined the DNA packaging of the prototypic member, phage 9NA. We showed that it uses a pac site-directed headful packaging mechanism that results in virion chromosomes that are circularly permuted and about 13% terminally redundant. We also showed that its packaging series initiate with dsDNA cleavages that are scattered across a 170 bp region, and that its headful measuring device has a precision of ±1.8%.

IMPORTANCE: The 9NA-like phages are clearly highly related to each other but are not closely related to any other known phage type. This work describes the genomes of three new 9NA-like phages, and experimental analysis of the proteome of the 9NA virion and DNA packaging into the 9NA phage head. There is increasing interest in the biology of phages because of their potential for use as antibacterial agents and for their ecological roles in bacterial communities. 9NA-like phages have been identified that infect two bacterial genera to date and related phages infecting additional Gram-negative hosts are likely to be found in the future. This work provides a foundation for the study of these phages which will facilitate their study and potential use.

RevDate: 2019-09-02

Lee K, Kim MS, Lee JS, et al (2019)

Chromosomal features revealed by comparison of genetic maps of Glycine max and Glycine soja.

Genomics pii:S0888-7543(19)30394-5 [Epub ahead of print].

Recombination is a crucial component of evolution and breeding. New combinations of variation on chromosomes are shaped by recombination. Recombination is also involved in chromosomal rearrangements. However, recombination rates vary tremendously among chromosome segments. Genome-wide genetic maps are one of the best tools to study variation of recombination. Here, we describe high density genetic maps of Glycine max and Glycine soja constructed from four segregating populations. The maps were used to identify chromosomal rearrangements and find the highly predictable pattern of cross-overs on the broad scale in soybean. Markers on these genetic maps were used to evaluate assembly quality of the current soybean reference genome sequence. We find a strong inversion candidate larger than 3 Mb based on patterns of cross-overs. We also identify quantitative trait loci (QTL) that control number of cross-overs. This study provides fundamental insights relevant to practical strategy for breeding programs and for pan-genome research.

RevDate: 2019-08-31

Seif Y, Monk JM, Machado H, et al (2019)

Systems Biology and Pangenome of Salmonella O-Antigens.

mBio, 10(4): pii:mBio.01247-19.

O-antigens are glycopolymers in lipopolysaccharides expressed on the cell surface of Gram-negative bacteria. Variability in the O-antigen structure constitutes the basis for the establishment of the serotyping schema. We pursued a two-pronged approach to define the basis for O-antigen structural diversity. First, we developed a bottom-up systems biology approach to O-antigen metabolism by building a reconstruction of Salmonella O-antigen biosynthesis and used it to (i) update 410 existing Salmonella strain-specific metabolic models, (ii) predict a strain's serogroup and its O-antigen glycan synthesis capability (yielding 98% agreement with experimental data), and (iii) extend our workflow to more than 1,400 Gram-negative strains. Second, we used a top-down pangenome analysis to elucidate the genetic basis for intraserogroup O-antigen structural variations. We assembled a database of O-antigen gene islands from over 11,000 sequenced Salmonella strains, revealing (i) that gene duplication, pseudogene formation, gene deletion, and bacteriophage insertion elements occur ubiquitously across serogroups; (ii) novel serotypes in the group O:4 B2 variant, as well as an additional genotype variant for group O:4, and (iii) two novel O-antigen gene islands in understudied subspecies. We thus comprehensively defined the genetic basis for O-antigen diversity.

IMPORTANCE: Lipopolysaccharides are a major component of the outer membrane in Gram-negative bacteria. They are composed of a conserved lipid structure that is embedded in the outer leaflet of the outer membrane and a polysaccharide known as the O-antigen. O-antigens are highly variable in structure across strains of a species and are crucial to a bacterium's interactions with its environment. They constitute the first line of defense against both the immune system and bacteriophage infections and have been shown to mediate antimicrobial resistance. The significance of our research is in identifying the metabolic and genetic differences within and across O-antigen groups in Salmonella strains. Our effort constitutes a first step toward characterizing the O-antigen metabolic network across Gram-negative organisms and a comprehensive overview of genetic variations in Salmonella.

RevDate: 2019-08-14

Dar HA, Zaheer T, Shehroz M, et al (2019)

Immunoinformatics-Aided Design and Evaluation of a Potential Multi-Epitope Vaccine against Klebsiella Pneumoniae.

Vaccines, 7(3): pii:vaccines7030088.

Klebsiella pneumoniae is an opportunistic gram-negative bacterium that causes nosocomial infection in healthcare settings. Despite the high morbidity and mortality rate associated with these bacterial infections, no effective vaccine is available to counter the pathogen. In this study, the pangenome of a total of 222 available complete genomes of K. pneumoniae was explored to obtain the core proteome. A reverse vaccinology strategy was applied to the core proteins to identify four antigenic proteins. These proteins were then subjected to epitope mapping and prioritization steps to shortlist nine B-cell derived T-cell epitopes which were linked together using GPGPG linkers. An adjuvant (Cholera Toxin B) was also added at the N-terminal of the vaccine construct to improve its immunogenicity and a stabilized multi-epitope protein structure was obtained using molecular dynamics simulation. The designed vaccine exhibited sustainable and strong bonding interactions with Toll-like receptor 2 and Toll-like receptor 4. In silico reverse translation and codon optimization also confirmed its high expression in E. coli K12 strain. The computer-aided analyses performed in this study imply that the designed multi-epitope vaccine can elicit specific immune responses against K. pneumoniae. However, wet lab validation is necessary to further verify the effectiveness of this proposed vaccine candidate.

RevDate: 2019-08-10

Xing J, Li X, Sun Y, et al (2019)

Comparative genomic and functional analysis of Akkermansia muciniphila and closely related species.

Genes & genomics pii:10.1007/s13258-019-00855-1 [Epub ahead of print].

BACKGROUND: Akkermansia muciniphila is an important bacterium that resides on the mucus layer of the intestinal tract. Akkermansia muciniphila has a high abundance in human feces and plays an important role in human health.

OBJECTIVE: In this article, 23 whole genome sequences of the Akkermansia genus were comparatively studied.

METHODS: Phylogenetic trees were constructed with three methods: All amino acid sequences of each strain were used to construct the first phylogenetic tree using the web server of Composition Vector Tree Version 3. The matrix of Genome-to-Genome Distances which were obtained from GGDC 2.0 was used to construct the second phylogenetic tree using FastME. The concatenated single-copy core gene-based phylogenetic tree was generated through MEGA. The single-copy genes were obtained using OrthoMCL. Population structure was assessed by STRUCTURE 2.3.4 using the SNPs in core genes. PROKKA and Roary were used to perform pan-genome analyses. The biosynthetic gene clusters were predicted using antiSMASH 4.0. IslandViewer 4 was used to detect the genomic islands.

RESULTS: The results of comparative genomic analysis revealed that: (1) The 23 Akkermansia strains formed 4 clades in phylogenetic trees. The A. muciniphila strains isolated from different geographic regions and ecological niches, formed a closely related clade. (2) The 23 Akkermansia strains were divided into 4 species based on digital DNA-DNA hybridization (dDDH) values. (3) Pan-genome of A. muciniphila is in an open state and increases with addition of new sequenced genomes. (4) SNPs were not evenly distributed throughout the A. muciniphila genomes. The genes in regions with high SNP density are related to metabolism and cell wall/membrane envelope biogenesis. (5) The thermostable outer-membrane protein, Amuc_1100, was conserved in the Akkermansia genus, except for Akkermansia glycaniphila PytT.

CONCLUSION: Overall, applying comparative genomic and pan-genomic analyses, we classified the 23 Akkermansia strains and clarified their phylogenetic relationships. Insights into the evolution, population structure, gene clusters and genomic islands of Akkermansia provided more information about the possible physiological and probiotic mechanisms of the Akkermansia strains, and offered some guidance for future in-depth research on the use of Akkermansia as a gut probiotic.

RevDate: 2019-08-08

Khan AMAM, Mendoza C, Hauk VJ, et al (2019)

Genomic and physiological analyses reveal that extremely thermophilic Caldicellulosiruptor changbaiensis deploys uncommon cellulose attachment mechanisms.

Journal of industrial microbiology & biotechnology pii:10.1007/s10295-019-02222-1 [Epub ahead of print].

The genus Caldicellulosiruptor is comprised of extremely thermophilic, heterotrophic anaerobes that degrade plant biomass using modular, multifunctional enzymes. Prior pangenome analyses determined that this genus is genetically diverse, with the current pangenome remaining open, meaning that new genes are expected with each additional genome sequence added. Given the high biodiversity observed among the genus Caldicellulosiruptor, we have sequenced and added a 14th species, Caldicellulosiruptor changbaiensis, to the pangenome. The pangenome now includes 3791 ortholog clusters, 120 of which are unique to C. changbaiensis and may be involved in plant biomass degradation. Comparisons between C. changbaiensis and Caldicellulosiruptor bescii on the basis of growth kinetics, cellulose solubilization and cell attachment to polysaccharides highlighted physiological differences between the two species which are supported by their respective gene inventories. Most significantly, these comparisons indicated that C. changbaiensis possesses uncommon cellulose attachment mechanisms not observed among the other strongly cellulolytic members of the genus Caldicellulosiruptor.
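
The notion of an "open" pangenome used in the abstract above (new genes are expected with each added genome) can be made concrete with a minimal accumulation sketch. It assumes genomes are given as sets of ortholog-cluster identifiers in a fixed order of addition; the function name and the toy clusters are hypothetical and unrelated to the Caldicellulosiruptor data. Published analyses typically average over many orderings and fit a growth model, which this sketch does not attempt.

```python
def accumulation_curve(genomes_in_order):
    """Track pan-genome growth as genomes are added one at a time.

    genomes_in_order: list of iterables of cluster IDs, in order of addition.
    Returns (pan_sizes, new_counts): the cumulative number of distinct
    clusters after each genome, and how many clusters each genome added.
    """
    seen = set()
    pan_sizes, new_counts = [], []
    for clusters in genomes_in_order:
        new = set(clusters) - seen      # clusters not observed so far
        new_counts.append(len(new))
        seen |= new
        pan_sizes.append(len(seen))
    return pan_sizes, new_counts


# Hypothetical toy data, for illustration only.
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "e", "f"},
]
pan_sizes, new_counts = accumulation_curve(toy_genomes)
print(pan_sizes)   # cumulative pan-genome size after each genome
print(new_counts)  # new clusters contributed by each genome
```

If the per-genome count of new clusters stays above zero as more genomes are added, the pangenome is usually described as open; a count that falls to zero suggests a closed one.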

RevDate: 2019-08-09

Chapeton-Montes D, Plourde L, Bouchier C, et al (2019)

The population structure of Clostridium tetani deduced from its pan-genome.

Scientific reports, 9(1):11220 pii:10.1038/s41598-019-47551-4.

Clostridium tetani produces a potent neurotoxin, the tetanus neurotoxin (TeNT) that is responsible for the worldwide neurological disease tetanus, but which can be efficiently prevented by vaccination with tetanus toxoid. Until now only one type of TeNT has been characterized and very little information exists about the heterogeneity among C. tetani strains. We report here the genome sequences of 26 C. tetani strains, isolated between 1949 and 2017 and obtained from different locations. Genome analyses revealed that the C. tetani population is distributed in two phylogenetic clades, a major and a minor one, with no evidence for clade separation based on geographical origin or time of isolation. The chromosome of C. tetani is highly conserved; in contrast, the TeNT-encoding plasmid shows substantial heterogeneity. TeNT itself is highly conserved among all strains; the most relevant difference is an insertion of four amino acids in the C-terminal receptor-binding domain in four strains that might impact on receptor-binding properties. Other putative virulence factors, including tetanolysin and collagenase, are encoded in all genomes. This study highlights the population structure of C. tetani and suggests that tetanus-causing strains did not undergo extensive evolutionary diversification, as judged from the high conservation of its main virulence factors.

RevDate: 2019-08-08

Saad J, Phelippeau M, Khoder M, et al (2019)

"Mycobacterium mephinesia", a Mycobacterium terrae complex species of clinical interest isolated in French Polynesia.

Scientific reports, 9(1):11169 pii:10.1038/s41598-019-47674-8.

A 59-year-old male tobacco smoker with chronic bronchitis living in Taravao, French Polynesia, Pacific, presented with a two-year growing nodule in the middle lobe of the right lung. A guided bronchoalveolar lavage inoculated onto Löwenstein-Jensen medium yielded colonies of a rapidly-growing non-chromogenic mycobacterium designated as isolate P7213. The isolate could not be identified using routine matrix-assisted laser desorption ionization-time of flight-mass spectrometry and phenotypic and probe-hybridization techniques and yielded 100% and 97% sequence similarity with the respective 16S rRNA and rpoB gene sequences of Mycobacterium virginiense in the Mycobacterium terrae complex. Electron microscopy showed a 1.15 µm long and 0.38 µm wide bacillus which was in vitro susceptible to rifampicin, rifabutin, ethambutol, isoniazid, doxycycline and kanamycin. Its 4,511,948-bp draft genome exhibited a 67.6% G + C content with 4,153 protein-coding genes and 87 predicted RNA genes. Genome sequence-derived DNA-DNA hybridization, OrthoANI and pangenome analysis confirmed isolate P7213 was representative of a new species in the M. terrae complex. We named this species "Mycobacterium mephinesia".

RevDate: 2019-08-02

O'Connor E, McGowan J, McCarthy CGP, et al (2019)

Whole Genome Sequence of the Commercially Relevant Mushroom Strain Agaricus bisporus var. bisporus ARP23.

G3 (Bethesda, Md.) pii:g3.119.400563 [Epub ahead of print].

Agaricus bisporus is an extensively cultivated edible mushroom. Demand for cultivation is continuously growing and difficulties associated with breeding programmes now mean strains are effectively considered monoculture. While commercial growing practices are highly efficient and tightly controlled, the over-use of a single strain has led to a variety of disease outbreaks from a range of pathogens including bacteria, fungi and viruses. To address this, the Agaricus Resource Program (ARP) was set up to collect wild isolates from diverse geographical locations through a bounty-driven scheme to create a repository of wild Agaricus germplasm. One of the strains collected, Agaricus bisporus var. bisporus ARP23, has been crossed extensively with white commercial varieties leading to the generation of a novel hybrid with a dark brown pileus commonly referred to as 'Heirloom'. Heirloom has been successfully implemented into commercial mushroom cultivation. In this study the whole genome of Agaricus bisporus var. bisporus ARP23 was sequenced and assembled with Illumina and PacBio sequencing technology. The final genome was found to be 33.49 Mb in length and have significant levels of synteny to other sequenced Agaricus bisporus strains. Overall, 13,030 putative protein coding genes were located and annotated. Relative to the other A. bisporus genomes that are currently available, Agaricus bisporus var. bisporus ARP23 is the largest A. bisporus strain in terms of gene number and genetic content sequenced to date. Comparative genomic analysis shows that the A. bisporus mating locus is unifactorial and, unsurprisingly, highly conserved between strains. The lignocellulolytic gene content of all A. bisporus strains compared is also very similar. Our results show that the pangenome structure of A. bisporus is quite diverse with between 60-70% of the total protein coding genes per strain considered as being orthologous and syntenically conserved. These analyses and the genome sequence described herein are the starting point for more detailed molecular analyses into the growth and phenotypical responses of Agaricus bisporus var. bisporus ARP23 when challenged with economically important mycoviruses.

RevDate: 2019-08-13

Duan Z, Qiao Y, Lu J, et al (2019)

HUPAN: a pan-genome analysis pipeline for human genomes.

Genome biology, 20(1):149 pii:10.1186/s13059-019-1751-y.

The human reference genome is still incomplete, especially for those population-specific or individual-specific regions, which may have important functions. Here, we developed a HUman Pan-genome ANalysis (HUPAN) system to build the human pan-genome. We applied it to 185 deep sequencing and 90 assembled Han Chinese genomes and detected 29.5 Mb novel genomic sequences and at least 188 novel protein-coding genes missing in the human reference genome (GRCh38). It can be an important resource for the human genome-related biomedical studies, such as cancer genome analysis. HUPAN is freely available at http://cgm.sjtu.edu.cn/hupan/ and https://github.com/SJTU-CGM/HUPAN .

RevDate: 2019-07-27

Richards VP, Velsko IM, Alam T, et al (2019)

Population gene introgression and high genome plasticity for the zoonotic pathogen Streptococcus agalactiae.

Molecular biology and evolution pii:5539754 [Epub ahead of print].

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range making it an excellent model system to study these processes. Here we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated twelve major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan-genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of eleven populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spread from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another; biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

RevDate: 2019-07-25

Naidenov B, Lim A, Willyerd K, et al (2019)

Pan-Genomic and Polymorphic Driven Prediction of Antibiotic Resistance in Elizabethkingia.

Frontiers in microbiology, 10:1446.

The Elizabethkingia are a genetically diverse genus of emerging pathogens that exhibit multidrug resistance to a range of common antibiotics. Two representative species, Elizabethkingia bruuniana and E. meningoseptica, were phenotypically tested to determine minimum inhibitory concentrations (MICs) for five antibiotics. Ultra-long read sequencing with Oxford Nanopore Technologies (ONT) and subsequent de novo assembly produced complete, gapless circular genomes for each strain. Alignment based annotation with Prokka identified 5,480 features in E. bruuniana and 5,203 features in E. meningoseptica, where none of these identified genes or gene combinations corresponded to observed phenotypic resistance values. Pan-genomic analysis, performed with an additional 19 Elizabethkingia strains, identified a core-genome size of 2,658,537 bp, 32 uniquely identifiable intrinsic chromosomal antibiotic resistance core-genes and 77 antibiotic resistance pan-genes. Using core-SNPs and pan-genes in combination with six machine learning (ML) algorithms, binary classification of clindamycin and vancomycin resistance achieved f1 scores of 0.94 and 0.84, respectively. Performance on the more challenging multiclass problem for fusidic acid, rifampin and ciprofloxacin resulted in f1 scores of 0.70, 0.75, and 0.54, respectively. By producing two sets of quality biological predictors, pan-genome genes and core-genome SNPs, from long-read sequence data and applying an ensemble of ML techniques, our results demonstrated that accurate phenotypic inference, at multiple AMR resolutions, can be achieved.
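
For readers unfamiliar with the f1 score quoted above, the following sketch shows how it is computed for a binary problem, as the harmonic mean of precision and recall. The labels are invented toy data, not values from the study, and the abstract does not say how the multiclass scores were averaged, so only the binary case is shown.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 1 = resistant, 0 = susceptible.
observed = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
score = f1_score(observed, predicted)  # 3 TP, 1 FP, 1 FN
```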

RevDate: 2019-07-30

Passarelli-Araujo H, Palmeiro JK, Moharana KC, et al (2019)

Genomic analysis unveils important aspects of population structure, virulence, and antimicrobial resistance in Klebsiella aerogenes.

The FEBS journal [Epub ahead of print].

Klebsiella aerogenes is an important pathogen in healthcare-associated infections. Nevertheless, in comparison to other clinically important pathogens, K. aerogenes population structure, genetic diversity, and pathogenicity remain poorly understood. Here, we elucidate K. aerogenes clonal complexes (CCs) and genomic features associated with resistance and virulence. We present a detailed description of the population structure of K. aerogenes based on 97 publicly available genomes by using both multilocus sequence typing and single-nucleotide polymorphisms extracted from the core genome. We also assessed virulence and resistance profiles using Virulence Finder Database and Comprehensive Antibiotic Resistance Database, respectively. We show that K. aerogenes has an open pangenome and a large effective population size, which account for its high genomic diversity and support that negative selection prevents fixation of most deleterious alleles. The population is structured in at least 10 CCs, including two novel ones identified here, CC9 and CC10. The repertoires of resistance genes comprise a high number of antibiotic efflux proteins as well as narrow- and extended-spectrum β-lactamases. Regarding the population structure, we identified two clusters based on virulence profiles because of the presence of the toxin-encoding clb operon and the siderophore production genes, irp and ybt. Notably, CC3 comprises the majority of K. aerogenes isolates associated with hospital outbreaks, emphasizing the importance of constant monitoring of this pathogen. Collectively, our results may provide a foundation for the development of new therapeutic and surveillance strategies worldwide.

RevDate: 2019-07-23

Chen SL (2019)

Genomic Insights Into the Distribution and Evolution of Group B Streptococcus.

Frontiers in microbiology, 10:1447.

Streptococcus agalactiae, also known as Group B Streptococcus (GBS), is a bacterium with truly protean biology. It infects a variety of hosts, among which the most commonly studied are humans, cattle, and fish. GBS holds a singular position in the history of bacterial genomics, as it was the substrate used to describe one of the first major conceptual advances of comparative genomics, the idea of the pan-genome. In this review, I describe a brief history of GBS and the major contributions of genomics to understanding its genome plasticity and evolution as well as its molecular epidemiology, focusing on the three hosts mentioned above. I also discuss one of the major recent paradigm shifts in our understanding of GBS evolution and disease burden: foodborne GBS can cause invasive infections in humans.

RevDate: 2019-08-01

Xia Q, Pan L, Zhang R, et al (2019)

The genome assembly of asparagus bean, Vigna unguiculata ssp. sesquipedialis.

Scientific data, 6(1):124 pii:10.1038/s41597-019-0130-6.

Asparagus bean (Vigna unguiculata ssp. sesquipedialis), known for its very long and tender green pods, is an important vegetable crop broadly grown in the developing Asian countries. In this study, we reported a 632.8 Mb assembly (549.81 Mb non-N size) of asparagus bean based on the whole genome shotgun sequencing strategy. We also generated a linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources of asparagus bean will help develop a pan-genome of V. unguiculata and facilitate the investigation of economically valuable traits in this species, so that the cultivation of this plant would help combat the protein and energy malnutrition in the developing world.

RevDate: 2019-08-31

Yahara K, Lehours P, Vale FF (2019)

Analysis of genetic recombination and the pan-genome of a highly recombinogenic bacteriophage species.

Microbial genomics, 5(8):.

Bacteriophages are the most prevalent biological entities impacting on the ecosystem and are characterized by their extensive diversity. However, there are two aspects of phages that have remained largely unexplored: genetic flux by recombination between phage populations and characterization of specific phages in terms of the pan-genome. Here, we examined the recombination and pan-genome in Helicobacter pylori prophages at both the genome and gene level. In the genome-level analysis, we applied, for the first time, chromosome painting and fineSTRUCTURE algorithms to a phage species, and showed novel trends in inter-population genetic flux. Notably, hpEastAsia is a phage population that imported a higher proportion of DNA fragments from other phages, whereas the hpSWEurope phages showed weaker signatures of inter-population recombination, suggesting genetic isolation. The gene-level analysis showed that, after parameter tuning of the prokaryote pan-genome analysis program, H. pylori phages have a pan-genome consisting of 75 genes and a soft-core genome of 10 genes, which includes genes involved in the lytic and lysogenic life cycles. Quantitative analysis of recombination events of the soft-core genes showed no substantial variation in the intensity of recombination across the genes, but rather equally frequent recombination among housekeeping genes that were previously reported to be less prone to recombination. The signature of frequent recombination appears to reflect the host-phage evolutionary arms race, either by contributing to escape from bacterial immunity or by protecting the host by producing defective phages.

RevDate: 2019-07-14

Paterson ML, Ranasinghe D, Blom J, et al (2019)

Genomic analysis of a novel Rhodococcus (Prescottella) equi isolate from a bovine host.

Archives of microbiology pii:10.1007/s00203-019-01695-z [Epub ahead of print].

Rhodococcus (Prescottella) equi causes pneumonia-like infections in foals with high mortality rates and can also infect a number of other animals. R. equi is also emerging as an opportunistic human pathogen. In this study, we have sequenced the genome of a novel R. equi isolate, B0269, isolated from the faeces of a bovine host. Comparative genomic analyses with seven other published R. equi genomes, including those from equine or human sources, revealed a pangenome comprising 6876 genes with 4141 genes in the core genome. Two hundred and seventy-five genes were specific to the bovine isolate, mostly encoding hypothetical proteins of unknown function. However, these genes include four copies of terA and five copies of terD genes that may be involved in responding to chemical stress. Virulence characteristics in R. equi are associated with the presence of large plasmids carrying a pathogenicity island, including genes from the vap multigene family. A BLAST search of the protein sequences from known virulence-associated plasmids (pVAPA, pVAPB and pVAPN) revealed a similar plasmid backbone on two contigs in bovine isolate B0269; however, no homologues of the main virulence-associated genes, vapA, vapB or vapN, were identified. In summary, this study confirms that R. equi genomes are highly conserved and reports the presence of an apparently novel plasmid in the bovine isolate B0269 that needs further characterisation to understand its potential involvement in virulence properties.

RevDate: 2019-07-31

Kingstad-Bakke BA, Chandrasekar SS, Phanse Y, et al (2019)

Effective mosaic-based nanovaccines against avian influenza in poultry.

Vaccine, 37(35):5051-5058.

Avian influenza virus (AIV) is an extraordinarily diverse pathogen that causes significant morbidity in domesticated poultry populations and threatens human life with looming pandemic potential. Controlling avian influenza in susceptible populations requires highly effective, economical and broadly reactive vaccines. Several AIV vaccines have proven insufficient despite their wide use, and better technologies are needed to improve their immunogenicity and broaden effectiveness. Previously, we developed a "mosaic" H5 subtype hemagglutinin (HA) AIV vaccine and demonstrated its broad protection against diverse highly pathogenic H5N1 and seasonal H1N1 virus strains in mouse and non-human primate models. There is a significant interest in developing effective and safe vaccines against AIV that cannot contribute to the emergence of new strains of the virus once circulating in poultry. Here, we report on the development of an H5 mosaic (H5M) vaccine antigen formulated with polyanhydride nanoparticles (PAN) that provide sustained release of encapsulated antigens. H5M vaccine constructs were immunogenic whether delivered by the modified virus Ankara (MVA) strain or encapsulated within PAN. Both humoral and cellular immune responses were generated in both specific-pathogen free (SPF) and commercial chicks. Importantly, chicks vaccinated by H5M constructs were protected in terms of viral shedding from divergent challenge with a low pathogenicity avian influenza (LPAI) strain at 8 weeks post-vaccination. In addition, protective levels of humoral immunity were generated against highly pathogenic avian influenza (HPAI) of the similar H5N1 and genetically dissimilar H5N2 viruses. Overall, the developed platform technologies (MVA vector and PAN encapsulation) were safe and provided high levels of sustained protection against AIV in chickens. Such approaches could be used to design more efficacious vaccines against other important poultry infections.

RevDate: 2019-08-20

McCarthy CGP, Fitzpatrick DA (2019)

Pangloss: A Tool for Pan-Genome Analysis of Microbial Eukaryotes.

Genes, 10(7): pii:genes10070521.

Although the pan-genome concept originated in prokaryote genomics, an increasing number of eukaryote species pan-genomes have also been analysed. However, there is a relative lack of software intended for eukaryote pan-genome analysis compared to that available for prokaryotes. In a previous study, we analysed the pan-genomes of four model fungi with a computational pipeline that constructed pan-genomes using the synteny-dependent Pan-genome Ortholog Clustering Tool (PanOCT) approach. Here, we present a modified and improved version of that pipeline which we have called Pangloss. Pangloss can perform gene prediction for a set of genomes from a given species that the user provides, constructs and optionally refines a species pan-genome from that set using PanOCT, and can perform various functional characterisation and visualisation analyses of species pan-genome data. To demonstrate Pangloss's capabilities, we constructed and analysed a species pan-genome for the oleaginous yeast Yarrowia lipolytica and also reconstructed a previously-published species pan-genome for the opportunistic respiratory pathogen Aspergillus fumigatus. Pangloss is implemented in Python, Perl and R and is freely available under an open source GPLv3 licence via GitHub.

RevDate: 2019-07-14

Passera A, Compant S, Casati P, et al (2019)

Not Just a Pathogen? Description of a Plant-Beneficial Pseudomonas syringae Strain.

Frontiers in microbiology, 10:1409.

Plants develop in a microbe-rich environment and must interact with a plethora of microorganisms, both pathogenic and beneficial. Indeed, such is the case of Pseudomonas, and its model organisms P. fluorescens and P. syringae, a bacterial genus that has received particular attention because of its beneficial effect on plants and its pathogenic strains. The present study aims to compare plant-beneficial and pathogenic strains belonging to the P. syringae species to get new insights into the distinction between the two types of plant-microbe interactions. In assays carried out under greenhouse conditions, P. syringae pv. syringae strain 260-02 was shown to promote plant-growth and to exert biocontrol of P. syringae pv. tomato strain DC3000, against the Botrytis cinerea fungus and the Cymbidium Ringspot Virus. This P. syringae strain also had a distinct volatile emission profile, as well as a different plant-colonization pattern, visualized by confocal microscopy and gfp labeled strains, compared to strain DC3000. Despite the different behavior, the P. syringae strain 260-02 showed great similarity to pathogenic strains at a genomic level. However, genome analyses highlighted a few differences that form the basis for the following hypotheses regarding strain 260-02. P. syringae strain 260-02: (i) possesses non-functional virulence genes, like the mangotoxin-producing operon Mbo; (ii) has different regulation pathways, suggested by the difference in the autoinducer system and the lack of a virulence activator gene; (iii) has genes encoding DNA methylases different from those found in other P. syringae strains, suggested by the presence of horizontal-gene-transfer-obtained methylases that could affect gene expression.

RevDate: 2019-07-14

Fontana A, Falasconi I, Molinari P, et al (2019)

Genomic Comparison of Lactobacillus helveticus Strains Highlights Probiotic Potential.

Frontiers in microbiology, 10:1380.

Lactobacillus helveticus belongs to the large group of lactic acid bacteria (LAB), which are the major players in the fermentation of a wide range of foods. LAB are also present in the human gut, which has often been exploited as a reservoir of potential novel probiotic strains, but several parameters need to be assessed before establishing their safety and potential use for human consumption. In the present study, six L. helveticus strains isolated from natural whey cultures were analyzed for their phenotype and genotype in exopolysaccharide (EPS) production, low pH and bile salt tolerance, bile salt hydrolase (BSH) activity, and antibiotic resistance profile. In addition, a comparative genomic investigation was performed between the six newly sequenced strains and the 51 publicly available genomes of L. helveticus to define the pangenome structure. The results indicate that the newly sequenced strain UC1267 and the deposited strain DSM 20075 can be considered good candidates for gut-adapted strains due to their ability to survive in the presence of 0.2% glycocholic acid (GCA) and 1% taurocholic and taurodeoxycholic acid (TDCA). Moreover, these strains had the highest bile salt deconjugation activity among the tested L. helveticus strains. Considering the safety profile, none of these strains presented antibiotic resistance phenotypically and/or at the genome level. The pangenome analysis revealed genes specific to the new isolates, such as enzymes related to folate biosynthesis in strains UC1266 and UC1267 and an integrated phage in strain UC1035. Finally, the presence of maltose-degrading enzymes and multiple copies of 6-phospho-β-glucosidase genes in our strains indicates the capability to metabolize sugars other than lactose, which is related solely to dairy niches.

RevDate: 2019-07-10

Tian X, Li R, Fu W, et al (2019)

Building a sequence map of the pig pan-genome from multiple de novo assemblies and Hi-C data.

Science China. Life sciences pii:10.1007/s11427-019-9551-7 [Epub ahead of print].

Pigs were domesticated independently in the Near East and China, indicating that a single reference genome from one individual is unable to represent the full spectrum of divergent sequences in pigs worldwide. Therefore, 12 de novo pig assemblies from Eurasia were compared in this study to identify the missing sequences from the reference genome. As a result, 72.5 Mb of non-redundant sequences (∼3% of the genome) were found to be absent from the reference genome (Sscrofa11.1) and were defined as pan-sequences. Of the pan-sequences, 9.0 Mb were dominant in Chinese pigs, in contrast with their low frequency in European pigs. One sequence dominant in Chinese pigs contained the complete genic region of the tazarotene-induced gene 3 (TIG3) gene which is involved in fatty acid metabolism. Using flanking sequences and Hi-C based methods, 27.7% of the sequences could be anchored to the reference genome. The supplementation of these sequences could contribute to the accurate interpretation of the 3D chromatin structure. A web-based pan-genome database was further provided to serve as a primary resource for exploration of genetic diversity and promote pig breeding and biomedical research.

RevDate: 2019-08-09

Piligrimova EG, Kazantseva OA, Nikulin NA, et al (2019)

Bacillus Phage vB_BtS_B83 Previously Designated as a Plasmid May Represent a New Siphoviridae Genus.

Viruses, 11(7): pii:v11070624.

The Bacillus cereus group of bacteria includes, inter alia, the species known to be associated with human diseases and food poisoning. Here, we describe the Bacillus phage vB_BtS_B83 (abbreviated as B83) infecting the species of this group. Transmission electron microscopy (TEM) micrographs indicate that B83 belongs to the Siphoviridae family. B83 is a temperate phage using an arbitrium system for the regulation of the lysis-lysogeny switch, and is probably capable of forming a circular plasmid prophage. Comparative analysis shows that it has been previously sequenced, but was mistaken for a plasmid. B83 shares common genome organization and >46% of proteins with another Bacillus phage, BMBtp14. Phylograms constructed using large terminase subunits and a pan-genome presence-absence matrix show that these phages form a clade distinct from the closest viruses. Based on the above, we propose the creation of a new genus named Bembunaquatrovirus that includes B83 and BMBtp14.

RevDate: 2019-07-11

Machado KCT, Fortuin S, Tomazella GG, et al (2019)

On the Impact of the Pangenome and Annotation Discrepancies While Building Protein Sequence Databases for Bacteria Proteogenomics.

Frontiers in microbiology, 10:1410.

In proteomics, peptide information within mass spectrometry (MS) data from a specific organism sample is routinely matched against a protein sequence database that best represents that organism. However, if the species/strain in the sample is unknown or genetically poorly characterized, it becomes challenging to determine a database which can represent such a sample. Building customized protein sequence databases merging multiple strains for a given species has become a strategy to overcome such restrictions. However, as more genetic information is publicly available and interesting genetic features such as the existence of pan- and core genes within a species are revealed, we questioned how efficient such merging strategies are to report relevant information. To test this assumption, we constructed databases containing conserved and unique sequences for 10 different species. Features that are relevant for probabilistic-based protein identification by proteomics were then monitored. As expected, increase in database complexity correlates with pangenomic complexity. However, Mycobacterium tuberculosis and Bordetella pertussis generated very complex databases despite having low pangenomic complexity. We further tested database performance by using MS data from eight clinical strains from M. tuberculosis, and from two published datasets from Staphylococcus aureus. We show that by using an approach where database size is controlled by removing repeated identical tryptic sequences across strains/species, computational time can be reduced drastically as database complexity increases.
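
The idea of controlling database size by removing repeated identical tryptic sequences can be sketched as follows. This is not the authors' implementation, which the abstract does not describe. It assumes the common in-silico trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and a minimum peptide length of 6; the strain names and sequences are invented.

```python
from typing import Dict, List, Set, Tuple


def tryptic_peptides(protein: str, min_len: int = 6) -> List[str]:
    """In-silico digest: cleave after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return [p for p in peptides if len(p) >= min_len]


def nonredundant_peptides(
    strains: Dict[str, List[str]], min_len: int = 6
) -> Tuple[Set[str], int]:
    """Return the set of distinct peptides and the total count before deduplication."""
    unique: Set[str] = set()
    total = 0
    for proteins in strains.values():
        for sequence in proteins:
            peptides = tryptic_peptides(sequence, min_len)
            total += len(peptides)
            unique.update(peptides)
    return unique, total


# Hypothetical toy input: two strains whose proteins differ in the last residue.
toy_strains = {
    "strain_a": ["MKTAYIAKQRQISFVK"],
    "strain_b": ["MKTAYIAKQRQISFVR"],
}
unique_peptides, total_peptides = nonredundant_peptides(toy_strains)
```

Here the peptide shared by both strains is stored once, so four digested peptides collapse to three database entries.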

RevDate: 2019-08-30

Nielsen MR, Wollenberg RD, Westphal KR, et al (2019)

Heterologous expression of intact biosynthetic gene clusters in Fusarium graminearum.

Fungal genetics and biology : FG & B, 132:103248 pii:S1087-1845(19)30046-5 [Epub ahead of print].

Filamentous fungi such as species from the genus Fusarium are capable of producing a wide palette of interesting metabolites relevant to health, agriculture and biotechnology. Secondary metabolites are formed from large synthase/synthetase enzymes often encoded in gene clusters containing additional enzymes cooperating in the metabolite's biosynthesis. The true potential of fungal metabolomes remains untapped as the majority of secondary metabolite gene clusters are silent under standard laboratory growth conditions. One way to achieve expression of biosynthetic pathways is to clone the responsible genes and express them in a well-suited heterologous host, which poses a challenge since Fusarium polyketide synthase and non-ribosomal peptide synthetase gene clusters can be large (e.g. as large as 80 kb) and comprise several genes necessary for product formation. The major challenge associated with heterologous expression of fungal biosynthesis pathways is thus handling and cloning large DNA sequences. In this paper we present the successful workflow for cloning, reconstruction and heterologous production of two previously characterized Fusarium pseudograminearum natural product pathways in Fusarium graminearum. In vivo yeast recombination enabled rapid assembly of the W493 (NRPS32-PKS40) and the Fusarium Cytokinin gene clusters. F. graminearum transformants were obtained through protoplast-mediated and Agrobacterium tumefaciens-mediated transformation. Whole genome sequencing revealed that isolation of transformants carrying intact copies of the gene clusters was possible. Known Fusarium cytokinin metabolites (fusatin, 8-oxo-fusatin, 8-oxo-isopentenyladenine and fusatinic acid, together with cis- and trans-zeatin) were detected by liquid chromatography and mass spectrometry, which confirmed gene functionality in F. graminearum. In addition, the non-ribosomal lipopeptide products W493 A and B were heterologously produced in similar amounts to those observed in the F. pseudograminearum donor. The Fusarium pan-genome comprises more than 60 uncharacterized putative secondary metabolite gene clusters. We nominate the well-characterized F. graminearum as a heterologous expression platform for Fusarium secondary metabolite gene clusters, and present our experience cloning and introducing gene clusters into this species. We expect the presented methods will inspire future endeavours in heterologous production of Fusarium metabolites and potentially aid the production and characterization of novel natural products.

RevDate: 2019-07-14

Matteoli FP, Passarelli-Araujo H, Pedrosa-Silva F, et al (2019)

Population structure and pangenome analysis of Enterobacter bugandensis uncover the presence of blaCTX-M-55, blaNDM-5 and blaIMI-1, along with sophisticated iron acquisition strategies.

Genomics pii:S0888-7543(19)30319-2 [Epub ahead of print].

Enterobacter bugandensis is a recently described species that has been largely associated with nosocomial infections. We report the genome of a non-clinical E. bugandensis strain, which was integrated with publicly available genomes to study the pangenome and general population structure of E. bugandensis. Core- and whole-genome multilocus sequence typing allowed the detection of five E. bugandensis phylogroups (PG-A to E), which contain important antimicrobial resistance and virulence determinants. We uncovered several extended-spectrum β-lactamases, including blaCTX-M-55 and blaNDM-5, present in an IncX replicon type plasmid, described here for the first time in E. bugandensis. Genetic context analysis of blaNDM-5 revealed the resemblance of this plasmid with other IncX plasmids from other bacteria from the same country. Three distinctive siderophore producing operons were found in E. bugandensis: enterobactin (ent), aerobactin (iuc/iut), and salmochelin (iro). Our findings provide novel insights on the lifestyle, physiology, antimicrobial, and virulence profiles of E. bugandensis.

RevDate: 2019-08-23

Kopejtka K, Lin Y, Jakubovičová M, et al (2019)

Clustered Core- and Pan-Genome Content on Rhodobacteraceae Chromosomes.

Genome biology and evolution, 11(8):2208-2217.

In Bacteria, chromosome replication starts at a single origin of replication and proceeds on both replichores. Due to its asymmetric nature, replication influences chromosome structure and gene organization, mutation rate, and expression. To date, little is known about the distribution of highly conserved genes over the bacterial chromosome. Here, we used a set of 101 fully sequenced Rhodobacteraceae representatives to analyze the relationship between conservation of genes within this family and their distance from the origin of replication. Twenty-two of the analyzed species had core genes clustered significantly closer to the origin of replication, with representatives of the genus Celeribacter being the most apparent example. Interestingly, there were also eight species with the opposite organization. In particular, Rhodobaca barguzinensis and Loktanella vestfoldensis showed a significant increase of core genes with distance from the origin of replication. The uneven distribution of low-conserved regions is particularly pronounced for genomes in which the halves of one replichore differ in their conserved gene content. Phage integration and horizontal gene transfer partially explain the scattered nature of Rhodobacteraceae genomes. Our findings lay the foundation for a better understanding of bacterial genome evolution and the role of replication therein.

RevDate: 2019-07-13

Lima NCB, Tanmoy AM, Westeel E, et al (2019)

Analysis of isolates from Bangladesh highlights multiple ways to carry resistance genes in Salmonella Typhi.

BMC genomics, 20(1):530 pii:10.1186/s12864-019-5916-6.

BACKGROUND: Typhoid fever, caused by Salmonella Typhi, follows a fecal-oral transmission route and is a major global public health concern, especially in developing countries like Bangladesh. Increasing emergence of antimicrobial resistance (AMR) is a serious issue; the list of treatments for typhoid fever is ever-decreasing. In addition to IncHI1-type plasmids, Salmonella genomic island (SGI) 11 has been reported to carry AMR genes. Although reports suggest a recent reduction in multidrug resistance (MDR) in the Indian subcontinent, the corresponding genomic changes in the background are unknown.

RESULTS: Here, we assembled and annotated complete closed chromosomes and plasmids for 73 S. Typhi isolates using short-length Illumina reads. S. Typhi had an open pan-genome, and the core genome was smaller than previously reported. Considering AMR genes, we identified five variants of SGI11, including the previously reported reference sequence. Five plasmids were identified, including the new plasmids pK91 and pK43; pK43 and pHCM2 were not related to AMR. The pHCM1, pPRJEB21992 and pK91 plasmids carried AMR genes and, along with the SGI11 variants, were responsible for resistance phenotypes. pK91 also contained qnr genes, conferred high ciprofloxacin resistance and was related to the H58-sublineage Bdq, which shows the same phenotype. The presence of plasmids (pHCM1 and pK91) and SGI11 was linked to two H58-lineages, Ia and Bd. Loss of plasmids and integration of resistance genes in genomic islands could contribute to the fitness advantage of lineage Ia isolates.

CONCLUSIONS: Such events may explain why lineage Ia is globally widespread, while the Bd lineage is locally restricted. Further studies are required to understand how these S. Typhi AMR elements spread and generate new variants. Preventive measures such as vaccination programs should also be considered in endemic countries; such initiatives could potentially reduce the spread of AMR.

RevDate: 2019-06-30

Levesque S, de Melo AG, Labrie SJ, et al (2019)

Mobilome of Brevibacterium aurantiacum Sheds Light on Its Genetic Diversity and Its Adaptation to Smear-Ripened Cheeses.

Frontiers in microbiology, 10:1270.

Brevibacterium aurantiacum is an actinobacterium that confers key organoleptic properties to washed-rind cheeses during the ripening process. Although this industrially relevant species has been gaining increasing attention in the past years, its genome plasticity is still understudied due to the unavailability of complete genomic sequences. To add insights on the mobilome of this group, we sequenced the complete genomes of five dairy Brevibacterium strains and one non-dairy strain using PacBio RSII. We performed phylogenetic and pan-genome analyses, including comparisons with other publicly available Brevibacterium genomic sequences. Our phylogenetic analysis revealed that these five dairy strains, previously identified as Brevibacterium linens, belong instead to the B. aurantiacum species. A high number of transposases and integrases were observed in the Brevibacterium spp. strains. In addition, we identified 14 and 12 new insertion sequences (IS) in B. aurantiacum and B. linens genomes, respectively. Several stretches of homologous DNA sequences were also found between B. aurantiacum and other cheese rind actinobacteria, suggesting horizontal gene transfer (HGT). An HGT region from an iRon Uptake/Siderophore Transport Island (RUSTI) and an iron uptake composite transposon were found in five B. aurantiacum genomes. These findings suggest that low iron availability in milk is a driving force in the adaptation of this bacterial species to this niche. Moreover, the exchange of iron uptake systems suggests cooperative evolution between cheese rind actinobacteria. We also demonstrated that the integrative and conjugative element BreLI (Brevibacterium Lanthipeptide Island) can excise from the B. aurantiacum SMQ-1417 chromosome. Our comparative genomic analysis suggests that mobile genetic elements played an important role in the adaptation of B. aurantiacum to cheese ecosystems.

RevDate: 2019-06-28

Zhang B, Zhu W, Diao S, et al (2019)

The poplar pangenome provides insights into the evolutionary history of the genus.

Communications biology, 2:215 pii:474.

The genus Populus comprises a complex amalgam of ancient and modern species that has become a prime model for evolutionary and taxonomic studies. Here we sequenced the genomes of 10 species from five sections of the genus Populus, identified 71 million genomic variations, and observed new correlations between the single-nucleotide polymorphism-structural variation (SNP-SV) density and indel-SV density to complement the SNP-indel density correlation reported in mammals. Disease resistance genes (R genes) with heterozygous loss-of-function (LOF) were significantly enriched in the 10 species, which increased the diversity of poplar R genes during evolution. Heterozygous LOF mutations in the self-incompatibility genes were closely related to the self-fertilization of poplar, suggestive of genomic control of self-fertilization in dioecious plants. The phylogenetic tree built from genome-wide SNPs also showed possible ancient hybridization among species in sections Tacamahaca, Aigeiros, and Leucoides. The pangenome resource also provided information for poplar genetics and breeding.

RevDate: 2019-07-11

Zhang AN, Mao Y, Wang Y, et al (2019)

Mining traits for the enrichment and isolation of not-yet-cultured populations.

Microbiome, 7(1):96 pii:10.1186/s40168-019-0708-4.

BACKGROUND: The lack of pure cultures limits our understanding of 99% of bacteria. Proper interpretation of the genetic and the transcriptional datasets can reveal clues for the enrichment and even isolation of the not-yet-cultured populations. Unraveling such information requires a proper mining method.

RESULTS: Here, we present a method to infer the hidden traits for the enrichment of not-yet-cultured populations. We demonstrate this method using Candidatus Accumulibacter. Our method constructs a whole picture of the carbon, electron, and energy flows in the not-yet-cultured populations from the genomic datasets. Then, it decodes the coordination across the three flows from the transcriptional datasets. Based on this, our method diagnoses the status of the not-yet-cultured populations and provides a strategy to optimize the enrichment systems.

CONCLUSION: Our method could shed light on the exploration of the bacterial dark matter in the environment.

RevDate: 2019-07-10

Québatte M, Dehio C (2019)

Bartonella gene transfer agent: Evolution, function, and proposed role in host adaptation.

Cellular microbiology [Epub ahead of print].

The processes underlying host adaptation by bacterial pathogens remain a fundamental question with relevant clinical, ecological, and evolutionary implications. Zoonotic pathogens of the genus Bartonella constitute an exceptional model to study these aspects. Bartonellae have undergone a spectacular diversification into multiple species resulting from adaptive radiation. Specific adaptations of a complex facultative intracellular lifestyle have enabled the colonisation of distinct mammalian reservoir hosts. This remarkable host adaptability has a multifactorial basis and is thought to be driven by horizontal gene transfer (HGT) and recombination among a limited genus-specific pan genome. Recent functional and evolutionary studies revealed that the conserved Bartonella gene transfer agent (BaGTA) mediates highly efficient HGT and could thus drive this evolution. Here, we review the recent progress made towards understanding BaGTA evolution, function, and its role in the evolution and pathogenesis of Bartonella spp. We notably discuss how BaGTA could have contributed to genome diversification through recombination of beneficial traits that underlie host adaptability. We further address how BaGTA may counter the accumulation of deleterious mutations in clonal populations (Muller's ratchet), which are expected to occur through the recurrent transmission bottlenecks during the complex infection cycle of these pathogens in their mammalian reservoir hosts and arthropod vectors.


ESP Quick Facts

ESP Origins

In the early 1990s, Robert Robbins was a faculty member at Johns Hopkins, where he directed the informatics core of GDB, the human gene-mapping database of the international human genome project. To share papers with colleagues around the world, he set up a small paper-sharing section on his personal web page. This small project evolved into The Electronic Scholarly Publishing Project.

ESP Support

In 1995, Robbins became the VP/IT of the Fred Hutchinson Cancer Research Center in Seattle, WA. Soon after arriving in Seattle, Robbins secured funding, through the ELSI component of the US Human Genome Project, to create the original ESP.ORG web site, with the formal goal of providing free, world-wide access to the literature of classical genetics.

ESP Rationale

Although the methods of molecular biology can seem almost magical to the uninitiated, the original techniques of classical genetics are readily appreciated by one and all: cross individuals that differ in some inherited trait, collect all of the progeny, score their attributes, and propose mechanisms to explain the patterns of inheritance observed.

ESP Goal

In reading the early works of classical genetics, one is drawn, almost inexorably, into ever more complex models, until molecular explanations begin to seem both necessary and natural. At that point, the tools for understanding genome research are at hand. Helping readers reach this point was the original goal of The Electronic Scholarly Publishing Project.

ESP Usage

Usage of the site grew rapidly and has remained high. Faculty began to use the site for their assigned readings. Other on-line publishers, ranging from The New York Times to Nature, referenced ESP materials in their own publications. Nobel laureates (e.g., Joshua Lederberg) regularly used the site and even wrote to suggest changes and improvements.

ESP Content

When the site began, no journals were making their early content available in digital format. As a result, ESP was obliged to digitize classic literature before it could be made available. For many important papers — such as Mendel's original paper or the first genetic map — ESP had to produce entirely new typeset versions of the works, if they were to be available in a high-quality format.

ESP Help

Early support from the DOE component of the Human Genome Project was critically important for getting the ESP project on a firm foundation. Since that funding ended (nearly 20 years ago), the project has been operated as a purely volunteer effort. Anyone wishing to assist in these efforts should send an email to Robbins.

ESP Plans

With the development of methods for adding typeset side notes to PDF files, the ESP project now plans to add annotated versions of some classical papers to its holdings. We also plan to add new reference and pedagogical material. We have already started providing regularly updated, comprehensive bibliographies to the ESP.ORG site.

Electronic Scholarly Publishing
961 Red Tail Lane
Bellingham, WA 98226

E-mail: RJR8222 @ gmail.com

Papers in Classical Genetics

The ESP began as an effort to share a handful of key papers from the early days of classical genetics. Now the collection has grown to include hundreds of papers, in full-text format.

Digital Books

Along with papers on classical genetics, ESP offers a collection of full-text digital books, including many works by Darwin (and even a collection of poetry — Chicago Poems by Carl Sandburg).

Timelines

ESP now offers a much improved and expanded collection of timelines, designed to give the user choice over subject matter and dates.

Biographies

Biographical information about many key scientists.

Selected Bibliographies

Bibliographies on several topics of potential interest to the ESP community are now being automatically maintained and generated on the ESP site.
