Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 28 Feb 2024 at 01:33

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
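The definitions above map directly onto set operations: the core-genome is the intersection of the gene sets of all sequenced members, the pan-genome is their union, and the flex-genome is the difference between the two. The following is a minimal sketch of that bookkeeping, not the method of any tool cited on this page; the strain names, gene names, and the function name `partition_pangenome` are hypothetical and chosen only for illustration.

```python
# Hypothetical gene-content sets for three sequenced members of one "species".
# Strain and gene names are illustrative assumptions, not real data.
strains = {
    "strain_A": {"gyrB", "recA", "rpoB", "blaTEM"},
    "strain_B": {"gyrB", "recA", "rpoB", "tetA"},
    "strain_C": {"gyrB", "recA", "rpoB", "blaTEM", "mrpA"},
}


def partition_pangenome(gene_sets):
    """Split gene content into (core, flex, pan) sets.

    core: genes present in every genome
    flex: genes present in some, but not all, genomes
    pan:  every gene seen in at least one genome
    """
    sets = list(gene_sets.values())
    if not sets:
        # No genomes supplied: all three sets are empty.
        return set(), set(), set()
    pan = set().union(*sets)
    core = set.intersection(*sets)
    flex = pan - core
    return core, flex, pan


core, flex, pan = partition_pangenome(strains)
print("core:", sorted(core))
print("flex:", sorted(flex))
print("pan: ", sorted(pan))
```

Real pan-genome pipelines first have to decide which genes in different genomes count as "the same gene" (typically by clustering orthologs), which this sketch assumes has already been done.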

Created with PubMed® Query: ( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion

Citations: The Papers (from PubMed®)

RevDate: 2024-02-27

Go S, Koo H, Jung M, et al (2024)

Pan-chloroplast genomes for accession-specific marker development in Hibiscus syriacus.

Scientific data, 11(1):246.

Hibiscus syriacus L. is a renowned ornamental plant. We constructed 95 chloroplast genomes of H. syriacus L. cultivars using a short-read sequencing platform (Illumina) and a long-read sequencing platform (Oxford Nanopore Technology). Following genome assembly, we delineated quadripartite structures encompassing large single-copy, small single-copy, and inverted repeat (IRa and IRb) regions, from 160,231 bp to 161,041 bp. Our comprehensive analyses confirmed the presence of 79 protein-coding genes, 30 tRNA genes, and 4 rRNA genes in the pan-chloroplast genome, consistent with prior research on the H. syriacus chloroplast genome. Subsequent pangenome analysis unveiled widespread genome sequence conservation alongside unique cultivar-specific variant patterns consisting of 193 single-nucleotide polymorphisms and 61 insertions or deletions. The region containing intra-species variant patterns, as identified in this study, has the potential to support the development of accession-specific molecular markers, enhancing precision in cultivar classification. These findings are anticipated to drive advancements in breeding strategies, augment biodiversity, and unlock the agricultural potential inherent in H. syriacus.

RevDate: 2024-02-27

Dong X, Jia H, Yu Y, et al (2024)

Genomic revisitation and reclassification of the genus Providencia.

mSphere [Epub ahead of print].

Members of Providencia, although typically opportunistic, can cause severe infections in immunocompromised hosts. Recent advances in genome sequencing provide an opportunity for more precise study of this genus. In this study, we first identified and characterized a novel species named Providencia zhijiangensis sp. nov. It has ≤88.23% average nucleotide identity (ANI) and ≤31.8% in silico DNA-DNA hybridization (dDDH) values with all known Providencia species, which fall significantly below the species-defining thresholds. Interestingly, we found that Providencia stuartii and Providencia thailandensis actually fall under the same species, evidenced by an ANI of 98.59% and a dDDH value of 90.4%. By fusing ANI with phylogeny, we have reclassified 545 genomes within this genus into 20 species, including seven unnamed taxa (provisionally titled Taxon 1-7), which can be further subdivided into 23 lineages. Pangenomic analysis identified 1,550 genus-core genes in Providencia, with coenzymes being the predominant category at 10.56%, suggesting significant intermediate metabolism activity. Resistance analysis revealed that most lineages of the genus (82.61%, 19/23) carry a high number of antibiotic-resistance genes (ARGs) and display diverse resistance profiles. Notably, the majority of ARGs are located on plasmids, underscoring the significant role of plasmids in the resistance evolution within this genus. Three species or lineages (P. stuartii, Taxon 3, and Providencia hangzhouensis L12) that possess the highest number of carbapenem-resistance genes suggest their potential influence on clinical treatment. These findings underscore the need for continued surveillance and study of this genus, particularly due to their role in harboring antibiotic-resistance genes.

IMPORTANCE: The Providencia genus, known to harbor opportunistic pathogens, has been a subject of interest due to its potential to cause severe infections, particularly in vulnerable individuals. Our research offers groundbreaking insights into this genus, unveiling a novel species, Providencia zhijiangensis sp. nov., and highlighting the need for a re-evaluation of existing classifications. Our comprehensive genomic assessment offers a detailed classification of 545 genomes into distinct species and lineages, revealing the rich biodiversity and intricate species diversity within the genus. The substantial presence of antibiotic-resistance genes in the Providencia genus underscores potential challenges for public health and clinical treatments. Our study highlights the pressing need for increased surveillance and research, enriching our understanding of antibiotic resistance in this realm.

RevDate: 2024-02-27

Kim M, Kim W, Park Y, et al (2024)

Lineage-specific evolution of Aquibium, a close relative of Mesorhizobium, during habitat adaptation.

Applied and environmental microbiology [Epub ahead of print].

The novel genus Aquibium that lacks nitrogenase was recently reclassified from the Mesorhizobium genus. The genomes of Aquibium species isolated from water were smaller and had higher GC contents than those of Mesorhizobium species. Six Mesorhizobium species lacking nitrogenase were found to exhibit low similarity in the average nucleotide identity values to the other 24 Mesorhizobium species. Therefore, they were classified as the non-N2-fixing Mesorhizobium lineage (N-ML), an evolutionary intermediate species. The results of our phylogenomic analyses and the loss of Rhizobiales-specific fur/mur indicated that Mesorhizobium species may have evolved from Aquibium species through an ecological transition. Halotolerant and alkali-resistant Aquibium and Mesorhizobium microcysteis belonging to N-ML possessed many tripartite ATP-independent periplasmic transporter and sodium/proton antiporter subunits composed of seven genes (mrpABCDEFG). These genes were not present in the N2-fixing Mesorhizobium lineage (ML), suggesting that genes acquired for adaptation to highly saline and alkaline environments were lost during the evolution of ML as the habitat changed to soil. Land-to-water habitat changes in Aquibium species, close relatives of Mesorhizobium species, could have influenced their genomic evolution by the gain and loss of genes. Our study indicated that lineage-specific evolution could have played a significant role in shaping their genome architecture and conferring their ability to thrive in different habitats.

IMPORTANCE: Phylogenetic analyses revealed that the Aquibium lineage (AL) and non-N2-fixing Mesorhizobium lineage (N-ML) were monophyletically grouped into distinct clusters separate from the N2-fixing Mesorhizobium lineage (ML). The N-ML, an evolutionary intermediate species having characteristics of both ancestral and descendant species, could provide a genomic snapshot of the genetic changes that occur during adaptation. Genomic analyses of AL, N-ML, and ML revealed that changes in the levels of genes related to transporters, chemotaxis, and nitrogen fixation likely reflect adaptations to different environmental conditions. Our study sheds light on the complex and dynamic nature of the evolution of rhizobia in response to changes in their environment and highlights the crucial role of genomic analysis in understanding these processes.

RevDate: 2024-02-27

Seo B, Jeon K, Kim WK, et al (2024)

Strain-Specific Anti-Inflammatory Effects of Faecalibacterium prausnitzii Strain KBL1027 in Koreans.

Probiotics and antimicrobial proteins [Epub ahead of print].

Faecalibacterium prausnitzii is one of the most dominant commensal bacteria in the human gut, and certain anti-inflammatory functions have been attributed to a single microbial anti-inflammatory molecule (MAM). Simultaneously, substantial diversity among F. prausnitzii strains is acknowledged, emphasizing the need for strain-level functional studies aimed at developing innovative probiotics. Here, two distinct F. prausnitzii strains, KBL1026 and KBL1027, were isolated from Korean donors, exhibiting notable differences in the relative abundance of F. prausnitzii. Both strains were identified as the core Faecalibacterium amplicon sequence variant (ASV) within the healthy Korean cohort, and their MAM sequences showed a high similarity of 98.6%. However, when a single strain was introduced to mice with dextran sulfate sodium (DSS)-induced colitis, KBL1027 showed the most significant ameliorative effects, including alleviation of colonic inflammation and restoration of gut microbial dysbiosis. Moreover, the supernatant from KBL1027 elevated the secretion of IL-10 cytokine more than that of KBL1026 in mouse bone marrow-derived macrophage (BMDM) cells, suggesting that the strain-specific, anti-inflammatory efficacy of KBL1027 might involve effector compounds other than MAM. Through analysis of the Faecalibacterium pan-genome and comparative genomics, strain-specific functions related to extracellular polysaccharide biosynthesis were identified in KBL1027, which could contribute to the observed morphological disparities. Collectively, our findings highlight the strain-specific, anti-inflammatory functions of F. prausnitzii, even within the same core ASV, emphasizing the influence of their human origin.

RevDate: 2024-02-27

Kogay R, Wolf YI, Koonin EV (2024)

Defense systems and horizontal gene transfer in bacteria.

bioRxiv : the preprint server for biology pii:2024.02.09.579689.

Horizontal gene transfer (HGT) is a fundamental process in the evolution of prokaryotes, making major contributions to diversification and adaptation. Typically, HGT is facilitated by mobile genetic elements (MGEs), such as conjugative plasmids and phages that generally impose fitness costs on their hosts. However, a substantial fraction of bacterial genes is involved in defense mechanisms that limit the propagation of MGEs, raising the possibility that they can actively restrict HGT. Here we examine whether defense systems curb HGT by exploring the connections between HGT rate and the presence of 73 defense systems in 12 bacterial species. We found that only 6 defense systems, 3 of which are different CRISPR-Cas subtypes, are associated with the reduced gene gain rate on the scale of species evolution. The hosts of such defense systems tend to have a smaller pangenome size and harbor fewer phage-related genes compared to genomes lacking these systems, suggesting that these defense mechanisms inhibit HGT by limiting the integration of prophages. We hypothesize that restriction of HGT by defense systems is species-specific and depends on various ecological and genetic factors, including the burden of MGEs and fitness effect of HGT in bacterial populations.

RevDate: 2024-02-26

Huy NQ, Linh NC, Son NT, et al (2024)

Genomic insights into an extensively drug-resistant and hypervirulent Burkholderia dolosa N149 isolate of a novel sequence type (ST2237) from a Vietnamese patient hospitalized for stroke.

Journal of global antimicrobial resistance pii:S2213-7165(24)00036-5 [Epub ahead of print].

OBJECTIVES: Burkholderia dolosa is a clinically important opportunistic pathogen in inpatients. Here we characterized an extensively drug-resistant and hypervirulent B. dolosa isolate from a patient hospitalized for stroke.

METHODS: Resistance to 41 antibiotics was tested with the agar disc diffusion, minimum inhibitory concentration, or broth microdilution method. The complete genome was assembled using short-reads and long-reads and the hybrid de novo assembly method. Allelic profiles obtained by multilocus sequence typing were analyzed using the PubMLST database. Antibiotic-resistance and virulence genes were predicted in silico using public databases and the "baargin" workflow. B. dolosa N149 phylogenetic relationships with all available B. dolosa strains and Burkholderia cepacia complex strains were analyzed using the pangenome obtained with Roary.

RESULTS: B. dolosa N149 displayed extensive resistance to 31 antibiotics and intermediate resistance to 4 antibiotics. The complete genome included three circular chromosomes (6,338,630 bp in total) and one plasmid (167,591 bp). Genotypic analysis revealed various gene clusters (acr, amr, amp, emr, ade, bla and tet) associated with resistance to 35 antibiotic classes. The major intrinsic resistance mechanisms were multidrug efflux pump alterations, inactivation and reduced permeability of targeted antibiotics. Moreover, 91 virulence genes (encoding proteins involved in adherence, formation of capsule, biofilm and colony, motility, phagocytosis inhibition, secretion systems, protease secretion, transmission and quorum sensing) were identified. B. dolosa N149 was assigned to a novel sequence type (ST2237) and formed a mono-phylogenetic clade separated from other B. dolosa strains.

CONCLUSION: This study provided insights into the antimicrobial resistance and virulence mechanisms of B. dolosa.

RevDate: 2024-02-26

Selvaraj Anand S, Wu CT, Bremer J, et al (2024)

Identification of a novel CG307 sub-clade in third-generation-cephalosporin-resistant Klebsiella pneumoniae causing invasive infections in the USA.

Microbial genomics, 10(2):.

Despite the notable clinical impact, recent molecular epidemiology regarding third-generation-cephalosporin-resistant (3GC-R) Klebsiella pneumoniae in the USA remains limited. We performed whole-genome sequencing of 3GC-R K. pneumoniae bacteraemia isolates collected from March 2016 to May 2022 at a tertiary care cancer centre in Houston, TX, USA, using Illumina and Oxford Nanopore Technologies platforms. A comprehensive comparative genomic analysis was performed to dissect population structure, transmission dynamics and pan-genomic signatures of our 3GC-R K. pneumoniae population. Of the 178 3GC-R K. pneumoniae bacteraemias that occurred during our study time frame, we were able to analyse 153 (86 %) bacteraemia isolates, 126 initial and 27 recurrent isolates. While isolates belonging to the widely prevalent clonal group (CG) 258 were rarely observed, the predominant CG, 307, accounted for 37 (29 %) index isolates and displayed a significant correlation (Pearson correlation test P value=0.03) with the annual frequency of 3GC-R K. pneumoniae bacteraemia. Interestingly, only 11 % (4/37) of CG307 isolates belonged to the commonly detected 'Texas-specific' clade that has been observed in previous Texas-based K. pneumoniae antimicrobial-resistance surveillance studies. We identified nearly half of our CG307 isolates (n=18) belonged to a novel, monophyletic CG307 sub-clade characterized by the chromosomally encoded bla SHV-205 and unique accessory genome content. This CG307 sub-clade was detected in various regions of the USA, with genome sequences from 24 additional strains becoming recently available in the National Center for Biotechnology Information (NCBI) SRA database. Collectively, this study underscores the emergence and dissemination of a distinct CG307 sub-clade that is a prevalent cause of 3GC-R K. pneumoniae bacteraemia among cancer patients seen in Houston, TX, and has recently been isolated throughout the USA.

RevDate: 2024-02-25

van Westerhoven AC, Aguilera-Galvez C, Nakasato-Tagami G, et al (2024)

Segmental duplications drive the evolution of accessory regions in a major crop pathogen.

The New phytologist [Epub ahead of print].

Many pathogens evolved compartmentalized genomes with conserved core and variable accessory regions (ARs) that carry effector genes mediating virulence. The fungal plant pathogen Fusarium oxysporum has such ARs, often spanning entire chromosomes. The presence of specific ARs influences the host range, and horizontal transfer of ARs can modify the pathogenicity of the receiving strain. However, how these ARs evolve in strains that infect the same host remains largely unknown. We defined the pan-genome of 69 diverse F. oxysporum strains that cause Fusarium wilt of banana, a significant constraint to global banana production, and analyzed the diversity and evolution of the ARs. Accessory regions in F. oxysporum strains infecting the same banana cultivar are highly diverse, and we could not identify any shared genomic regions and in planta-induced effectors. We demonstrate that segmental duplications drive the evolution of ARs. Furthermore, we show that recent segmental duplications specifically in accessory chromosomes cause the expansion of ARs in F. oxysporum. Taken together, we conclude that extensive recent duplications drive the evolution of ARs in F. oxysporum, which contribute to the evolution of virulence.

RevDate: 2024-02-24

Straková D, Sánchez-Porro C, de la Haba RR, et al (2024)

Decoding the Genomic Profile of the Halomicroarcula Genus: Comparative Analysis and Characterization of Two Novel Species.

Microorganisms, 12(2):.

The genus Halomicroarcula, classified within the family Haloarculaceae, presently comprises eight haloarchaeal species isolated from diverse saline habitats, such as solar salterns, hypersaline soils, marine salt, and marine algae. Here, a detailed taxogenomic study and comparative genomic analysis of the genus Halomicroarcula was carried out. In addition, two strains, designated S1CR25-12[T] and S3CR25-11[T], that were isolated from hypersaline soils located in the Odiel Saltmarshes in Huelva (Spain) were included in this study. The 16S rRNA and rpoB' gene sequence analyses affiliated the two strains to the genus Halomicroarcula. Typically, the species of the genus Halomicroarcula possess multiple heterogeneous copies of the 16S rRNA gene, which can lead to misclassification of the taxa and overestimation of the prokaryotic diversity. In contrast, the application of overall genome relatedness indexes (OGRIs) augments the capacity for the precise taxonomic classification and categorization of prokaryotic organisms. The relatedness indexes of the two new isolates, particularly digital DNA-DNA hybridization (dDDH), orthologous average nucleotide identity (OrthoANI), and average amino acid identity (AAI), confirmed that strains S1CR25-12[T] (= CECT 30620[T] = CCM 9252[T]) and S3CR25-11[T] (= CECT 30621[T] = CCM 9254[T]) constitute two novel species of the genus Halomicroarcula. The names Halomicroarcula saliterrae sp. nov. and Halomicroarcula onubensis sp. nov. are proposed for S1CR25-12[T] and S3CR25-11[T], respectively. Metagenomic fragment recruitment analysis, conducted using seven shotgun metagenomic datasets, revealed that the species belonging to the genus Halomicroarcula were predominantly recruited from hypersaline soils found in the Odiel Saltmarshes and the ponds of salterns with high salt concentrations. This reinforces the understanding of the extreme halophilic characteristics associated with the genus Halomicroarcula. 
Finally, comparing pan-genomes across the twenty Halomicroarcula and Haloarcula species allowed for the identification of commonalities and differences between the species of these two related genera.

RevDate: 2024-02-24

Rhoads DD, Pummill J, Alrubaye AAK (2024)

Molecular Genomic Analyses of Enterococcus cecorum from Sepsis Outbreaks in Broilers.

Microorganisms, 12(2): pii:microorganisms12020250.

Extensive genomic analyses of Enterococcus cecorum isolates from sepsis outbreaks in broilers suggest a polyphyletic origin, likely arising from core genome mutations rather than gene acquisition. This species is a normal member of the intestinal flora of avian species, with particular isolates associated with osteomyelitis. More recently, this species has been associated with sepsis outbreaks affecting broilers during the first 3 weeks post-hatch. Understanding the genetic and management basis of this new phenotype is critical for developing strategies to mitigate this emerging problem. Phylogenomic analyses of 227 genomes suggest that sepsis isolates are polyphyletic and closely related to both commensal and osteomyelitis isolate genomes. Pangenome analyses detect no gene acquisitions that distinguish all the sepsis isolates. Core genome single nucleotide polymorphism analyses have identified a number of mutations, affecting the protein-coding sequences, that are enriched in sepsis isolates. The analysis of the protein substitutions supports the mutational origins of sepsis isolates.

RevDate: 2024-02-24

Nedashkovskaya O, Balabanova L, Otstavnykh N, et al (2024)

In-Depth Genome Characterization and Pan-Genome Analysis of Strain KMM 296, a Producer of Highly Active Alkaline Phosphatase; Proposal for the Reclassification of Cobetia litoralis and Cobetia pacifica as the Later Heterotypic Synonyms of Cobetia amphilecti and Cobetia marina, and Emended Description of the Species Cobetia amphilecti and Cobetia marina.

Biomolecules, 14(2): pii:biom14020196.

A strictly aerobic, Gram-stain-negative, rod-shaped, and motile bacterium, designated strain KMM 296, isolated from the coelomic fluid of the mussel Crenomytilus grayanus, was investigated in detail due to its ability to produce a highly active alkaline phosphatase CmAP of the structural family PhoA. A previous taxonomic study allocated the strain to the species Cobetia marina, a member of the family Halomonadaceae of the class Gammaproteobacteria. However, 16S rRNA gene sequencing showed KMM 296's relatedness to Cobetia amphilecti NRIC 0815[T]. The isolate grew with 0.5-19% NaCl at 4-42 °C and hydrolyzed Tweens 20 and 40 and L-tyrosine. The DNA G+C content was 62.5 mol%. The prevalent fatty acids were C18:1 ω7c, C12:0 3-OH, C18:1 ω7c, C12:0, and C17:0 cyclo. The polar lipid profile was characterized by the presence of phosphatidylethanolamine, phosphatidylglycerol, phosphatidic acid, and also an unidentified aminolipid, phospholipid, and a few unidentified lipids. The major respiratory quinone was Q-8. According to phylogenomic and chemotaxonomic evidence, and the nearest neighbors, the strain KMM 296 represents a member of the species C. amphilecti. The genome-based analysis of C. amphilecti NRIC 0815[T] and C. litoralis NRIC 0814[T] showed their belonging to a single species. In addition, the high similarity between the C. pacifica NRIC 0813[T] and C. marina LMG 2217[T] genomes suggests their affiliation to one species. Based on the rules of priority, C. litoralis should be reclassified as a later heterotypic synonym of C. amphilecti, and C. pacifica is a later heterotypic synonym of C. marina. The emended descriptions of the species C. amphilecti and C. marina are also proposed.

RevDate: 2024-02-24

Evseev PV, Shneider MM, Kolupaeva LV, et al (2024)

New Obolenskvirus Phages Brutus and Scipio: Biology, Evolution, and Phage-Host Interaction.

International journal of molecular sciences, 25(4): pii:ijms25042074.

Two novel virulent phages of the genus Obolenskvirus infecting Acinetobacter baumannii, a significant nosocomial pathogen, have been isolated and studied. Phages Brutus and Scipio were able to infect A. baumannii strains belonging to the K116 and K82 capsular types, respectively. The biological properties and genomic organization of the phages were characterized. Comparative genomic, phylogenetic, and pangenomic analyses were performed to investigate the relationship of Brutus and Scipio to other bacterial viruses and to trace the possible origin and evolutionary history of these phages and other representatives of the genus Obolenskvirus. The investigation of enzymatic activity of the tailspike depolymerase encoded in the genome of phage Scipio, the first reported virus infecting A. baumannii of the K82 capsular type, was performed. The study of new representatives of the genus Obolenskvirus and mechanisms of action of depolymerases encoded in their genomes expands knowledge about the diversity of viruses within this taxonomic group and strategies of Obolenskvirus-host bacteria interaction.

RevDate: 2024-02-24

Sepich-Poore GD, McDonald D, Kopylova E, et al (2024)

Robustness of cancer microbiome signals over a broad range of methodological variation.

Oncogene [Epub ahead of print].

In 2020, we identified cancer-specific microbial signals in The Cancer Genome Atlas (TCGA) [1]. Multiple peer-reviewed papers independently verified or extended our findings [2-12]. Given this impact, we carefully considered concerns by Gihawi et al. [13] that batch correction and database contamination with host sequences artificially created the appearance of cancer type-specific microbiomes. (1) We tested batch correction by comparing raw and Voom-SNM-corrected data per-batch, finding predictive equivalence and significantly similar features. We found consistent results with a modern microbiome-specific method (ConQuR [14]), and when restricting to taxa found in an independent, highly-decontaminated cohort. (2) Using Conterminator [15], we found low levels of human contamination in our original databases (~1% of genomes). We demonstrated that the increased detection of human reads in Gihawi et al. [13] was due to using a newer human genome reference. (3) We developed Exhaustive, a method twice as sensitive as Conterminator, to clean RefSeq. We comprehensively host-deplete TCGA with many human (pan)genome references. We repeated all analyses with this and the Gihawi et al. [13] pipeline, and found cancer type-specific microbiomes. These extensive re-analyses and updated methods validate our original conclusion that cancer type-specific microbial signatures exist in TCGA, and show they are robust to methodology.

RevDate: 2024-02-24

Patakova P, Vasylkivska M, Sedlar K, et al (2024)

Whole genome sequencing and characterization of Pantoea agglomerans DBM 3797, endophyte, isolated from fresh hop (Humulus lupulus L.).

Frontiers in microbiology, 15:1305338.

BACKGROUND: This paper brings new information about the genome and phenotypic characteristics of Pantoea agglomerans strain DBM 3797, isolated from fresh Czech hop (Humulus lupulus) in the Saaz hop-growing region. Although P. agglomerans strains are frequently isolated from different materials, they are usually not thoroughly characterized, even though they have versatile metabolism and those isolated from plants may have considerable potential for application in agriculture as a support culture for plant growth.

METHODS: P. agglomerans DBM 3797 was cultured under aerobic and anaerobic conditions, its metabolites were analyzed by HPLC, and it was tested for plant growth promotion abilities, such as phosphate solubilization and siderophore and indole-3-acetic acid production. In addition, genomic DNA was extracted and sequenced, and de novo assembly was performed. Further, genome annotation, pan-genome analysis and selected genome analyses, such as CRISPR array detection and identification of antibiotic-resistance and secondary-metabolite genes, were carried out.

RESULTS AND DISCUSSION: The typical appearance characteristics of the strain include the formation of symplasmata in submerged liquid culture and the formation of pale yellow colonies on agar. The genetic information of the strain (in total 4.8 Mb) is divided between a chromosome and two plasmids. The strain lacks any CRISPR-Cas system but is equipped with four restriction-modification systems. The phenotypic analysis focused on growth under both aerobic and anaerobic conditions, as well as traits associated with plant growth promotion. At both levels (genomic and phenotypic), the production of siderophores, indoleacetic acid-derived growth promoters, gluconic acid, and enzyme activities related to the degradation of complex organic compounds were found. Extracellular gluconic acid production under aerobic conditions (up to 8 g/l) is probably the result of glucose oxidation by the membrane-bound pyrroloquinoline quinone-dependent enzyme glucose dehydrogenase. The strain has a number of properties potentially beneficial to the hop plant and its closest relatives include the strains also isolated from the aerial parts of plants, yet its safety profile needs to be addressed in follow-up research.

RevDate: 2024-02-23

Miao J, Wei X, Cao C, et al (2024)

Pig pangenome graph reveals functional features of non-reference sequences.

Journal of animal science and biotechnology, 15(1):32.

BACKGROUND: The reliance on a solitary linear reference genome has imposed a significant constraint on our comprehensive understanding of genetic variation in animals. This constraint is particularly pronounced for non-reference sequences (NRSs), which have not been extensively studied.

RESULTS: In this study, we constructed a pig pangenome graph using 21 pig assemblies and identified 23,831 NRSs with a total length of 105 Mb. Our findings revealed that NRSs were more prevalent in breeds exhibiting greater genetic divergence from the reference genome. Furthermore, we observed that NRSs were rarely found within coding sequences, while NRS insertions were enriched in immune-related Gene Ontology terms. Notably, our investigation also unveiled a close association between novel genes and the immune capacity of pigs. We observed substantial differences in the frequencies of NRSs between Eastern and Western pigs, and the heat-resistant pigs exhibited a substantial number of NRS insertions in an 11.6 Mb interval on chromosome X. Additionally, we discovered a 665 bp insertion in the fourth intron of the TNFRSF19 gene that may be associated with heat tolerance in Southern Chinese pigs.

CONCLUSIONS: Our findings demonstrate the potential of a graph genome approach to reveal important functional features of NRSs in pig populations.

RevDate: 2024-02-22

Pena-Fernández N, Ocejo M, van der Graaf-van Bloois L, et al (2024)

Comparative pangenomic analysis of Campylobacter fetus isolated from Spanish bulls and other mammalian species.

Scientific reports, 14(1):4347.

Campylobacter fetus comprises two closely related mammal-associated subspecies: Campylobacter fetus subsp. fetus (Cff) and Campylobacter fetus subsp. venerealis (Cfv). The latter causes bovine genital campylobacteriosis, a sexually-transmitted disease endemic in Spain that results in significant economic losses in the cattle industry. Here, 33 C. fetus Spanish isolates were whole-genome sequenced and compared with 62 publicly available C. fetus genomes from other countries. Genome-based taxonomic identification revealed high concordance with in silico PCR, confirming Spanish isolates as Cff (n = 4), Cfv (n = 9) and Cfv biovar intermedius (Cfvi, n = 20). MLST analysis assigned the Spanish isolates to 6 STs, including three novel: ST-76 and ST-77 for Cfv and ST-78 for Cff. Core genome SNP phylogenetic analysis of the 95 genomes identified multiple clusters, revealing associations at subspecies and biovar level between genomes with the same ST and separating the Cfvi genomes from Spain and other countries. A genome-wide association study identified pqqL as a Cfv-specific gene and a potential candidate for more accurate identification methods. Functionality analysis revealed variations in the accessory genome of C. fetus subspecies and biovars that deserve further studies. These results provide valuable information about the regional variants of C. fetus present in Spain and the genetic diversity and predicted functionality of the different subspecies.

RevDate: 2024-02-22

Arizala D, M Arif (2024)

Impact of homologous recombination on core genome evolution and host adaptation of Pectobacterium parmentieri.

Genome biology and evolution pii:7612553 [Epub ahead of print].

Homologous recombination is a major force driving bacterial evolution, host adaptability and acquisition of novel virulence traits. Pectobacterium parmentieri is a plant bacterial pathogen distributed worldwide, primarily affecting potatoes, by causing soft rot and blackleg diseases. The goal of this investigation was to understand the impact of homologous recombination on the genomic evolution of P. parmentieri. Analysis of P. parmentieri genomes using Roary revealed a dynamic pan-genome with 3,742 core genes and over 55% accessory genome variability. Bayesian population structure analysis identified seven lineages, indicating species heterogeneity. ClonalFrameML analysis displayed 5,125 recombination events, with lineage 4 exhibiting the most events. fastGEAR analysis identified 486 ancestral and 941 recent recombination events ranging from 43 bp to 119 kb and from 36 bp to 13.96 kb, respectively, suggesting ongoing adaptation. Notably, 11% (412 genes) of the core genome underwent recent recombination, with lineage 1 as the main donor. The prevalence of recent recombination events (roughly double the number of ancestral events) implies continuous adaptation, possibly driven by global potato trade. Recombination events were found in genes involved in vital cellular processes (DNA replication, DNA repair, RNA processing, homeostasis, and metabolism), pathogenicity determinants (type secretion systems, cell-wall degrading enzymes, iron scavengers, lipopolysaccharides, flagellum, etc.), antimicrobial compounds (phenazine and colicin) and even CRISPR-Cas genes. Overall, these results emphasize the potential role of homologous recombination in P. parmentieri's evolutionary dynamics, influencing host colonization, pathogenicity, adaptive immunity, and ecological fitness.
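
To make the core/accessory split reported above more concrete, the following is a minimal Python sketch that bins genes by the fraction of genomes carrying them. The thresholds (core at 99% or more, soft core 95-99%, shell 15-95%, cloud below 15%) are an assumption modeled on the categories Roary conventionally reports; the function name and the toy presence/absence table are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_genes(presence, n_genomes):
    """Bin each gene by the fraction of genomes that carry it.

    presence: dict mapping gene name -> set of genome IDs carrying the gene.
    The cut-offs are an assumption modeled on Roary-style summary categories.
    """
    classes = {}
    for gene, genomes in presence.items():
        frac = len(genomes) / n_genomes
        if frac >= 0.99:
            classes[gene] = "core"
        elif frac >= 0.95:
            classes[gene] = "soft_core"
        elif frac >= 0.15:
            classes[gene] = "shell"
        else:
            classes[gene] = "cloud"
    return classes


# Toy data: 20 hypothetical genomes g0..g19 and four made-up gene labels.
genomes = [f"g{i}" for i in range(20)]
presence = {
    "geneA": set(genomes),       # present in 20/20 genomes
    "geneB": set(genomes[:19]),  # present in 19/20 genomes
    "geneC": set(genomes[:8]),   # present in 8/20 genomes
    "geneD": set(genomes[:2]),   # present in 2/20 genomes
}
classes = classify_genes(presence, len(genomes))
summary = Counter(classes.values())  # number of genes per category
```

In this sketch everything that is not core counts as accessory, so the accessory share of a pan-genome would be the number of non-core genes divided by the total number of gene families.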

RevDate: 2024-02-22

Tariq DE (2024)

Pangenomic analyses of tuberculosis strains to identify resistomes using computational approaches.

JPMA. The Journal of the Pakistan Medical Association, 74(1 (Supple-2)):S74-S78.

OBJECTIVE: To locate resistomes in tuberculosis strains, to determine the severity of drug resistance, and to infer its implications with respect to high tuberculosis prevalence in a Third World setting.

METHODS: The pangenomic study was conducted from October 2022 to January 2023 in Sir Syed University of Engineering and Technology, Karachi, and comprised 2012-22 data on multiple sequence alignment to assess the genetic evolution of tuberculosis strains. Antibiotic resistance drug classes were identified using the Comprehensive Antibiotic Resistance Database, which entailed multidrug-resistant and extensively drug-resistant strains. GenBank was used to obtain tuberculosis genome FASTA (fast-all; nucleotide and protein sequence representation) files, resistome sequences were predicted on the basis of the Comprehensive Antibiotic Resistance Database, and multiple sequence alignment was done in Mauve.

RESULTS: Evolutionarily, the 6 strains identified were structurally similar with polymorphisms in their core chromosomal regions. Their resistome genes showed perfect hits for isoniazid, rifamycin, cephalosporin, fluoroquinolone, aminoglycosides, penem, penam and cephamycin.

CONCLUSION: Drugs discovered in antibiotic resistance genes are now less effective in treatment, and have the potential to develop into more dangerous bacteria, if not monitored. For treatment, staying long durations in hospitals for quality healthcare and supervision in third world countries is unaffordable.

RevDate: 2024-02-22

Turco S, Russo S, Pietrucci D, et al (2024)

High clonality of Mycobacterium avium subsp. paratuberculosis field isolates from red deer revealed by two different methodological approaches of comparative genomic analysis.

Frontiers in veterinary science, 11:1301667.

Mycobacterium avium subsp. paratuberculosis (MAP) is the aetiological agent of paratuberculosis (Johne's disease) in both domestic and wild ruminants. In the present study, using a whole-genome sequence (WGS) approach, we investigated the genetic diversity of 15 Mycobacterium avium field strains isolated in the last 10 years from red deer inhabiting the Stelvio National Park and affected by paratuberculosis. Combining de novo assembly and a reference-based method, followed by a pangenome analysis, we highlight a very close relationship among 13 MAP field isolates, suggesting that a single infecting event occurred in this population. Moreover, two isolates have been classified as Mycobacterium avium subsp. hominissuis, distinct from the other MAPs under comparison but close to each other. This is the first time that this subspecies has been found in Italy in samples without evident epidemiological correlations, having been isolated in two different locations of the Stelvio National Park and in different years. Our study highlights the importance of a multidisciplinary approach incorporating molecular epidemiology and ecology into traditional infectious disease knowledge in order to investigate the nature of infectious disease in wildlife populations.

RevDate: 2024-02-20

Schreiber M, Jayakodi M, Stein N, et al (2024)

Plant pangenomes for crop improvement, biodiversity and evolution.

Nature reviews. Genetics [Epub ahead of print].

Plant genome sequences catalogue genes and the genetic elements that regulate their expression. Such inventories further research aims as diverse as mapping the molecular basis of trait diversity in domesticated plants or inquiries into the origin of evolutionary innovations in flowering plants millions of years ago. The transformative technological progress of DNA sequencing in the past two decades has enabled researchers to sequence ever more genomes with greater ease. Pangenomes - complete sequences of multiple individuals of a species or higher taxonomic unit - have now entered the geneticists' toolkit. The genomes of crop plants and their wild relatives are being studied with translational applications in breeding in mind. But pangenomes are applicable also in ecological and evolutionary studies, as they help classify and monitor biodiversity across the tree of life, deepen our understanding of how plant species diverged and show how plants adapt to changing environments or new selection pressures exerted by human beings.

RevDate: 2024-02-20

Truong TC, Park H, Kim JH, et al (2024)

The evolutionary phylodynamics of human parechovirus A type 3 reveal multiple recombination events in South Korea.

Journal of medical virology, 96(2):e29477.

Human parechovirus A (HPeV-A) is a causative agent of respiratory and gastrointestinal illnesses, acute flaccid paralysis, encephalitis, meningitis, and neonatal sepsis. To clarify the characteristics of HPeV-A infection in children, 391 fecal specimens were collected from January 2014 to October 2015 from patients with acute gastroenteritis in Seoul, South Korea. Of these, 221/391 (56.5%) HPeV-A positive samples were found in children less than 2 years old. Three HPeV-A genotypes HPeV-A1 (117/221; 52.94%), HPeV-A3 (100/221; 45.25%), and HPeV-A6 (4/221; 1.81%) were detected, among which HPeV-A3 was predominant with the highest recorded value of 58.6% in 2015. Moreover, recombination events in the Korean HPeV-A3 strains were detected. Phylogenetic analysis revealed that the capsid-encoding regions and noncapsid gene 2A of the four Korean HPeV-A3 strains are closely related to the HPeV-A3 strains isolated in Canada in 2007 (Can82853-01), Japan in 2008 (A308/99), and Taiwan in 2011 (TW-03067-2011) while noncapsid genes P2 (2B-2C) and P3 (3A-3D) are closely related to those of HPeV-A1 strains BNI-788St (Germany in 2008) and TW-71594-2010 (Taiwan in 2010). This first report on the whole-genome analysis of HPeV-A3 in Korea provides insight into the evolving status and pathogenesis of HPeVs in children.

RevDate: 2024-02-20

Cooper HB, Vezina B, Hawkey J, et al (2024)

A validated pangenome-scale metabolic model for the Klebsiella pneumoniae species complex.

Microbial genomics, 10(2):.

The Klebsiella pneumoniae species complex (KpSC) is a major source of nosocomial infections globally with high rates of resistance to antimicrobials. Consequently, there is growing interest in understanding virulence factors and their association with cellular metabolic processes for developing novel anti-KpSC therapeutics. Phenotypic assays have revealed metabolic diversity within the KpSC, but metabolism research has been neglected due to experiments being difficult and cost-intensive. Genome-scale metabolic models (GSMMs) represent a rapid and scalable in silico approach for exploring metabolic diversity, which compile genomic and biochemical data to reconstruct the metabolic network of an organism. Here we use a diverse collection of 507 KpSC isolates, including representatives of globally distributed clinically relevant lineages, to construct the most comprehensive KpSC pan-metabolic model to date, KpSC pan v2. Candidate metabolic reactions were identified using gene orthology to known metabolic genes, prior to manual curation via extensive literature and database searches. The final model comprised a total of 3550 reactions, 2403 genes and can simulate growth on 360 unique substrates. We used KpSC pan v2 as a reference to derive strain-specific GSMMs for all 507 KpSC isolates, and compared these to GSMMs generated using a prior KpSC pan-reference (KpSC pan v1) and two single-strain references. We show that KpSC pan v2 includes a greater proportion of accessory reactions (8.8 %) than KpSC pan v1 (2.5 %). GSMMs derived from KpSC pan v2 also generate more accurate growth predictions, with high median accuracies of 95.4 % (aerobic, n=37 isolates) and 78.8 % (anaerobic, n=36 isolates) for 124 matched carbon substrates. 
KpSC pan v2 is freely available at https://github.com/kelwyres/KpSC-pan-metabolic-model, representing a valuable resource for the scientific community, both as a source of curated metabolic information and as a reference to derive accurate strain-specific GSMMs. The latter can be used to investigate the relationship between KpSC metabolism and traits of interest, such as reservoirs, epidemiology, drug resistance or virulence, and ultimately to inform novel KpSC control strategies.

RevDate: 2024-02-20

Benning S, Pritsch K, Radl V, et al (2024)

(Pan)genomic analysis of two Rhodococcus isolates and their role in phenolic compound degradation.

Microbiology spectrum [Epub ahead of print].

The genus Rhodococcus is recognized for its potential to degrade a large range of aromatic substances, including plant-derived phenolic compounds. We used comparative genomics in the context of the broader Rhodococcus pan-genome to study genomic traits of two newly described Rhodococcus strains (type-strain Rhodococcus pseudokoreensis R79[T] and Rhodococcus koreensis R85) isolated from apple rhizosphere. Of particular interest was their ability to degrade phenolic compounds as part of an integrated approach to treat apple replant disease (ARD) syndrome. The pan-genome of the genus Rhodococcus based on 109 high-quality genomes was open with a small core (1.3%) consisting of genes assigned to basic cell functioning. The range of genome sizes in Rhodococcus was high, from 3.7 to 10.9 Mbp. Genomes from host-associated strains were generally smaller compared to environmental isolates which were characterized by exceptionally large genome sizes. Due to large genomic differences, we propose the reclassification of distinct groups of rhodococci like the Rhodococcus equi cluster to new genera. Taxonomic species affiliation was the most important factor in predicting genetic content and clustering of the genomes. Additionally, we found genes that discriminated between the strains based on habitat. All members of the genus Rhodococcus had at least one gene involved in the pathway for the degradation of benzoate, while biphenyl degradation was mainly restricted to strains in close phylogenetic relationships with our isolates. The ~40% of genes still unclassified in larger Rhodococcus genomes, particularly those of environmental isolates, need more research to explore the metabolic potential of this genus.

IMPORTANCE: Rhodococcus is a diverse, metabolically powerful genus, with high potential to adapt to different habitats due to the linear plasmids and large genome sizes.
The analysis of its pan-genome allowed us to separate host-associated from environmental strains, supporting taxonomic reclassification. It was shown which genes contribute to the differentiation of the genomes based on habitat, which can possibly be used for targeted isolation and screening for desired traits. With respect to apple replant disease (ARD), our isolates showed genome traits that suggest potential for application in reducing plant-derived phenolic substances in soil, which makes them good candidates for further testing against ARD.
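
An "open" pan-genome with a small core, as described in this abstract, is usually judged from how the pan-genome and core-genome sizes change as genomes are added. The sketch below illustrates that bookkeeping on toy data. It is not the authors' pipeline: the function name and gene labels are hypothetical, and real analyses typically average over many random genome orderings, which this sketch omits.

```python
def accumulation(gene_sets):
    """Return (pan_sizes, core_sizes) as genomes are added in the given order.

    gene_sets: list of sets, one set of gene-family labels per genome.
    """
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for genes in gene_sets:
        pan |= genes  # pan-genome: union of everything seen so far
        # core genome: intersection of everything seen so far
        core = set(genes) if core is None else core & genes
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Four hypothetical genomes with made-up gene-family labels.
gene_sets = [
    {"a", "b", "c", "d"},
    {"a", "b", "e"},
    {"a", "c", "f", "g"},
    {"a", "h"},
]
pan_sizes, core_sizes = accumulation(gene_sets)
core_fraction = core_sizes[-1] / pan_sizes[-1]  # share of the pan-genome that is core
```

A pan-genome curve that keeps rising with each added genome while the core curve flattens at a small value is the pattern usually described as open.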

RevDate: 2024-02-20

Lagerstrom KM, Scales NC, EA Hadly (2024)

Impressive pan-genomic diversity of E. coli from a wild animal community near urban development reflects human impacts.

iScience, 27(3):109072.

Human and domesticated animal waste infiltrates global freshwater, terrestrial, and marine environments, widely disseminating fecal microbes, antibiotics, and other chemical pollutants. Emerging evidence suggests that guts of wild animals are being invaded by our microbes, including Escherichia coli, which face anthropogenic selective pressures to gain antimicrobial resistance (AMR) and increase virulence. However, wild animal sources remain starkly under-represented among genomic sequence repositories. We sequenced whole genomes of 145 E. coli isolates from 55 wild and 13 domestic animal fecal samples, averaging 2 (ranging 1-7) isolates per sample, on a preserve imbedded in a human-dominated landscape in California Bay Area, USA, to assess AMR, virulence, and pan-genomic diversity. With single nucleotide polymorphism analyses we predict potential transmission routes. We illustrate the usefulness of E. coli to aid our understanding of and ability to surveil the emergence of zoonotic pathogens created by the mixing of human and wild bacteria in the environment.

RevDate: 2024-02-19

Bolognini D, Halgren A, Lou RN, et al (2024)

Global diversity, recurrent evolution, and recent selection on amylase structural haplotypes in humans.

bioRxiv : the preprint server for biology pii:2024.02.07.579378.

The adoption of agriculture, first documented ∼12,000 years ago in the Fertile Crescent, triggered a rapid shift toward starch-rich diets in human populations. Amylase genes facilitate starch digestion and increased salivary amylase copy number has been observed in some modern human populations with high starch intake, though evidence of recent selection is lacking. Here, using 52 long-read diploid assemblies and short read data from ∼5,600 contemporary and ancient humans, we resolve the diversity, evolutionary history, and selective impact of structural variation at the amylase locus. We find that both salivary and pancreatic amylase genes have higher copy numbers in populations with agricultural subsistence compared to fishing, hunting, and pastoral groups. We identify 28 distinct amylase structural architectures and demonstrate that identical structures have arisen independently multiple times throughout recent human history. Using a pangenome graph-based approach to infer structural haplotypes across thousands of humans, we identify extensively duplicated haplotypes present at higher frequencies in modern agricultural populations. Leveraging 534 ancient human genomes we find that duplication-containing haplotypes have increased in frequency more than seven-fold over the last 12,000 years providing evidence for recent selection in Eurasians at this locus comparable in magnitude to that at lactase. Together, our study highlights the strong impact of the agricultural revolution on human genomes and the importance of long-read sequencing in identifying signatures of selection at structurally complex loci.

RevDate: 2024-02-19

Lypaczewski P, Chac D, Dunmire CN, et al (2024)

Diversity of Vibrio cholerae O1 through the human gastrointestinal tract during cholera.

bioRxiv : the preprint server for biology pii:2024.02.08.579476.

UNLABELLED: Vibrio cholerae O1 causes the diarrheal disease cholera, and the small intestine is the site of active infection. During cholera, cholera toxin is secreted from V. cholerae and induces a massive fluid influx into the small intestine, which causes vomiting and diarrhea. Typically, V. cholerae genomes are sequenced from bacteria passed in stool, but rarely from vomit, a fluid that may more closely represent the site of active infection. We hypothesized that the V. cholerae O1 population bottlenecks along the gastrointestinal tract would result in reduced genetic variation in stool compared to vomit. To test this, we sequenced V. cholerae genomes from ten cholera patients with paired vomit and stool samples. Genetic diversity was low in both vomit and stool, consistent with a single infecting population rather than co-infection with divergent V. cholerae O1 lineages. The number of single nucleotide variants decreased between vomit and stool in four patients, increased in two, and remained unchanged in four. The number of genes encoded in the V. cholerae genome decreased between vomit and stool in eight patients and increased in two. Pangenome analysis of assembled short-read sequencing demonstrated that the toxin-coregulated pilus operon more frequently contained deletions in genomes from vomit compared to stool. However, these deletions were not detected by PCR or long-read sequencing, indicating that interpreting gene presence or absence patterns from short-read data alone may be incomplete. Overall, we found that V. cholerae O1 isolated from stool is genetically similar to V. cholerae recovered from the upper intestinal tract.

IMPORTANCE: Vibrio cholerae O1, the bacterium that causes cholera, is ingested in contaminated food or water and then colonizes the upper small intestine and is excreted in stool. Shed V. cholerae genomes are usually studied, but V. cholerae isolated from vomit may be more representative of where V. cholerae colonizes in the upper intestinal epithelium. V. cholerae may experience bottlenecks, or large reductions in bacterial population sizes or genetic diversity, as it passes through the gut. Passage through the gut may select for distinct V. cholerae mutants that are adapted for survival and gut colonization. We did not find strong evidence for such adaptive mutations, and instead observed that passage through the gut results in modest reductions in V. cholerae genetic diversity, and only in some patients. These results fill a gap in our understanding of the V. cholerae life cycle, transmission, and evolution.

RevDate: 2024-02-19

Yuan C, An T, Li X, et al (2023)

Genomic analysis of Ralstonia pickettii reveals the genetic features for potential pathogenicity and adaptive evolution in drinking water.

Frontiers in microbiology, 14:1272636.

Ralstonia pickettii, the most critical clinical pathogen of the genus Ralstonia, has been identified as a causative agent of numerous harmful infections. Additionally, Ralstonia pickettii demonstrates adaptability to extreme environmental conditions, such as those found in drinking water. In this study, we conducted a comprehensive genomic analysis to investigate the genomic characteristics related to potential pathogenicity and adaptive evolution in drinking water environments of Ralstonia pickettii. Through phylogenetic analysis and population genetic analysis, we divided Ralstonia pickettii into five Groups, two of which were associated with drinking water environments. The open pan-genome with a large and flexible gene repertoire indicated a high genetic plasticity. Significant differences in functional enrichment were observed between the core- and pan-genome of different groups. Diverse mobile genetic elements (MGEs), extensive genomic rearrangements, and horizontal gene transfer (HGT) events played a crucial role in generating genetic diversity. In drinking water environments, Ralstonia pickettii exhibited strong adaptability, and the acquisition of specific adaptive genes was potentially facilitated by genomic islands (GIs) and HGT. Furthermore, environmental pressures drove the adaptive evolution of Ralstonia pickettii, leading to the accumulation of unique mutations in key genes. These mutations may have a significant impact on various physiological functions, particularly carbon metabolism and energy metabolism. The presence of virulence-related elements associated with macromolecular secretion systems, virulence factors, and antimicrobial resistance indicated the potential pathogenicity of Ralstonia pickettii, making it capable of causing multiple nosocomial infections. This study provides comprehensive insights into the potential pathogenicity and adaptive evolution of Ralstonia pickettii in drinking water environments from a genomic perspective.

RevDate: 2024-02-16

Shen L, Liu Y, Chen L, et al (2024)

Genomic basis of environmental adaptation in the widespread poly-extremophilic Exiguobacterium group.

The ISME journal, 18(1):.

Delineating cohesive ecological units and determining the genetic basis for their environmental adaptation are among the most important objectives in microbiology. In the last decade, many studies have been devoted to characterizing the genetic diversity in microbial populations to address these issues. However, the impact of extreme environmental conditions, such as temperature and salinity, on microbial ecology and evolution remains unclear. In order to better understand the mechanisms of adaptation, we studied the (pan)genome of Exiguobacterium, a poly-extremophile bacterium able to grow in a wide range of environments, from permafrost to hot springs. To have the genome for all known Exiguobacterium type strains, we first sequenced those that were not yet available. Using a reverse-ecology approach, we showed how the integration of phylogenomic information, genomic features, gene and pathway enrichment data, regulatory element analyses, protein amino acid composition, and protein structure analyses of the entire Exiguobacterium pangenome allows the sharp delineation of ecological units consisting of mesophilic, psychrophilic, halophilic-mesophilic, and halophilic-thermophilic ecotypes. This in-depth study clarified the genetic basis of the defined ecotypes and identified some key mechanisms driving the environmental adaptation to extreme environments. Our study points the way to organizing the vast microbial diversity into ecologically meaningful units, which, in turn, provides insight into how microbial communities adapt and respond to different environmental conditions in a changing world.

RevDate: 2024-02-16

Wu Z, Li T, Jiang Z, et al (2024)

Human pangenome analysis of sequences missing from the reference genome reveals their widespread evolutionary, phenotypic, and functional roles.

Nucleic acids research pii:7607875 [Epub ahead of print].

Nonreference sequences (NRSs) are DNA sequences present in global populations but absent in the current human reference genome. However, the extent and functional significance of NRSs in the human genomes and populations remain unclear. Here, we de novo assembled 539 genomes from five genetically divergent human populations using long-read sequencing technology, resulting in the identification of 5.1 million NRSs. These were merged into 45,284 unique NRSs, with 29.7% being novel discoveries. Among these NRSs, 38.7% were common across the five populations, and 35.6% were population specific. The use of a graph-based pangenome approach allowed for the detection of 565 transcript expression quantitative trait loci on NRSs, with 426 of these being novel findings. Moreover, 26 NRS candidates displayed evidence of adaptive selection within human populations. Genes situated in close proximity to or intersecting with these candidates may be associated with metabolism and type 2 diabetes. Genome-wide association studies revealed 14 NRSs to be significantly associated with eight phenotypes. Additionally, 154 NRSs were found to be in strong linkage disequilibrium with 258 phenotype-associated SNPs in the GWAS catalogue. Our work expands the understanding of human NRSs and provides novel insights into their functions, facilitating evolutionary and biomedical research.

RevDate: 2024-02-16

Bonnie JK, Ahmed OY, B Langmead (2024)

DandD: Efficient measurement of sequence growth and similarity.

iScience, 27(3):109054 pii:S2589-0042(24)00275-X.

Genome assembly databases are growing rapidly. The redundancy of sequence content between a new assembly and previous ones is neither conceptually nor algorithmically easy to measure. We introduce pertinent methods and DandD, a tool addressing how much new sequence is gained when a sequence collection grows. DandD can describe how much structural variation is discovered in each new human genome assembly and when discoveries will level off in the future. DandD uses a measure called δ ("delta"), developed initially for data compression and chiefly dependent on k-mer counts. DandD rapidly estimates δ using genomic sketches. We propose δ as an alternative to k-mer-specific cardinalities when computing the Jaccard coefficient, thereby avoiding the pitfalls of a poor choice of k. We demonstrate the utility of DandD's functions for estimating δ, characterizing the rate of pangenome growth, and computing all-pairs similarities using k-independent Jaccard.
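
The "poor choice of k" pitfall mentioned in this abstract can be illustrated with the ordinary k-mer Jaccard coefficient. The following minimal Python sketch computes that coefficient for two toy sequences at different values of k. It does not implement DandD or its δ measure; the helper names and the example sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k):
    """Jaccard coefficient of the k-mer sets of sequences a and b."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    # Two empty k-mer sets are treated as identical (an arbitrary convention).
    return len(ka & kb) / len(union) if union else 1.0


# Two made-up sequences differing at a single position.
seq_a = "ACGTACGTGA"
seq_b = "ACGTTCGTGA"

# The same pair of sequences yields different similarity values for different k.
similarity_by_k = {k: jaccard(seq_a, seq_b, k) for k in (2, 3, 4)}
```

Because the result depends on k, any comparison built on a single fixed k inherits that choice, which is the issue the abstract says a k-independent measure is meant to avoid.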

RevDate: 2024-02-15

Zhou L, Liu D, Zhu Y, et al (2024)

Advance typing of Vibrio parahaemolyticus through the mtlA and aer gene: A high-resolution, cost-effective approach.

Heliyon, 10(3):e25642.

Vibrio parahaemolyticus is a significant cause of foodborne illness, and its incidence worldwide is on the rise. It is thus imperative to develop a straightforward and efficient method for typing strains of this pathogen. In this study, we conducted a pangenome analysis of 75 complete genomes of V. parahaemolyticus and identified the core gene mtlA with the highest degree of variation, which distinguished 44 strains and outperformed traditional seven-gene-based MLST when combined with aer, another core gene with a high degree of variation. The mtlA gene had higher resolution for typing closely related strains than the traditional MLST genes in the phylogenetic tree built from core genomes. Strong positive selection was also detected in the gene mtlA (ω > 1), representing adaptive evolution in response to the environment. Therefore, the panel of the mtlA and aer genes may serve as a tool for the typing of V. parahaemolyticus, potentially contributing to the prevention and control of this foodborne disease.

RevDate: 2024-02-14

Leonard AS, Mapel XM, H Pausch (2024)

Pangenome genotyped structural variation improves molecular phenotype mapping in cattle.

Genome research pii:gr.278267.123 [Epub ahead of print].

Expression and splicing quantitative trait loci (e/sQTL) are large contributors to phenotypic variability. Achieving sufficient statistical power for e/sQTL mapping requires large cohorts with both genotypes and molecular phenotypes, and so the genomic variation is often called from short-read alignments which are unable to comprehensively resolve structural variation. Here we build a pangenome from 16 HiFi haplotype-resolved assemblies to identify small and structural variation and genotype them with PanGenie in 307 short-read samples. We find high (>90%) concordance of PanGenie-genotyped and DeepVariant-called small variation, and confidently genotype close to 21M small and 43k structural variants in the larger population. We validate 85% of these structural variants (with MAF>0.1) directly with a subset of 25 short-read samples that also have medium coverage HiFi reads. We then conduct e/sQTL mapping with this comprehensive variant set in a subset of 117 cattle that have testis transcriptome data and find 92 structural variants as causal candidates for eQTL and 73 for sQTL. We find that roughly half of top associated structural variants affecting expression or splicing are transposable elements, such as SV-eQTLs for STN1 and MYH7 and SV-sQTLs for CEP89 and ASAH2. Extensive linkage disequilibrium between small and structural variation results in only 28 additional eQTL and 17 sQTL discovered when including SVs, although many top associated SVs are compelling candidates.

RevDate: 2024-02-14

Raghuram V, Petit RA, Karol Z, et al (2024)

Average Nucleotide Identity based Staphylococcus aureus strain grouping allows identification of strain-specific genes in the pangenome.

bioRxiv : the preprint server for biology pii:2024.01.29.577756.

UNLABELLED: Staphylococcus aureus causes both hospital and community acquired infections in humans worldwide. Due to the high incidence of infection S. aureus is also one of the most sampled and sequenced pathogens today, providing an outstanding resource to understand variation at the bacterial subspecies level. We processed and downsampled 83,383 public S. aureus Illumina whole genome shotgun sequences and 1,263 complete genomes to produce 7,954 representative substrains. Pairwise comparison of core gene Average Nucleotide Identity (ANI) revealed a natural boundary of 99.5% that could be used to define 145 distinct strains within the species. We found that intermediate frequency genes in the pangenome (present in 10-95% of genomes) could be divided into those closely linked to strain background ("strain-concentrated") and those highly variable within strains ("strain-diffuse"). Non-core genes had different patterns of chromosome location; notably, strain-diffuse associated with prophages, strain-concentrated with the vSaβ genome island and rare genes (<10% frequency) concentrated near the origin of replication. Antibiotic genes were enriched in the strain-diffuse class, while virulence genes were distributed between strain-diffuse, strain-concentrated, core and rare classes. This study shows how different patterns of gene movement help create strains as distinct subspecies entities and provide insight into the diverse histories of important S. aureus functions.

IMPORTANCE: We analyzed the genomic diversity of Staphylococcus aureus, a globally prevalent bacterial species that causes serious infections in humans. Our goal was to build a genetic picture of the different strains of S. aureus and which genes may be associated with them. We used a large public dataset (>84,000 genomes) that was re-processed and subsampled to remove redundancy. We found that individual genomes could be grouped into strains by sharing > 99.5% identical nucleotide sequence of the core part of their genome. We also showed that a portion of genes that are present in intermediate frequency in the species are strongly associated with some strains but completely absent from others, suggesting a role in strain-specificity. This work lays the foundation for understanding individual gene histories of the S. aureus species and also outlines strategies for processing large bacterial genomic datasets.
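
As a rough illustration of grouping genomes by a core-gene ANI cut-off such as the 99.5% boundary reported here, the sketch below links any two genomes whose pairwise ANI exceeds the threshold and reports the resulting connected groups. The abstract does not state which clustering procedure was used, so single-linkage grouping via union-find is an assumption made for illustration; the function name, genome labels and ANI values are hypothetical.

```python
def group_by_ani(names, ani, threshold=99.5):
    """Single-linkage grouping: join genomes whose pairwise ANI exceeds threshold.

    names: list of genome labels.
    ani: dict mapping (name_a, name_b) -> percent identity for each compared pair.
    """
    parent = {n: n for n in names}

    def find(x):
        # Walk up to the group representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in ani.items():
        if value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Made-up pairwise ANI values for four hypothetical genomes.
names = ["s1", "s2", "s3", "s4"]
ani = {
    ("s1", "s2"): 99.8,
    ("s2", "s3"): 99.6,
    ("s1", "s3"): 99.4,
    ("s1", "s4"): 98.7,
    ("s2", "s4"): 98.8,
    ("s3", "s4"): 98.9,
}
strain_groups = group_by_ani(names, ani)
```

Note that single linkage places s1 and s3 in the same group through s2 even though their direct ANI is below the threshold; a stricter scheme such as complete linkage would treat that pair differently.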

RevDate: 2024-02-14

Li X, Wang Y, Cai C, et al (2024)

Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea.

Nature genetics [Epub ahead of print].

Brassica oleracea, globally cultivated for its vegetable crops, consists of very diverse morphotypes, characterized by specialized enlarged organs as harvested products. This makes B. oleracea an ideal model for studying rapid evolution and domestication. We constructed a B. oleracea pan-genome from 27 high-quality genomes representing all morphotypes and their wild relatives. We identified structural variations (SVs) among these genomes and characterized these in 704 B. oleracea accessions using graph-based genome tools. We show that SVs exert bidirectional effects on the expression of numerous genes, either suppressing through DNA methylation or promoting probably by harboring transcription factor-binding elements. The following examples illustrate the role of SVs modulating gene expression: SVs promoting BoPNY and suppressing BoCKX3 in cauliflower/broccoli, suppressing BoKAN1 and BoACS4 in cabbage and promoting BoMYBtf in ornamental kale. These results provide solid evidence for the role of SVs as dosage regulators of gene expression, driving B. oleracea domestication and diversification.

RevDate: 2024-02-12

Chen Y, Li X, Liu Z, et al (2024)

Genomic analysis and experimental pathogenic characterization of Riemerella anatipestifer isolates from chickens in China.

Poultry science, 103(4):103497 pii:S0032-5791(24)00076-2 [Epub ahead of print].

Waterfowl have a high likelihood of being infected with Riemerella anatipestifer. Although the pathogen is found in domestic ducks, turkeys, geese, and wild birds, there is little information available about the consequences of infection during egg laying and hatching in chickens. Here, we present the first report of a novel sequence type of R. anatipestifer S63 isolated from chickens in China. On the basis of pan-genome analysis, we showed S63's genome occupies a distinct branch with other R. anatipestifer isolates from other hosts. Galleria mellonella larval tests indicated that S63 is less virulent than R. anatipestifer Ra36 isolated from ducks. Ducks and hens are susceptible to S63 infection. There is no mortality rate for chickens or ducks, but adult chickens experience neurological symptoms that reduce egg production and hatching rates. In chickens, S63 might be passed vertically from parents to offspring, resulting in "jelly-like" lifeless embryos. Using quantitative PCR, S63 was detected in the brain, liver, reproductive organs, and embryos. As far as we know, this is the first report of R. anatipestifer in hens, a disease that can reduce egg productivity, lower hatching rates, and produce jelly-like lifeless embryos, and the first report to raise the possibility that hens can be infected by roosters via semen.

RevDate: 2024-02-10

Zhang T, Chen X, Yan W, et al (2024)

Comparative Analysis of Chloroplast Pan-Genomes and Transcriptomics Reveals Cold Adaptation in Medicago sativa.

International journal of molecular sciences, 25(3): pii:ijms25031776.

Alfalfa (Medicago sativa) is a perennial forage legume that is widely distributed all over the world; therefore, it has an extremely complex genetic background. Though population structure and phylogenetic studies have been conducted on a large group of alfalfa nuclear genomes, information about the chloroplast genomes is still lacking. Chloroplast genomes are generally considered to be conservative and play an important role in population diversity analysis and species adaptation in plants. Here, 231 complete alfalfa chloroplast genomes were successfully assembled from 359 alfalfa resequencing datasets, on the basis of which the alfalfa chloroplast pan-genome was constructed. We investigated the genetic variations of the alfalfa chloroplast genome through comparative genomic, genetic diversity, phylogenetic, population genetic structure, and haplotype analysis. Meanwhile, the expression of alfalfa chloroplast genes under cold stress was explored through transcriptome analysis. As a result, all 231 alfalfa chloroplast genomes were found to lack an IR region, and the size of the chloroplast genome ranges from 125,192 bp to 126,105 bp. Using population structure, haplotypes, and construction of a phylogenetic tree, it was found that alfalfa populations could be divided into four groups, and multiple highly variable regions were found in the alfalfa chloroplast genome. Transcriptome analysis showed that tRNA genes were significantly up-regulated in the cold-sensitive varieties, while rps7, rpl32, and ndhB were down-regulated, and the editing efficiency of ycf1, ycf2, and ndhF was decreased in the cold-tolerant varieties, which may be due to the fact that chloroplasts store nutrients through photosynthesis to resist cold. The huge number of genetic variants in this study provides powerful resources for molecular markers.

RevDate: 2024-02-10

Andorf CM, Haley OC, Hayford RK, et al (2024)

PanEffect: a pan-genome visualization tool for variant effects in maize.

Bioinformatics (Oxford, England) pii:7604577 [Epub ahead of print].

UNLABELLED: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. Additionally, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement.

AVAILABILITY: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
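
The log-likelihood ratio score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common convention that a variant's score is the log probability of the variant residue minus the log probability of the reference residue at the same position. The function name `llr_score` and the toy probabilities are hypothetical and are not taken from PanEffect or from any protein language model.

```python
import math

def llr_score(probs, reference_aa, variant_aa):
    """Log-likelihood ratio of a variant residue versus the reference residue.

    `probs` maps amino-acid letters to probabilities at one protein position.
    This is a hypothetical input: a real protein language model would supply it.
    """
    return math.log(probs[variant_aa]) - math.log(probs[reference_aa])

# Made-up probabilities at a single position, for illustration only.
position_probs = {"A": 0.60, "G": 0.30, "W": 0.10}

# Score every possible substitution against reference residue "A".
scores = {aa: llr_score(position_probs, "A", aa) for aa in position_probs}
```

Under this convention a score near zero suggests a substitution the model considers about as plausible as the reference residue, while strongly negative scores flag substitutions the model considers unlikely.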

RevDate: 2024-02-09

Bachari A, Nassar N, Telukutla S, et al (2024)

Evaluating the Mechanism of Cell Death in Melanoma Induced by the Cannabis Extract PHEC-66.

Cells, 13(3): pii:cells13030268.

Research suggests the potential of using cannabinoid-derived compounds to function as anticancer agents against melanoma cells. Our recent study highlighted the remarkable in vitro anticancer effects of PHEC-66, an extract from Cannabis sativa, on the MM418-C1, MM329, and MM96L melanoma cell lines. However, the complete molecular mechanism behind this action remains to be elucidated. This study aims to unravel how PHEC-66 brings about its antiproliferative impact on these cell lines, utilising diverse techniques such as real-time polymerase chain reaction (qPCR), assays to assess the inhibition of CB1 and CB2 receptors, measurement of reactive oxygen species (ROS), apoptosis assays, and fluorescence-activated cell sorting (FACS) for apoptosis and cell cycle analysis. The outcomes obtained from this study suggest that PHEC-66 triggers apoptosis in these melanoma cell lines by increasing the expression of pro-apoptotic markers (BAX mRNA) while concurrently reducing the expression of anti-apoptotic markers (Bcl-2 mRNA). Additionally, PHEC-66 induces DNA fragmentation, halting cell progression at the G1 cell cycle checkpoint and substantially elevating intracellular ROS levels. These findings imply that PHEC-66 might have potential as an adjuvant therapy in the treatment of malignant melanoma. However, it is essential to conduct further preclinical investigations to delve deeper into its potential and efficacy.

RevDate: 2024-02-09

Sakurai A, Suzuki M, Hayashi K, et al (2024)

Taxonomic classification of genus Aeromonas using open reading frame-based binarized structure network analysis.

Fujita medical journal, 10(1):8-15.

OBJECTIVES: Taxonomic assignment based on whole-genome sequencing data facilitates clear demarcation of species within a complex genus. Here, we applied a unique pan-genome phylogenetic method, open reading frame (ORF)-based binarized structure network analysis (OSNA), for taxonomic inference of Aeromonas spp., a complex taxonomic group consisting of 30 species.

METHODS: Data from 335 publicly available Aeromonas genomes, including the reference genomes of 30 species, were used to build a phylogenetic tree using OSNA. In OSNA, whole-genome structures are expressed as binary sequences based on the presence or absence of ORFs, and a tree is generated using neighbor-net, a distance-based method for constructing phylogenetic networks from binary sequences. The tree built by OSNA was compared to that constructed by a core-genome single-nucleotide polymorphism (SNP)-based analysis. Furthermore, the orthologous average nucleotide identity (OrthoANI) values of the sequences that clustered in a single clade in the OSNA-based tree were calculated.

RESULTS: The phylogenetic tree constructed with OSNA successfully delineated the majority of species of the genus Aeromonas forming conspecific clades for individual species, which was corroborated by OrthoANI values. Moreover, the OSNA-based phylogenetic tree demonstrated high compositional similarity to the core-genome SNP-based phylogenetic tree, supported by the Fowlkes-Mallows index.

CONCLUSIONS: We propose that OSNA is a useful tool in predicting the taxonomic classification of complex bacterial genera.
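
Two ideas from the abstract above lend themselves to a small sketch: representing each genome as a binary ORF presence/absence sequence, and comparing two groupings of the same genomes with the Fowlkes-Mallows index. The code below is an illustration under stated assumptions, not the OSNA implementation. It uses a normalized Hamming distance as the distance measure and omits the neighbor-net step entirely. The function names and toy data are hypothetical.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Fraction of ORF positions where two binary profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(profiles):
    """Pairwise distances between genomes given {name: binary profile}."""
    names = sorted(profiles)
    dist = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        d = hamming_distance(profiles[p], profiles[q])
        dist[(p, q)] = dist[(q, p)] = d
    return dist

def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index between two clusterings of the same items."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            tp += 1          # pair grouped together in both clusterings
        elif same_a:
            fp += 1          # together only in the first clustering
        elif same_b:
            fn += 1          # together only in the second clustering
    if tp == 0:
        return 0.0
    return tp / ((tp + fp) * (tp + fn)) ** 0.5

# Toy presence (1) / absence (0) profiles over four ORFs (made-up data).
profiles = {"g1": [1, 1, 0, 0], "g2": [1, 1, 0, 1], "g3": [0, 0, 1, 1]}
dist = distance_matrix(profiles)
```

An index of 1.0 means the two clusterings group every pair of genomes identically. Values near 0 mean they rarely agree.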

RevDate: 2024-02-08

Newcomer EP, Fishbein SRS, Zhang K, et al (2024)

Genomic surveillance of Clostridioides difficile transmission and virulence in a healthcare setting.

mBio [Epub ahead of print].

Clostridioides difficile infection (CDI) is a major cause of healthcare-associated diarrhea, despite the widespread implementation of contact precautions for patients with CDI. Here, we investigate strain contamination in a hospital setting and the genomic determinants of disease outcomes. Across two wards over 6 months, we selectively cultured C. difficile from patients (n = 384) and their environments. Whole-genome sequencing (WGS) of 146 isolates revealed that most C. difficile isolates were from clade 1 (131/146, 89.7%), while only one isolate of the hypervirulent ST1 was recovered. Of culture-positive admissions (n = 79), 19 (24%) patients were colonized with toxigenic C. difficile on admission to the hospital. We defined 25 strain networks at ≤2 core gene single nucleotide polymorphisms; two of these networks contain strains from different patients. Strain networks were temporally linked (P < 0.0001). To understand the genomic correlates of the disease, we conducted WGS on an additional cohort of C. difficile (n = 102 isolates) from the same hospital and confirmed that clade 1 isolates are responsible for most CDI cases. We found that while toxigenic C. difficile isolates are associated with the presence of cdtR, nontoxigenic isolates have an increased abundance of prophages. Our pangenomic analysis of clade 1 isolates suggests that while toxin genes (tcdABER and cdtR) were associated with CDI symptoms, they are dispensable for patient colonization. These data indicate that toxigenic and nontoxigenic C. difficile contamination persist in a hospital setting and highlight further investigation into how accessory genomic repertoires contribute to C. difficile colonization and disease.

IMPORTANCE: Clostridioides difficile infection remains a leading cause of hospital-associated diarrhea, despite increased antibiotic stewardship and transmission prevention strategies. This suggests a changing genomic landscape of C. difficile. Our study provides insight into the nature of prevalent C. difficile strains in a hospital setting and transmission patterns among carriers. Longitudinal sampling of surfaces and patient stool revealed that both toxigenic and nontoxigenic strains of C. difficile clade 1 dominate these two wards. Moreover, quantification of transmission in carriers of these clade 1 isolates underscores the need to revisit infection prevention measures in this patient group. We identified unique genetic signatures associated with virulence in this clade. Our data highlight the complexities of preventing transmission of this pathogen in a hospital setting and the need to investigate the mechanisms of in vivo persistence and virulence of prevalent lineages in the host gut microbiome.
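
The idea of defining strain networks at a core-gene SNP threshold can be sketched as single-linkage grouping: isolates are joined whenever their pairwise SNP distance is at or below the cutoff. This is a minimal illustration that assumes single linkage and keeps singletons. The abstract does not state how the authors built their networks, and the function name, isolate names, and distances are hypothetical.

```python
def strain_networks(isolates, snp_distances, max_snps=2):
    """Group isolates into networks linked by <= max_snps core-gene SNPs.

    `snp_distances` maps (isolate_a, isolate_b) pairs to SNP counts.
    Grouping is single linkage via union-find (an assumption of this sketch).
    """
    parent = {iso: iso for iso in isolates}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), snps in snp_distances.items():
        if snps <= max_snps:
            parent[find(a)] = find(b)   # merge the two networks

    groups = {}
    for iso in isolates:
        groups.setdefault(find(iso), []).append(iso)
    return sorted(sorted(g) for g in groups.values())

# Made-up isolates (patient_isolate) and pairwise SNP counts.
isolates = ["p1_a", "p1_b", "p2_a", "p3_a"]
snp_distances = {
    ("p1_a", "p1_b"): 0, ("p1_a", "p2_a"): 2, ("p1_b", "p2_a"): 2,
    ("p1_a", "p3_a"): 40, ("p1_b", "p3_a"): 41, ("p2_a", "p3_a"): 39,
}
networks = strain_networks(isolates, snp_distances)
```

In this toy example the first network spans two patients, which is the kind of pattern that would suggest possible transmission.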

RevDate: 2024-02-07

Zhong C, Hu G, Hu C, et al (2024)

Comparative genomics analysis reveals genetic characteristics and nitrogen fixation profile of Bradyrhizobium.

iScience, 27(2):108948 pii:S2589-0042(24)00169-X.

Bradyrhizobium is a genus of nitrogen-fixing bacteria, with some species producing nodules in leguminous plants. Investigations into Bradyrhizobium have recently revealed its substantial genetic resources and agricultural benefits, but a comprehensive survey of its genetic diversity and functional properties is lacking. Using a panel of various strains (N = 278), this study performed a comparative genomics analysis to anticipate genes linked with symbiotic nitrogen fixation. Bradyrhizobium's pan-genome consisted of 84,078 gene families, containing 824 core genes and 42,409 accessory genes. Core genes were mainly involved in crucial cell processes, while accessory genes served diverse functions, including nitrogen fixation and nodulation. Three distinct genetic profiles were identified based on the presence/absence of gene clusters related to nodulation, nitrogen fixation, and secretion systems. Most Bradyrhizobium strains from soil and non-leguminous plants lacked major nif/nod genes and were evolutionarily more closely related. These findings shed light on Bradyrhizobium's genetic features for symbiotic nitrogen fixation.
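
The split of a pan-genome into core and accessory gene families can be sketched from a presence/absence table. The definitions below are assumptions made for illustration, since the abstract does not give the study's exact thresholds: core means present in every strain, unique means present in exactly one strain, and accessory means everything in between. The function name and the strain assignments are hypothetical.

```python
def classify_gene_families(presence, n_strains):
    """Split gene families into core, accessory, and unique classes.

    `presence` maps a gene-family name to the set of strains carrying it.
    """
    classes = {"core": [], "accessory": [], "unique": []}
    for family, strains in sorted(presence.items()):
        count = len(strains)
        if count == n_strains:
            classes["core"].append(family)        # found in every strain
        elif count == 1:
            classes["unique"].append(family)      # found in a single strain
        else:
            classes["accessory"].append(family)   # found in some strains
    return classes

# Made-up strain assignments for three gene families.
presence = {
    "dnaA": {"s1", "s2", "s3"},
    "nifH": {"s1", "s2"},
    "nodD": {"s1"},
}
classes = classify_gene_families(presence, n_strains=3)
```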

RevDate: 2024-02-02

Zheng Z, Zhu M, Zhang J, et al (2024)

A sequence-aware merger of genomic structural variations at population scale.

Nature communications, 15(1):960.

Merging structural variations (SVs) at the population level presents a significant challenge, yet it is essential for conducting comprehensive genotypic analyses, especially in the era of pangenomics. Here, we introduce PanPop, a tool that utilizes an advanced sequence-aware SV merging algorithm to efficiently merge SVs of various types. We demonstrate that PanPop can merge and optimize the majority of multiallelic SVs into informative biallelic variants. We show its superior precision and lower rates of missing data compared to alternative software solutions. Our approach not only enables the filtering of SVs by leveraging multiple SV callers for enhanced accuracy but also facilitates the accurate merging of large-scale population SVs. These capabilities of PanPop will help to accelerate future SV-related studies.

RevDate: 2024-02-02

Chen P, Wang S, Li H, et al (2024)

Comparative genomic analyses of Cutibacterium granulosum provide insights into genomic diversity.

Frontiers in microbiology, 15:1343227.

Cutibacterium granulosum, a commensal bacterium found on human skin, formerly known as Propionibacterium granulosum, rarely causes infections and is generally considered non-pathogenic. Recent research has revealed the transferability of the multidrug-resistant plasmid pTZC1 between C. granulosum and Cutibacterium acnes, the latter being an opportunistic pathogen in surgical site infections. However, there is a noticeable lack of research on the genome of C. granulosum, and the genetic landscape of this species remains largely uncharted. We investigated the genomic features and evolutionary structure of C. granulosum by analyzing a total of 30 Metagenome-Assembled Genomes (MAGs) and isolate genomes retrieved from public databases, as well as those generated in this study. A pan-genome of 6,077 genes was identified for C. granulosum. Remarkably, the 'cloud genes' constituted 62.38% of the pan-genome. Genes associated with mobilome: prophages, transposons [X], defense mechanisms [V] and replication, recombination and repair [L] were enriched in the cloud genome. Phylogenomic analysis revealed two distinct mono-clades, highlighting the genomic diversity of C. granulosum. The genomic diversity was further confirmed by the distribution of Average Nucleotide Identity (ANI) values. The functional profiles analysis of C. granulosum unveiled a wide range of potential Antibiotic Resistance Genes (ARGs) and virulence factors, suggesting its potential tolerance to various environmental challenges. Subtype I-E of the CRISPR-Cas system was the most abundant in these genomes, a feature also detected in C. acnes genomes. Given the widespread distribution of C. granulosum strains within skin microbiome, our findings make a substantial contribution to our broader understanding of the genetic diversity, which may open new avenues for investigating the mechanisms and treatment of conditions such as acne vulgaris.

RevDate: 2024-02-01

Hayeck TJ, Li Y, Mosbruger TL, et al (2024)

The Impact of Patterns in Linkage Disequilibrium and Sequencing Quality on the Imprint of Balancing Selection.

Genome biology and evolution pii:7596324 [Epub ahead of print].

Regions under balancing selection are characterized by dense polymorphisms and multiple persistent haplotypes, along with other sequence complexities. Successful identification of these patterns depends on both the statistical approach and the quality of sequencing. To address this challenge, a new statistical method called LD-ABF was first developed, employing efficient Bayesian techniques to effectively test for balancing selection. LD-ABF demonstrated the most robust detection of selection in a variety of simulation scenarios, compared against a range of existing tests/tools (Tajima's D, HKA, Dng, BetaScan, and BalLerMix). Furthermore, the impact of the quality of sequencing on detection of balancing selection was also explored using: 1) SNP genotyping and exome data, 2) targeted high-resolution HLA genotyping (IHIW), and 3) whole-genome long-read sequencing data (Pangenome). In the analysis of SNP genotyping and exome data, we identified known targets and 38 new selection signatures in genes not previously linked to balancing selection. To further investigate the impact of sequencing quality on detection of balancing selection, a detailed investigation of the MHC was performed with high-resolution HLA typing data. Higher quality sequencing revealed that the HLA-DQ genes consistently demonstrated strong selection signatures otherwise not observed from the sparser SNP array and exome data. The HLA-DQ selection signature was also replicated in the Pangenome samples using considerably fewer samples but with high-quality long-read sequence data. The improved statistical method, coupled with higher quality sequencing, leads to more consistent identification of selection and enhanced localization of variants under selection, particularly in complex regions.

RevDate: 2024-02-01

Lee J, Cha IT, Lee KE, et al (2024)

Complete genome sequence and potential pathogenic assessment of Flavobacterium plurextorum RSG-18 isolated from the gut of Schlegel's black rockfish, Sebastes schlegelii.

Environmental microbiology reports [Epub ahead of print].

Flavobacterium plurextorum is a potential fish pathogen of interest, previously isolated from diseased rainbow trout (Oncorhynchus mykiss) and oomycete-infected chum salmon (Oncorhynchus keta) eggs. We report here the first complete genome sequence of F. plurextorum RSG-18 isolated from the gut of Schlegel's black rockfish (Sebastes schlegelii). The genome of RSG-18 consists of a circular chromosome of 5,610,911 bp with a 33.57% GC content, containing 4858 protein-coding genes, 18 rRNAs, 63 tRNAs and 1 tmRNA. A comparative analysis was conducted on 11 Flavobacterium species previously reported as pathogens or isolated from diseased fish to confirm the potential pathogenicity of RSG-18. In the SEED classification, RSG-18 was found to have 36 genes categorized in 'Virulence, Disease and Defense'. Across all Flavobacterium species, a total of 16 antibiotic resistance genes and 61 putative virulence factors were identified. All species had at least one phage region and type I, III and IX secretion systems. In pan-genomic analysis, core genes consist of genes linked to phages, integrases and matrix-tolerated elements associated with pathology. The complete genome sequence of F. plurextorum RSG-18 will serve as a foundation for future research, enhancing our understanding of Flavobacterium pathogenicity in fish and contributing to the development of effective prevention strategies.

RevDate: 2024-01-31

Chen Y, Xiang G, Liu P, et al (2024)

Prevalence and Molecular Characteristics of Ceftazidime-avibactam Resistance among carbapenem-resistant Pseudomonas aeruginosa Clinical Isolates.

Journal of global antimicrobial resistance pii:S2213-7165(24)00014-6 [Epub ahead of print].

BACKGROUND: Resistance against ceftazidime-avibactam (CZA) in carbapenem-resistant Pseudomonas aeruginosa (CRPA) is emerging. This study was aimed at detecting the prevalence and molecular characteristics of CZA-resistant CRPA clinical isolates in Guangdong Province, China.

METHODS: The antimicrobial susceptibility profile of these strains was determined. A subset of sixteen CZA-resistant CRPA isolates was analyzed by whole genome sequencing (WGS). Genetic surroundings of carbapenem resistance genes and pan-genome-wide association analysis were further studied.

RESULTS: Of the 250 CRPA isolates, the CZA resistance rate was 6.4% (16/250). The minimum inhibitory concentration (MIC) of CZA ranged from 0.25 to >256 mg/L. MIC50 and MIC90 were 2/4 and 8/4 mg/L, respectively. Among the sixteen CZA-resistant CRPA strains, 31.3% (5/16) carried class B carbapenem resistance genes including blaIMP-4, blaIMP-45 and blaVIM-2, located on IncP-2 megaplasmids or chromosome, respectively. Pan-genome-wide association analysis of accessory genes for CZA-susceptible or -resistant CRPA isolates showed that PA1874, a hypothetical protein containing BapA prefix-like domain, was significantly enriched in the CZA-resistant group.

CONCLUSIONS: Class B carbapenem resistance genes play important roles in CZA resistance. Meanwhile, the PA1874 gene may be involved in a novel mechanism of CZA resistance. It is necessary to continually monitor CZA-resistant CRPA isolates.

RevDate: 2024-01-31

Kim B, Han SR, Lee H, et al (2023)

Insights into group-specific pattern of secondary metabolite gene cluster in Burkholderia genus.

Frontiers in microbiology, 14:1302236.

Burkholderia is a versatile strain that has expanded into several genera. It has been steadily reported that the genome features of Burkholderia exhibit activities ranging from plant growth promotion to pathogenicity across various isolation areas. The objective of this study was to investigate the secondary metabolite patterns of 366 Burkholderia species through comparative genomics. Samples were selected based on assembly quality assessment and similarity below 80% in average nucleotide identity. Duplicate samples were excluded. Samples were divided into two groups using FastANI analysis. Group A included B. pseudomallei complex. Group B included B. cepacia complex. The limitations of MLST were proposed. The detection of genes was performed, including environmental and virulence-related genes. In the pan-genome analysis, each complex possessed a similar pattern of cluster for orthologous groups. Group A (n = 185) had 14,066 cloud genes, 2,465 shell genes, 682 soft-core genes, and 2,553 strict-core genes. Group B (n = 181) had 39,867 cloud genes, 4,986 shell genes, 324 soft-core genes, 222 core genes, and 2,949 strict-core genes. AntiSMASH was employed to analyze the biosynthetic gene cluster (BGC). The results were then utilized for network analysis using BiG-SCAPE and CORASON. Principal component analysis was conducted and a table was constructed using the results obtained from antiSMASH. The results were divided into Group A and Group B. We expected the various species to show similar patterns of secondary metabolite gene clusters. For in-depth analysis, a network analysis of secondary metabolite gene clusters was conducted, exemplified by BiG-SCAPE analysis. Depending on the species and complex, Burkholderia possessed several kinds of siderophore. Among them, ornibactin was possessed in most Burkholderia and was clustered into 4,062 clans. There was a similar pattern of gene clusters depending on the species. NRPS_04014 belonged to siderophore BGCs including ornibactin and indigoidine. However, it was observed that each family included a similar species. This suggests that, besides siderophores being species-specific, the ornibactin gene cluster itself might also be species-specific. The results suggest that siderophores are associated with environmental adaptation, possessing a similar pattern of siderophore gene clusters among species, which could provide another perspective on species-specific environmental adaptation mechanisms.

RevDate: 2024-01-30

Joubert PM, KV Krasileva (2024)

Distinct genomic contexts predict gene presence-absence variation in different pathotypes of Magnaporthe oryzae.

Genetics pii:7593594 [Epub ahead of print].

Fungi use the accessory gene content of their pangenomes to adapt to their environments. While gene presence-absence variation (PAV) contributes to shaping accessory gene reservoirs, the genomic contexts that shape these events remain unclear. Since pangenome studies are typically species-wide and do not analyze different populations separately, it is yet to be uncovered whether PAV patterns and mechanisms are consistent across populations. Fungal plant pathogens are useful models for studying PAV because they rely on it to adapt to their hosts, and members of a species often infect distinct hosts. We analyzed gene PAV in the blast fungus, Magnaporthe oryzae (syn. Pyricularia oryzae), and found that PAV genes involved in host-pathogen and microbe-microbe interactions may drive the adaptation of the fungus to its environment. We then analyzed genomic and epigenomic features of PAV and observed that proximity to transposable elements, gene GC content, gene length, expression level in the host, and histone H3K27me3 marks were different between PAV genes and conserved genes. We used these features to construct a model that was able to predict whether a gene is likely to experience PAV with high precision (86.06%) and recall (92.88%) in M. oryzae. Finally, we found that PAV genes in the rice and wheat pathotypes of M. oryzae differed in their number and their genomic context. Our results suggest that genomic and epigenomic features of gene PAV can be used to better understand and predict fungal pangenome evolution. We also show that substantial intra-species variation can exist in these features.
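
The precision and recall figures quoted in the abstract above follow the standard definitions: precision is the fraction of predicted PAV genes that truly show PAV, and recall is the fraction of true PAV genes that the model recovers. The sketch below computes both from binary labels. The function name and the toy labels are hypothetical and unrelated to the study's data or model.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks a PAV gene."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted PAV
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # conserved gene called PAV
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # PAV gene that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels for six genes.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
```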

RevDate: 2024-01-28

Zaccaron AZ, I Stergiopoulos (2024)

Analysis of five near-complete genome assemblies of the tomato pathogen Cladosporium fulvum uncovers additional accessory chromosomes and structural variations induced by transposable elements effecting the loss of avirulence genes.

BMC biology, 22(1):25.

BACKGROUND: Fungal plant pathogens have dynamic genomes that allow them to rapidly adapt to adverse conditions and overcome host resistance. One way by which this dynamic genome plasticity is expressed is through effector gene loss, which enables plant pathogens to overcome recognition by cognate resistance genes in the host. However, the exact nature of these losses remains elusive in many fungi. This includes the tomato pathogen Cladosporium fulvum, which is the first fungal plant pathogen from which avirulence (Avr) genes were ever cloned and in which loss of Avr genes is often reported as a means of overcoming recognition by cognate tomato Cf resistance genes. A recent near-complete reference genome assembly of C. fulvum isolate Race 5 revealed a compartmentalized genome architecture and the presence of an accessory chromosome, thereby creating a basis for studying genome plasticity in fungal plant pathogens and its impact on avirulence genes.

RESULTS: Here, we obtained near-complete genome assemblies of four additional C. fulvum isolates. The genome assemblies had similar sizes (66.96 to 67.78 Mb), number of predicted genes (14,895 to 14,981), and estimated completeness (98.8 to 98.9%). Comparative analysis that included the genome of isolate Race 5 revealed high levels of synteny and colinearity, which extended to the density and distribution of repetitive elements and of repeat-induced point (RIP) mutations across homologous chromosomes. Nonetheless, structural variations, likely mediated by transposable elements and effecting the deletion of the avirulence genes Avr4E, Avr5, and Avr9, were also identified. The isolates further shared a core set of 13 chromosomes, but two accessory chromosomes were identified as well. Accessory chromosomes were significantly smaller in size, and one carried pseudogenized copies of two effector genes. Whole-genome alignments further revealed genomic islands of near-zero nucleotide diversity interspersed with islands of high nucleotide diversity that co-localized with repeat-rich regions. These regions were likely generated by RIP, which generally asymmetrically affected the genome of C. fulvum.

CONCLUSIONS: Our results reveal new evolutionary aspects of the C. fulvum genome and provide new insights on the importance of genomic structural variations in overcoming host resistance in fungal plant pathogens.

RevDate: 2024-01-26

Rajput J, Chandra G, C Jain (2024)

Co-linear chaining on pangenome graphs.

Algorithms for molecular biology : AMB, 19(1):4.

Pangenome reference graphs are useful in genomics because they compactly represent the genetic diversity within a species, a capability that linear references lack. However, efficiently aligning sequences to these graphs with complex topology and cycles can be challenging. The seed-chain-extend based alignment algorithms use co-linear chaining as a standard technique to identify a good cluster of exact seed matches that can be combined to form an alignment. Recent works show how the co-linear chaining problem can be efficiently solved for acyclic pangenome graphs by exploiting their small width and how incorporating gap cost in the scoring function improves alignment accuracy. However, it remains open on how to effectively generalize these techniques for general pangenome graphs which contain cycles. Here we present the first practical formulation and an exact algorithm for co-linear chaining on cyclic pangenome graphs. We rigorously prove the correctness and computational complexity of the proposed algorithm. We evaluate the empirical performance of our algorithm by aligning simulated long reads from the human genome to a cyclic pangenome graph constructed from 95 publicly available haplotype-resolved human genome assemblies. While the existing heuristic-based algorithms are faster, the proposed algorithm provides a significant advantage in terms of accuracy. Implementation (https://github.com/at-cg/PanAligner).
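
Co-linear chaining, as described in the abstract above, is easiest to see in its classical form on a linear reference. The sketch below is that simplified linear version, not the cyclic-graph algorithm from the paper and not code from PanAligner. It assumes anchors are exact matches given as (query_start, ref_start, length) tuples. The score rewards the total anchor length and subtracts a gap cost equal to the difference between the query gap and the reference gap. The function name and toy anchors are hypothetical.

```python
def chain_anchors(anchors):
    """Best co-linear chain over (query_start, ref_start, length) anchors.

    Quadratic-time dynamic programming on a linear reference. Anchors in a
    chain must not overlap and must increase in both query and reference.
    """
    anchors = sorted(anchors)
    if not anchors:
        return 0, []
    n = len(anchors)
    best = [length for _, _, length in anchors]   # best chain score ending at i
    prev = [-1] * n                               # back-pointer for traceback
    for i, (qi, ri, li) in enumerate(anchors):
        for j in range(i):
            qj, rj, lj = anchors[j]
            if qj + lj <= qi and rj + lj <= ri:   # j precedes i on both axes
                gap = abs((qi - (qj + lj)) - (ri - (rj + lj)))
                candidate = best[j] + li - gap
                if candidate > best[i]:
                    best[i], prev[i] = candidate, j
    end = max(range(n), key=best.__getitem__)
    chain, i = [], end
    while i != -1:                                # walk back-pointers
        chain.append(anchors[i])
        i = prev[i]
    chain.reverse()
    return best[end], chain

# Made-up anchors; (5, 300, 6) is an off-diagonal match the chain should skip.
anchors = [(0, 100, 10), (12, 112, 8), (5, 300, 6), (25, 126, 10)]
score, chain = chain_anchors(anchors)
```

Subtracting the gap cost is what steers the chain away from anchors that are consistent in order but imply a large insertion or deletion between matches.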

RevDate: 2024-01-26

Mondol SM, Islam I, Islam MR, et al (2024)

Genomic landscape of NDM-1 producing multidrug-resistant Providencia stuartii causing burn wound infections in Bangladesh.

Scientific reports, 14(1):2246.

The increasing antimicrobial resistance in Providencia stuartii (P. stuartii) worldwide, particularly concerning for immunocompromised and burn patients, has raised concern in Bangladesh, where the significance of this infectious opportunistic pathogen had been previously overlooked, prompting a need for investigation. The two strains of P. stuartii (P. stuartii SHNIBPS63 and P. stuartii SHNIBPS71) isolated from wound swabs of two critically injured burn patients were found to be multidrug-resistant, and P. stuartii SHNIBPS63 showed resistance to all the 22 antibiotics tested as well as revealed the co-existence of blaVEB-6 (Class A), blaNDM-1 (Class B), blaOXA-10 (Class D) beta lactamase genes. Complete resistance to carbapenems through the production of NDM-1 is indicative of an alarming situation, as carbapenems are considered to be the last-line antibiotics to combat this pathogen. Both isolates displayed strong biofilm-forming abilities and exhibited resistance to copper, zinc, and iron, in addition to carrying multiple genes associated with metal resistance and the formation of biofilms. The study also encompassed a pangenome analysis utilizing a dataset of eighty-six publicly available P. stuartii genomes (n = 86), revealing evidence of an open or expanding pangenome for P. stuartii. Also, an extensive genome-wide analysis of all the P. stuartii genomes revealed a concerning global prevalence of diverse antimicrobial resistance genes, with a particular alarm raised over the abundance of carbapenem resistance gene blaNDM-1. Additionally, this study highlighted the notable genetic diversity within P. stuartii, significant information about phylogenomic relationships and ancestry, as well as potential for cross-species transmission, raising important implications for public health and microbial adaptation across different environments.

RevDate: 2024-01-25

Barbitoff YA, Ushakov MO, Lazareva TE, et al (2024)

Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges.

Briefings in bioinformatics, 25(2):.

Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.

RevDate: 2024-01-25

Singh S, Singh R, Priyadarsini S, et al (2024)

Genomics empowering conservation action and improvement of celery in the face of climate change.

Planta, 259(2):42.

Integration of genomic approaches like whole genome sequencing, functional genomics, evolutionary genomics, and CRISPR/Cas9-based genome editing has accelerated the improvement of crop plants including leafy vegetables like celery in the face of climate change. Anthropogenic climate change is a real peril to the existence of life forms on our planet, including human and plant life. Climate change is predicted to be a significant threat to biodiversity and food security in the coming decades and is rapidly transforming global farming systems. To avoid the ghastly future in the face of climate change, the elucidation of shifts in the geographical range of plant species, species adaptation, and evolution is necessary for plant scientists to develop climate-resilient strategies. In the post-genomics era, the increasing availability of genomic resources and integration of multifaceted genomics elements is empowering biodiversity conservation action, restoration efforts, and identification of genomic regions adaptive to climate change. Genomics has accelerated the true characterization of crop wild relatives, genomic variations, and the development of climate-resilient varieties to ensure food security for 10 billion people by 2050. In this review, we have summarized the applications of multifaceted genomic tools, like conservation genomics, whole genome sequencing, functional genomics, genome editing, and pangenomics, in the conservation and adaptation of plant species with a focus on celery, an aromatic and medicinal Apiaceae vegetable. We focus on how conservation scientists can utilize genomics and genomic data in conservation and improvement.

RevDate: 2024-01-24

Uruén C, Fernandez A, Arnal JL, et al (2024)

Genomic and phenotypic analysis of invasive Streptococcus suis isolated in Spain reveals genetic diversification and associated virulence traits.

Veterinary research, 55(1):11.

Streptococcus suis is a zoonotic pathogen that causes a major health problem in the pig production industry worldwide. Spain is one of the largest pig producers in the world. This work aimed to investigate the genetic and phenotypic features of invasive S. suis isolates recovered in Spain. A panel of 156 clinical isolates recovered from 13 Autonomous Communities, representing the major pig producers, were analysed. MLST and serotyping analysis revealed that most isolates (61.6%) were assigned to ST1 (26.3%), ST123 (18.6%), ST29 (9.6%), and ST3 (7.1%). Interestingly, 34 new STs were identified, indicating the emergence of novel genetic lineages. Serotypes 9 (27.6%) and 1 (21.8%) prevailed, followed by serotypes 7 (12.8%) and 2 (12.2%). Analysis of 13 virulence-associated genes showed significant associations between ST, serotype, virulence patterns, and clinical features, evidencing particular virulence traits associated with genetic clusters. The pangenome was generated, and the core genome was distributed in 7 Bayesian groups where each group included a variable set of over- and under-represented genes of different categories. The study provides comprehensive data and knowledge to improve the design of new vaccines, antimicrobial treatments, and bacterial typing approaches.

RevDate: 2024-01-24

Kothe CI, Monnet C, Irlinger F, et al (2024)

Halomonas citrativorans sp. nov., Halomonas casei sp. nov. and Halomonas colorata sp. nov., isolated from French cheese rinds.

International journal of systematic and evolutionary microbiology, 74(1):.

Eight Gram-stain-negative bacterial strains were isolated from cheese rinds sampled in France. On the basis of 16S rRNA gene sequence analysis, all isolates were assigned to the genus Halomonas. Phylogenetic investigations, including 16S rRNA gene studies, multilocus sequence analysis, reconstruction of a pan-genome phylogenetic tree with the concatenated core-genome content and average nucleotide identity (ANI) calculations, revealed that they constituted three novel and well-supported clusters. The closest relative species, determined using the whole-genome sequences of the strains, were Halomonas zhanjiangensis for two groups of cheese strains, sharing 82.4 and 93.1 % ANI, and another cluster sharing 92.2 % ANI with the Halomonas profundi type strain. The strains isolated herein differed from the previously described species by ANI values <95 % and several biochemical, enzymatic and colony characteristics. The results of phenotypic, phylogenetic and chemotaxonomic analyses indicated that the isolates belonged to three novel Halomonas species, for which the names Halomonas citrativorans sp. nov., Halomonas casei sp. nov. and Halomonas colorata sp. nov. are proposed, with isolates FME63[T] (=DSM 113315[T]=CIRM-BIA2430[T]=CIP 111880[T]=LMG 32013[T]), FME64[T] (=DSM 113316[T]=CIRM-BIA2431[T]=CIP 111877[T]=LMG 32015[T]) and FME66[T] (=DSM 113318[T]=CIRM-BIA2433[T]=CIP 111876[T]=LMG 32014[T]) as type strains, respectively.

RevDate: 2024-01-23

Teyssonniere EM, Shichino Y, Mito M, et al (2024)

Translation variation across genetic backgrounds reveals a post-transcriptional buffering signature in yeast.

Nucleic acids research pii:7585675 [Epub ahead of print].

Gene expression is known to vary among individuals, and this variability can impact the phenotypic diversity observed in natural populations. While the transcriptome and proteome have been extensively studied, little is known about the translation process itself. Here, we therefore performed ribosome and transcriptomic profiling on a genetically and ecologically diverse set of natural isolates of the Saccharomyces cerevisiae yeast. Interestingly, we found that the Euclidean distances between each profile and the expression fold changes in each pairwise isolate comparison were higher at the transcriptomic level. This observation clearly indicates that the transcriptional variation observed in the different isolates is buffered through a phenomenon known as post-transcriptional buffering at the translation level. Furthermore, this phenomenon seemed to have a specific signature by preferentially affecting essential genes as well as genes involved in complex-forming proteins, and low transcribed genes. We also explored the translation of the S. cerevisiae pangenome and found that the accessory genes related to introgression events displayed similar transcription and translation levels as the core genome. By contrast, genes acquired through horizontal gene transfer events tended to be less efficiently translated. Together, our results highlight both the extent and signature of the post-transcriptional buffering.

RevDate: 2024-01-23

Villani F, Guarracino A, Ward RR, et al (2024)

Pangenome reconstruction in rats enhances genotype-phenotype mapping and novel variant discovery.

bioRxiv : the preprint server for biology pii:2024.01.10.575041.

The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics linked-read data for 31 recombinant inbred rats to study genetic variation and association mapping. The pangenome length was on average 2.4 times greater than the corresponding length of the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex structural variants resolving into multiple haplotypes. Phenome-wide association analysis of validated SNPs uncovered variants associated with glucose/insulin levels and hippocampal gene expression. We propose an interaction between Pirl1l1, Cromogranine expression, TNF-α levels, and insulin regulation. This study demonstrates the utility of linked-read pangenomes for comprehensive variant detection and mapping phenotypic diversity in a widely used rat genetic reference panel.

RevDate: 2024-01-23

Chen F, Yin Y, Chen H, et al (2024)

Global genetic diversity and Asian clades evolution: a phylogeographic study of Staphylococcus aureus sequence type 5.

Antimicrobial agents and chemotherapy [Epub ahead of print].

Staphylococcus aureus sequence type (ST) 5 has spread worldwide; however, phylogeographic studies on the evolution of global phylogenetic and Asian clades of ST5 are lacking. This study included 368 ST5 genome sequences, including 111 newly generated sequences. Primary phylogenetic analysis suggested that there are five clades, and geographical clustering of ST5 methicillin-resistant S. aureus (MRSA) was linked to the acquisition of S. aureus pathogenicity islands (SaPIs; enterotoxin gene island) and integration of the prophage φSa3. The most recent common ancestor of global S. aureus ST5 dates back to the mid-1940s, coinciding with the clinical introduction of penicillin. Bayesian phylogeographic inference traced the ancestry of the Asian ST5 MRSA clade to Japan, from where it may have spread to major cities in China and Korea in the 1990s. Based on a pan-genome-wide association study, the emergence of Asian ST5 clades was attributed to the gain of prophages, SaPIs, and plasmids, as well as the coevolution of resistance genes. Clade IV displayed greater genomic diversity than the Asian MRSA clades. Collectively, our study provides in-depth insights into the global evolution of S. aureus ST5 mainly in China and the United States and reveals that different S. aureus ST5 clades have arisen independently in different parts of the world, with limited geographic dispersal across continents.

RevDate: 2024-01-23

Afordoanyi DM, Akosah YA, Shnakhova L, et al (2023)

Biotechnological Key Genes of the Rhodococcus erythropolis MGMM8 Genome: Genes for Bioremediation, Antibiotics, Plant Protection, and Growth Stimulation.

Microorganisms, 12(1): pii:microorganisms12010088.

Anthropogenic pollution, including residues from the green revolution initially aimed at addressing food security and healthcare, has paradoxically exacerbated environmental challenges. The transition towards comprehensive green biotechnology and bioremediation, achieved with lower financial investment, hinges on microbial biotechnology, with the Rhodococcus genus emerging as a promising contender. The significance of fully annotating genome sequences lies in comprehending strain constituents, devising experimental protocols, and strategically deploying these strains to address pertinent issues using pivotal genes. This study revolves around Rhodococcus erythropolis MGMM8, an associate of winter wheat plants in the rhizosphere. Through the annotation of its chromosomal genome and subsequent comparison with other strains, its potential applications were explored. Using the antiSMASH server, 19 gene clusters were predicted, encompassing genes responsible for antibiotics and siderophores. Antibiotic resistance evaluation via the Comprehensive Antibiotic Resistance Database (CARD) identified five genes (vanW, vanY, RbpA, iri, and folC) that were parallel to strain CCM2595. Leveraging the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) for biodegradation, heavy metal resistance, and remediation genes, the presence of chlorimuron-ethyl, formaldehyde, benzene-desulfurization degradation genes, and heavy metal-related genes (ACR3, arsC, corA, DsbA, modA, and recG) in MGMM8 was confirmed. Furthermore, quorum-quenching signal genes, critical for curbing biofilm formation and virulence elicited by quorum-sensing in pathogens, were also discerned within MGMM8's genome. In light of these predictions, the novel isolate MGMM8 warrants phenotypic assessment to gauge its potential in biocontrol and bioremediation. This evaluation extends to isolating active compounds for potential antimicrobial activities against pathogenic microorganisms. The comprehensive genome annotation process has facilitated the genetic characterization of MGMM8 and has solidified its potential as a biotechnological strain to address global anthropogenic predicaments.

RevDate: 2024-01-23

Godoy M, Montes de Oca M, Suarez R, et al (2023)

Genomics of Re-Emergent Aeromonas salmonicida in Atlantic Salmon Outbreaks.

Microorganisms, 12(1): pii:microorganisms12010064.

Furunculosis, caused by Aeromonas salmonicida, poses a significant threat to both salmonid and non-salmonid fish in diverse aquatic environments. This study explores the genomic intricacies of re-emergent A. salmonicida outbreaks in Atlantic salmon (Salmo salar). Previous clinical cases have exhibited pathological characteristics, such as periorbital hemorrhages and gastrointestinal abnormalities. Genomic sequencing of three Chilean isolates (ASA04, ASA05, and CIBA_5017) and 25 previously described genomes determined the pan-genome, phylogenomics, insertion sequences, and restriction-modification systems. Unique gene families have contributed to an improved understanding of the psychrophilic and mesophilic clades, while phylogenomic analysis has been used to identify mesophilic and psychrophilic strains, thereby further differentiating between typical and atypical psychrophilic isolates. Diverse insertion sequences and restriction-modification patterns have highlighted genomic structural differences, and virulence factor predictions can emphasize exotoxin disparities, especially between psychrophilic and mesophilic strains. Thus, a novel plasmid was characterized which emphasized the role of plasmids in virulence and antibiotic resistance. The analysis of antibiotic resistance factors revealed resistance against various drug classes in Chilean strains. Overall, this study elucidates the genomic dynamics of re-emergent A. salmonicida and provides novel insights into their virulence, antibiotic resistance, and population structure.

RevDate: 2024-01-23

Fan J, Khan J, Singh NP, et al (2024)

Fulgor: a fast and compact k-mer index for large-scale matching and color queries.

Algorithms for molecular biology : AMB, 19(1):3.

The problem of sequence identification or matching (determining the subset of reference sequences from a given collection that are likely to contain a short, queried nucleotide sequence) is relevant for many important tasks in Computational Biology, such as metagenomics and pangenome analysis. Due to the complex nature of such analyses and the large scale of the reference collections, a resource-efficient solution to this problem is of utmost importance. This poses the threefold challenge of representing the reference collection with a data structure that is efficient to query, has light memory usage, and scales well to large collections. To solve this problem, we describe an efficient colored de Bruijn graph index, arising as the combination of a k-mer dictionary with a compressed inverted index. The proposed index takes full advantage of the fact that unitigs in the colored compacted de Bruijn graph are monochromatic (i.e., all k-mers in a unitig have the same set of references of origin, or color). Specifically, the unitigs are kept in the dictionary in color order, thereby allowing for the encoding of the map from k-mers to their colors in as little as 1 + o(1) bits per unitig. Hence, one color per unitig is stored in the index with almost no space/time overhead. By combining this property with simple but effective compression methods for integer lists, the index achieves very small space. We implement these methods in a tool called Fulgor, and conduct an extensive experimental analysis to demonstrate the improvement of our tool over previous solutions. For example, compared to Themisto, the strongest competitor in terms of index space vs. query time trade-off, Fulgor requires significantly less space (up to 43% less space for a collection of 150,000 Salmonella enterica genomes), is at least twice as fast for color queries, and is 2-6× faster to construct.
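
The color-ordering idea in the abstract above can be illustrated with a toy sketch. This is not Fulgor's implementation: the plain Python dictionary, the prefix-sum list standing in for a succinct rank structure, and the names `build_index` and `query_color` are all assumptions made for illustration, and canonical (reverse-complement) k-mers are ignored. It only shows why sorting unitigs by color lets one boundary bit per unitig recover the color of any k-mer.

```python
from itertools import accumulate


def build_index(unitigs, k):
    """Build a toy index from (sequence, frozenset_of_reference_ids) pairs."""
    # Distinct colors (sets of reference ids), placed in a fixed order.
    colors = sorted({color for _, color in unitigs}, key=sorted)
    color_rank = {color: i for i, color in enumerate(colors)}
    # Sort unitigs by color so that unitigs sharing a color are contiguous.
    ordered = sorted(unitigs, key=lambda u: color_rank[u[1]])
    # One bit per unitig: 1 where a new run of equal-colored unitigs starts.
    boundary = [
        1 if i == 0 or ordered[i][1] != ordered[i - 1][1] else 0
        for i in range(len(ordered))
    ]
    # Prefix sums play the role of rank queries over the bit vector.
    rank = list(accumulate(boundary))
    # Hypothetical stand-in for the k-mer dictionary: k-mer -> unitig id.
    kmer_to_unitig = {}
    for uid, (seq, _) in enumerate(ordered):
        for j in range(len(seq) - k + 1):
            kmer_to_unitig[seq[j:j + k]] = uid
    return {"colors": colors, "rank": rank, "kmer_to_unitig": kmer_to_unitig}


def query_color(index, kmer):
    """Return the set of reference ids containing kmer, or None if absent."""
    uid = index["kmer_to_unitig"].get(kmer)
    if uid is None:
        return None
    # The color id is the number of run starts up to this unitig, minus one.
    return index["colors"][index["rank"][uid] - 1]


toy_unitigs = [
    ("ACGTA", frozenset({0, 1})),
    ("TTGCA", frozenset({2})),
    ("GGATC", frozenset({0, 1})),
]
toy_index = build_index(toy_unitigs, k=3)
```

In this sketch the only per-unitig color information kept is the boundary bit, which mirrors the roughly one-bit-per-unitig cost described in the abstract; the color sets themselves are stored once each.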

RevDate: 2024-01-22

Groza C, Schwendinger-Schreck C, Cheung WA, et al (2024)

Pangenome graphs improve the analysis of structural variants in rare genetic diseases.

Nature communications, 15(1):657.

Rare DNA alterations that cause heritable diseases are only partially resolvable by clinical next-generation sequencing due to the difficulty of detecting structural variation (SV) in all genomic contexts. Long-read, high fidelity genome sequencing (HiFi-GS) detects SVs with increased sensitivity and enables assembling personal and graph genomes. We leverage standard reference genomes, public assemblies (n = 94) and a large collection of HiFi-GS data from a rare disease program (Genomic Answers for Kids, GA4K, n = 574 assemblies) to build a graph genome representing a unified SV callset in GA4K, identify common variation and prioritize SVs that are more likely to cause genetic disease (MAF < 0.01). Using graphs, we obtain a higher level of reproducibility than the standard reference approach. We observe over 200,000 SV alleles unique to GA4K, including nearly 1000 rare variants that impact coding sequence. With improved specificity for rare SVs, we isolate 30 candidate SVs in phenotypically prioritized genes, including known disease SVs. We isolate a novel diagnostic SV in KMT2E, demonstrating use of personal assemblies coupled with pangenome graphs for rare disease genomics. The community may interrogate our pangenome with additional assemblies to discover new SVs within the allele frequency spectrum relevant to genetic diseases.

RevDate: 2024-01-22

Jeong J, Ahn S, Truong TC, et al (2024)

Description of Mycolicibacterium arenosum sp. nov. Isolated from Coastal Sand on the Yellow Sea Coast.

Current microbiology, 81(3):73.

A Gram-staining-positive, aerobic, non-spore-forming bacterium was isolated from coastal sand samples from Incheon in the Republic of Korea and designated as strain CAU 1645[T]. The optimum conditions for growth were observed at 30 °C in growth media containing 1% (w/v) NaCl at pH 9.0. The predominant respiratory quinone was MK-9 and the major fatty acids were C16:0, C17:1 w7c, and summed feature 7. Similarly, the 16S rRNA gene sequence exhibited the highest similarity with Mycolicibacterium bacteremicum DSM 45578[T] and Mycolicibacterium neoaurum JCM 6365[T], both of which exhibited similarity rates of 97.2%. The genomic DNA G+C content was 68.2%. The whole genome of strain CAU 1645[T] was obtained and annotated using the RAST server. The pan-genome analysis was performed using Prokka, Roary, and Phandango. In the pan-genome analysis, the strain CAU 1645[T] shared 40 core genes with closely related Mycolicibacterium species, including the AcpM gene, the meromycolate extension acyl carrier protein involved in forming impermeable cell walls in mycobacteria. Therefore, our findings demonstrated that the isolate represents a novel species of the genus Mycolicibacterium, for which we propose the name Mycolicibacterium arenosum sp. nov. The type strain is CAU 1645[T] (= KCTC 49724[T] = MCCC 1K07087[T]).

RevDate: 2024-01-22

Deng Y, Jiang ZM, Han XF, et al (2023)

Corrigendum: Pangenome analysis of the genus Herbiconiux and proposal of four new species associated with Chinese medicinal plants.

Frontiers in microbiology, 14:1295710.

[This corrects the article DOI: 10.3389/fmicb.2023.1119226.].

RevDate: 2024-01-21

Song Z, Ge Y, Yu X, et al (2024)

Development of a SNP-based strain-identified method for Streptococcus thermophilus CICC 6038 and Lactobacillus delbrueckii ssp. bulgaricus CICC 6047 using pan-genomics analysis.

Journal of dairy science pii:S0022-0302(24)00014-6 [Epub ahead of print].

The health benefits conferred by probiotics are specific to individual probiotic strains, highlighting the importance of identifying specific strains for research and production purposes. Streptococcus thermophilus CICC 6038 and Lactobacillus delbrueckii ssp. bulgaricus CICC 6047 are exceedingly valuable for commercial use with an excellent mixed-culture fermentation. To differentiate these 2 strains from other S. thermophilus and L. delbrueckii ssp. bulgaricus, a specific, sensitive, accurate, rapid, convenient, and cost-effective method is required. In this study, we conducted a pan-genome analysis of S. thermophilus and L. delbrueckii ssp. bulgaricus to identify species-specific core genes, along with strain-specific single-nucleotide polymorphisms (SNPs). These genes were used to develop suitable PCR primers, and the conformity of sequence length and unique SNPs was confirmed by sequencing for qualitative identification at the strain level. The results demonstrated that SNP analysis of PCR products derived from these primers could distinguish CICC 6038 and CICC 6047 accurately and reproducibly from the other strains of S. thermophilus and L. delbrueckii ssp. bulgaricus, respectively. The strain-specific PCR method based on SNPs herein is universally applicable for probiotics identification. It offers valuable insights into identifying probiotics at the strain level that is fit-for-purpose in quality control and compliance assessment of commercial dairy products.

RevDate: 2024-01-18

Peng M, Lin W, Zhou A, et al (2024)

High genetic diversity and different type VI secretion systems in Enterobacter species revealed by comparative genomics analysis.

BMC microbiology, 24(1):26.

The human-pathogenic Enterobacter species are widely distributed in diverse environmental conditions; however, the understanding of the virulence factors and genetic variations within the genus is very limited. In this study, we performed comparative genomics analysis of 49 strains originating from diverse niches and belonging to eight Enterobacter species, in order to further understand the mechanism of adaptation to the environment in Enterobacter. The results showed that they had an open pan-genome and high genomic diversity which allowed adaptation to distinctive ecological niches. We found the number of secretion systems was the highest among various virulence factors in these Enterobacter strains. Three types of T6SS gene clusters including T6SS-A, T6SS-B and T6SS-C were detected in most Enterobacter strains. T6SS-A and T6SS-B shared 13 specific core genes, but they had different gene structures, suggesting they probably have different biological functions. Notably, T6SS-C was restricted to E. cancerogenus. We detected a T6SS gene cluster, highly similar to T6SS-C (91.2%), in the remotely related Citrobacter rodentium, suggesting that this unique gene cluster was probably acquired by horizontal gene transfer. The genomes of Enterobacter strains possess high genetic diversity, a limited number of conserved core genes, and multiple copies of T6SS gene clusters with differentiated structures, suggesting that the T6SS originated not by duplication but by independent acquisition. These findings provide valuable information for better understanding of the functional features of Enterobacter species and their evolutionary relationships.
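
As a rough illustration of what an "open" pan-genome means in abstracts such as the one above, the following sketch tracks pan-genome and core-genome sizes as genomes are added one at a time. The gene sets are invented toy data and the function name `pan_and_core` is hypothetical; published analyses typically cluster orthologs first and fit a growth model, which this sketch does not attempt.

```python
def pan_and_core(genomes):
    """Return (pan-genome, core genome, accumulation curve).

    genomes: list of sets of gene-family names, one set per genome.
    The curve holds (pan size, core size) after each genome is added.
    """
    pan, core = set(), None
    curve = []
    for genes in genomes:
        pan |= genes  # union: every gene family seen so far
        core = set(genes) if core is None else core & genes  # shared by all
        curve.append((len(pan), len(core)))
    return pan, core, curve


# Toy presence/absence data (assumed, not taken from any study).
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "e", "f"},
]
pan, core, curve = pan_and_core(toy_genomes)
accessory = pan - core  # families present in some but not all genomes
```

A pan-genome size that keeps growing while the core shrinks or plateaus, as in this toy curve, is the pattern usually described as open.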

RevDate: 2024-01-17

Silva-Pereira TT, Soler-Camargo NC, AMS Guimarães (2024)

Diversification of gene content in the Mycobacterium tuberculosis complex is determined by phylogenetic and ecological signatures.

Microbiology spectrum [Epub ahead of print].

In this study, we analyzed the gene content of different ecotypes of the Mycobacterium tuberculosis complex (MTBC), the pathogens of tuberculosis. We found that changes in their gene content are associated with their ecological features, such as host preference. Gene loss was identified as the primary driver of these changes, which can vary even among different strains of the same ecotype. Our study also revealed that the gene content relatedness of these bacteria does not always mirror their evolutionary relationships. In addition, some genes of virulence can be variably lost among strains of the same MTBC ecotype, likely helping them to evade the immune system. Overall, our study highlights the importance of understanding how gene loss can lead to new adaptations in these bacteria and how different selective pressures may influence their genetic makeup.

RevDate: 2024-01-17

Venkatachalam S, Jabir T, Vipindas PV, et al (2024)

Ecological significance of Candidatus ARS69 and Gemmatimonadota in the Arctic glacier foreland ecosystems.

Applied microbiology and biotechnology, 108(1):128.

The Gemmatimonadota phylum has been widely detected in diverse natural environments, yet their specific ecological roles in many habitats remain poorly investigated. Similarly, the Candidatus ARS69 phylum has been identified only in a few habitats, and literature on their metabolic functions is relatively scarce. In the present study, we investigated the ecological significance of phyla Ca. ARS69 and Gemmatimonadota in the Arctic glacier foreland (GF) ecosystems through genome-resolved metagenomics. We have reconstructed the first high-quality metagenome-assembled genome (MAG) belonging to Ca. ARS69 and 12 other MAGs belonging to phylum Gemmatimonadota from the three different Arctic GF samples. We further elucidated the phylogenetic lineages and metabolic functions of these two groups through phylogenomic and pangenomic analysis. The analysis showed that all the reconstructed MAGs potentially belonged to novel species. The MAGs belonging to Ca. ARS69 comprise about 8296 gene clusters, of which only about 8% of single-copy core genes (n = 980) were shared among them. The study also revealed that the potential ecological role of Ca. ARS69 is associated with carbon fixation, denitrification, sulfite oxidation, and reduction biochemical processes in the GF ecosystems. Similarly, the study demonstrates the widespread distribution of different classes of Gemmatimonadota across wide ranges of ecosystems and their metabolic functions, including in the polar region. KEY POINTS: • Glacier foreland ecosystems act as a natural laboratory to study microbial community structure. • We have reconstructed 13 metagenome-assembled genomes from the soil samples. • All the reconstructed MAGs belonged to novel species with different metabolic processes. • Ca. ARS69 and Gemmatimonadota MAGs were found to participate in carbon fixation and denitrification processes.

RevDate: 2024-01-15

Han DM, Baek JH, Choi DG, et al (2024)

Comparative pangenome analysis of Aspergillus flavus and Aspergillus oryzae reveals their phylogenetic, genomic, and metabolic homogeneity.

Food microbiology, 119:104435.

Aspergillus flavus and Aspergillus oryzae are closely related fungal species with contrasting roles in food safety and fermentation. To comprehensively investigate their phylogenetic, genomic, and metabolic characteristics, we conducted an extensive comparative pangenome analysis using complete, dereplicated genome sets for both species. Phylogenetic analyses, employing both the entirety of the identified single-copy orthologous genes and six housekeeping genes commonly used for fungal classification, did not reveal clear differentiation between A. flavus and A. oryzae genomes. Upon analyzing the aflatoxin biosynthesis gene clusters within the genomes, we observed that non-aflatoxin-producing strains were dispersed throughout the phylogenetic tree, encompassing both A. flavus and A. oryzae strains. This suggests that aflatoxin production is not a distinguishing trait between the two species. Furthermore, A. oryzae and A. flavus strains displayed remarkably similar genomic attributes, including genome sizes, gene contents, and G + C contents, as well as metabolic features and pathways. The profiles of CAZyme genes and secondary metabolite biosynthesis gene clusters within the genomes of both species further highlight their similarity. Collectively, these findings challenge the conventional differentiation of A. flavus and A. oryzae as distinct species and highlight their phylogenetic, genomic, and metabolic homogeneity, potentially indicating that they may indeed belong to the same species.

RevDate: 2024-01-15

Wendisch VF, Brito LF, LMP Passaglia (2024)

Genome-based analyses to learn from and about Paenibacillus sonchi genomovar Riograndensis SBR5T.

Genetics and molecular biology, 46(3 Suppl 1):e20230115 pii:S1415-47572023000600113.

Paenibacillus sonchi genomovar Riograndensis SBR5T is a plant growth-promoting rhizobacterium (PGPR) isolated in the Brazilian state of Rio Grande do Sul from the rhizosphere of Triticum aestivum. It fixes nitrogen, produces siderophores as well as the phytohormone indole-3-acetic acid, solubilizes phosphate and displays antagonist activity against Listeria monocytogenes and Pectobacterium carotovorum. Comprehensive omics analysis and the development of genetic tools are key to characterizing and engineering such non-model microorganisms. Therefore, the complete genome of SBR5T was sequenced, and shown to encode 6,705 proteins, 87 tRNAs, and 27 rRNAs and it enabled a landscape transcriptome analysis that unveiled conserved transcriptional and translational patterns and characterized operon structures and riboswitches. The pangenome of P. sonchi species is open with a stable core pangenome. At the same time, the analysis of genes coding for nitrogenases revealed that the trait of nitrogen fixation is sparse within the Paenibacillaceae family and the presence of Fe-only nitrogenase in the P. sonchi group was exclusive to SBR5T. The development of genetic tools for SBR5T enabled genetic transformation, plasmid construction for constitutive and inducible gene expression, and gene repression using the CRISPRi system. Altogether, the work with P. sonchi can guide the study of non-model bacteria with economic potential.

RevDate: 2024-01-13

Monterrubio-López GP, Llamas-Monroy JL, Martínez-Gómez ÁA, et al (2024)

Novel vaccine candidates of Bordetella pertussis identified by reverse vaccinology.

Biologicals : journal of the International Association of Biological Standardization, 85:101740 pii:S1045-1056(23)00079-9 [Epub ahead of print].

Whooping cough is a disease caused by Bordetella pertussis, whose morbidity has increased, motivating the improvement of current vaccines. Reverse vaccinology is a strategy that helps identify proteins with good characteristics quickly and with fewer resources. In this work, we applied reverse vaccinology to study the B. pertussis proteome and pangenome with several in-silico tools. We analyzed the B. pertussis Tohama I proteome with NERVE software and compared 234 proteins with B. parapertussis, B. bronchiseptica, and B. holmesii. VaxiJen was used to calculate an antigenicity value; our threshold was 0.6, selecting 84 proteins. The candidates were filtered and grouped into eight protein families to select representative candidates, according to bibliographic information and their immunological response predicted with ABCpred, Bcepred, IgPred, and C-ImmSim. Additionally, a pangenome study was conducted with 603 B. pertussis strains and PanRV software, identifying 3421 core proteins that were analyzed to select the best candidates. Finally, we selected 15 proteins from the proteome study and seven proteins from the pangenome analysis as good vaccine candidates.

RevDate: 2024-01-12

Yang Z, Yang X, Wang M, et al (2024)

Genome-wide association study reveals serovar-associated genetic loci in Riemerella anatipestifer.

BMC genomics, 25(1):57.

BACKGROUND: The disease caused by Riemerella anatipestifer (R. anatipestifer, RA) results in large economic losses to the global duck industry every year. Serovar-related genomic variation, such as the O-antigen and capsular polysaccharide (CPS) gene clusters, has been widely used for serotyping in many gram-negative bacteria. RA has been classified into at least 21 serovars based on slide agglutination, but the molecular basis of serotyping is unknown. In this study, we performed a pan-genome-wide association study (Pan-GWAS) to identify the genetic loci associated with RA serovars.

RESULTS: The results revealed a significant association between the putative CPS synthesis gene locus and the serological phenotype. Further characterization of the CPS gene clusters in 11 representative serovar strains indicated that they were highly diverse and serovar-specific. The CPS gene cluster contained the key genes wzx and wzy, which are involved in the Wzx/Wzy-dependent pathway of CPS synthesis. Similar CPS loci have been found in some other species within the family Weeksellaceae. We have also shown that deletion of the wzy gene in RA results in capsular defects and cross-agglutination.

CONCLUSIONS: This study indicates that the CPS synthesis gene cluster of R. anatipestifer is a serotype-specific genetic locus. Importantly, our finding provides a new perspective for the systematic analysis of the genetic basis of the R. anatipestifer serovars and a potential target for establishing a complete molecular serotyping scheme.

RevDate: 2024-01-12

Schreiber M, Wonneberger R, Haaning AM, et al (2024)

Genomic resources for a historical collection of cultivated two-row European spring barley genotypes.

Scientific data, 11(1):66.

Barley genomic resources are increasing rapidly, with the publication of a barley pangenome as one of the latest developments. Two-row spring barley cultivars are intensely studied as they are the source of high-quality grain for malting and distilling. Here we provide data from a European two-row spring barley population containing 209 different genotypes registered for the UK market between 1830 and 2014. The dataset encompasses RNA-sequencing data from six different tissues across a range of barley developmental stages; phenotypic datasets from two consecutive years of field-grown trials in the United Kingdom, Germany and the USA; and whole genome shotgun sequencing from all cultivars, which was used to complement the RNA-sequencing data for variant calling. The outcomes are a filtered SNP marker file, a phenotypic database and a large gene expression dataset providing a comprehensive resource which allows for downstream analyses like genome wide association studies or expression associations.

RevDate: 2024-01-12

Park S, Kim I, Chhetri G, et al (2024)

Cellulomonas alba sp. nov. and Cellulomonas edaphi sp. nov., isolated from wetland soils.

International journal of systematic and evolutionary microbiology, 74(1):.

Two novel strains were isolated from wetland soils in Goyang, Republic of Korea. The two Gram-stain-positive, facultatively anaerobic, rod-shaped bacterial-type strains were designated MW4[T] and MW9[T]. Phylogenomic analysis based on whole-genome sequences suggested that both strains belonged to the genus Cellulomonas. The cells of strain MW4[T] were non-motile and grew at 20-40 °C (optimum, 35 °C), at pH 6.0-10.0 (optimum, pH 8.0) and in the presence of 0-1.0 % NaCl (optimum, 0 %). The cells of strain MW9[T] were non-motile and grew at 20-40 °C (optimum, 35 °C), at pH 5.0-9.0 (optimum, pH 8.0) and in the presence of 0-1.0 % NaCl (optimum, 0 %). The average nucleotide identity (77.1-88.1 %) and digital DNA-DNA hybridization values (21.0-34.8 %) between the two novel strains and their closely related strains fell within the range for the genus Cellulomonas. The novel strains MW4[T] and MW9[T] and reference strains possessed alkane synthesis gene clusters (oleA, oleB, oleC and oleD). Phylogenomic, phylogenetic, average nucleotide identity, digital DNA-DNA hybridization, physiological and biochemical data indicated that the novel strains were distinct from other members of the family Cellulomonadaceae. We propose the names Cellulomonas alba sp. nov. (type strain MW4[T]=KACC 23260[T]=TBRC 17645[T]) and Cellulomonas edaphi sp. nov. (type strain MW9[T]=KACC 23261[T]=TBRC 17646[T]) for the two strains.

RevDate: 2024-01-12

Ferrero-Serrano Á, Chakravorty D, Kirven KJ, et al (2024)

Oryza CLIMtools: A Genome-Environment Association Resource Reveals Adaptive Roles for Heterotrimeric G Proteins in the Regulation of Rice Agronomic Traits.

Plant communications pii:S2590-3462(24)00033-6 [Epub ahead of print].

Modern crop varieties display a degree of mismatch between their current distributions and the suitability of the local climate for their productivity. To this end, we present Oryza CLIMtools (https://gramene.org/CLIMtools/oryza_v1.0/), the first resource for pan-genome prediction of climate-associated genetic variants in a crop species. Oryza CLIMtools consists of interactive web-based databases that allow the user to: i) explore the local environments of traditional rice varieties (landraces) in South-Eastern Asia; and ii) investigate the environment by genome associations for 658 Indica and 283 Japonica rice landrace accessions collected from georeferenced local environments and included in the 3K Rice Genomes Project. We exemplify the value of these resources, identifying an interplay between flowering time and temperature in the local environment that is facilitated by adaptive natural variation in OsHD2 and disrupted by a natural variant in OsSOC1. Prior QTL analysis has suggested the importance of heterotrimeric G proteins in the control of agronomic traits. Accordingly, we analyzed the climate associations of natural variants in the different heterotrimeric G protein subunits. We identified a coordinated role of G proteins in adaptation to the prevailing Potential Evapotranspiration gradient and their regulation of key agronomic traits including plant height and seed and panicle length. We conclude by highlighting the prospect of targeting heterotrimeric G proteins to produce crops that are climate resilient.

RevDate: 2024-01-11

Bin Hafeez A, Pełka K, Worobo R, et al (2024)

In Silico Safety Assessment of Bacillus Isolated from Polish Bee Pollen and Bee Bread as Novel Probiotic Candidates.

International journal of molecular sciences, 25(1): pii:ijms25010666.

Bacillus species isolated from Polish bee pollen (BP) and bee bread (BB) were characterized for in silico probiotic and safety attributes. A probiogenomics approach was used, and in-depth genomic analysis was performed using a wide array of bioinformatics tools to investigate the presence of virulence and antibiotic resistance properties, mobile genetic elements, and secondary metabolites. Functional annotation and Carbohydrate-Active enZYmes (CAZYme) profiling revealed the presence of genes and a repertoire of enzymes that promote probiotic properties. Genome mining of the isolates BB10.1, BP20.15 (isolated from bee bread), and PY2.3 (isolated from bee pollen) revealed the presence of several genes encoding acid, heat, cold, and other stress tolerance mechanisms, adhesion proteins required to survive and colonize harsh gastrointestinal environments, enzymes involved in the metabolism of dietary molecules, antioxidant activity, and genes associated with the synthesis of vitamins. In addition, genes responsible for the production of biogenic amines (BAs) and D-/L-lactate, hemolytic activity, and other toxic compounds were also analyzed. Pan-genome analyses were performed with 180 Bacillus subtilis and 204 Bacillus velezensis genomes to mine for any novel genes present in the genomes of our isolates. Moreover, all three isolates also contained gene clusters encoding secondary metabolites.

RevDate: 2024-01-11

Liu K, Xu H, Gao X, et al (2023)

Pan-Genome Analysis of TIFY Gene Family and Functional Analysis of CsTIFY Genes in Cucumber.

International journal of molecular sciences, 25(1): pii:ijms25010185.

Cucumbers are frequently affected by gray mold pathogen Botrytis cinerea, a pathogen that causes inhibited growth and reduced yield. Jasmonic acid (JA) plays a primary role in plant responses to biotic stresses, and the jasmonate-ZIM-Domain (JAZ) proteins are key regulators of the JA signaling pathway. In this study, we used the pan-genome of twelve cucumber varieties to identify cucumber TIFY genes. Our findings revealed that two CsTIFY genes were present in all twelve cucumber varieties and showed no differences in protein sequence, gene structure, and motif composition. This suggests their evolutionary conservation across different cucumber varieties and implies that they may play a crucial role in cucumber growth. On the other hand, the other fourteen CsTIFY genes exhibited variations in protein sequence and gene structure or conserved motifs, which could be the result of divergent evolution, as these genes adapt to different cultivation and environmental conditions. Analysis of the expression profiles of the CsTIFY genes showed differential regulation by B. cinerea. Transient transfection plants overexpressing CsJAZ2, CsJAZ6, or CsZML2 were found to be more susceptible to B. cinerea infection compared to control plants. Furthermore, these plants infected by the pathogen showed lower levels of the enzymatic activities of POD, SOD and CAT. Importantly, after B. cinerea infection, the content of JA was upregulated in the plants, and cucumber cotyledons pretreated with exogenous MeJA displayed increased resistance to B. cinerea infection compared to those pretreated with water. Therefore, this study explored key TIFY genes in the regulation of cucumber growth and adaptability to different cultivation environments based on bioinformatics analysis and demonstrated that CsJAZs negatively regulate cucumber disease resistance to gray mold via multiple signaling pathways.

RevDate: 2024-01-10

Sosinsky A, Ambrose J, Cross W, et al (2024)

Insights for precision oncology from the integration of genomic and clinical data of 13,880 tumors from the 100,000 Genomes Cancer Programme.

Nature medicine [Epub ahead of print].

The Cancer Programme of the 100,000 Genomes Project was an initiative to provide whole-genome sequencing (WGS) for patients with cancer, evaluating opportunities for precision cancer care within the UK National Healthcare System (NHS). Genomics England, alongside NHS England, analyzed WGS data from 13,880 solid tumors spanning 33 cancer types, integrating genomic data with real-world treatment and outcome data, within a secure Research Environment. Incidence of somatic mutations in genes recommended for standard-of-care testing varied across cancer types. For instance, in glioblastoma multiforme, small variants were present in 94% of cases and copy number aberrations in at least one gene in 58% of cases, while sarcoma demonstrated the highest occurrence of actionable structural variants (13%). Homologous recombination deficiency was identified in 40% of high-grade serous ovarian cancer cases with 30% linked to pathogenic germline variants, highlighting the value of combined somatic and germline analysis. The linkage of WGS and longitudinal life course clinical data allowed the assessment of treatment outcomes for patients stratified according to pangenomic markers. Our findings demonstrate the utility of linking genomic and real-world clinical data to enable survival analysis to identify cancer genes that affect prognosis and advance our understanding of how cancer genomics impacts patient outcomes.

RevDate: 2024-01-08

Zhang RY, Wang YR, Liu RL, et al (2024)

Metagenomic characterization of a novel non-ammonia-oxidizing Thaumarchaeota from hadal sediment.

Microbiome, 12(1):7.

BACKGROUND: The hadal sediment, found at an ocean depth of more than 6000 m, is geographically isolated and under extremely high hydrostatic pressure, resulting in a unique ecosystem. Thaumarchaeota are ubiquitous marine microorganisms predominantly present in hadal environments. While there have been several studies on Thaumarchaeota there, most of them have primarily focused on ammonia-oxidizing archaea (AOA). However, systematic metagenomic research specifically targeting heterotrophic non-AOA Thaumarchaeota is lacking.

RESULTS: In this study, we explored the metagenomes of Challenger Deep hadal sediment, focusing on the Thaumarchaeota. Functional analysis of sequence reads revealed the potential contribution of Thaumarchaeota to recalcitrant dissolved organic matter degradation. Metagenome assembly binned one new group of hadal sediment-specific and ubiquitously distributed non-AOA Thaumarchaeota, named Group-3.unk. Pathway reconstruction of this new type of Thaumarchaeota also supports heterotrophic characteristics of Group-3.unk, along with ABC transporters for the uptake of amino acids and carbohydrates and catabolic utilization of these substrates. This new clade of Thaumarchaeota also contains aerobic oxidation of carbon monoxide-related genes. Complete glyoxylate cycle is a distinctive feature of this clade in supplying intermediates of anabolic pathways. The pan-genomic and metabolic analyses of metagenome-assembled genomes belonging to Group-3.unk Thaumarchaeota have highlighted distinctions, including the dihydroxy phthalate decarboxylase gene associated with the degradation of aromatic compounds and the absence of genes related to the synthesis of some types of vitamins compared to AOA. Notably, Group-3.unk shares a common feature with deep ocean AOA, characterized by their high hydrostatic pressure resistance, potentially associated with the presence of V-type ATP and di-myo-inositol phosphate syntheses-related genes. The enrichment of organic matter in hadal sediments might be attributed to the high recruitment of sequence reads of the Group-3.unk clade of heterotrophic Thaumarchaeota in the trench sediment. Evolutionary and genetic dynamic analyses suggest that Group-3 non-AOA consists of mesophilic Thaumarchaeota organisms. These results indicate a potential role in the transition from non-AOA to AOA Thaumarchaeota and from thermophilic to mesophilic Thaumarchaeota, shedding light on recent evolutionary pathways.

CONCLUSIONS: One novel clade of heterotrophic non-AOA Thaumarchaeota was identified through metagenome analysis of sediments from Challenger Deep. Our study provides insight into the ecology and genomic characteristics of the new sub-group of heterotrophic non-AOA Thaumarchaeota, thereby extending the knowledge of the evolution of Thaumarchaeota.

RevDate: 2024-01-08

Biderre-Petit C, Courtine D, Hennequin C, et al (2024)

A pan-genomic approach reveals novel Sulfurimonas clade in the ferruginous meromictic Lake Pavin.

Molecular ecology resources [Epub ahead of print].

The permanently anoxic waters in meromictic lakes create suitable niches for the growth of bacteria using sulphur metabolisms like sulphur oxidation. In Lake Pavin, the anoxic water mass hosts an active cryptic sulphur cycle that interacts closely with iron cycling; however, the metabolisms of the microorganisms involved are poorly known. Here we combined metagenomics, single-cell genomics, and pan-genomics to further expand our understanding of the bacteria and the corresponding metabolisms involved in sulphur oxidation in this ferruginous sulphide- and sulphate-poor meromictic lake. We highlighted two new species within the genus Sulfurimonas that belong to a novel clade of chemotrophic sulphur oxidisers exclusive to freshwaters. We moreover conclude that this genus holds a key role not only in limiting sulphide accumulation in the upper part of the anoxic layer but also in constraining carbon, phosphate and iron cycling.

RevDate: 2024-01-08

Karthik K, Subramanian S, Vinoli Priyadharshini M, et al (2023)

Whole genome sequencing and comparative genomics of Mycobacterium orygis isolated from different animal hosts to identify specific diagnostic markers.

Frontiers in cellular and infection microbiology, 13:1302393.

INTRODUCTION: Mycobacterium orygis, a member of the Mycobacterium tuberculosis complex (MTBC), has been identified in increasing numbers in recent years in animals of South Asia. Comparative genomics of this important zoonotic pathogen, which could provide data on the molecular differences from other MTBC members, is not available. Hence, the present study was carried out to isolate and whole-genome sequence M. orygis from different animal species (cattle, buffalo and deer) and to identify molecular markers for the differentiation of M. orygis from other MTBC members.

METHODS: Isolation and whole genome sequencing of M. orygis was carried out for samples from 9 animals (4 cattle, 4 deer and 1 buffalo) that died of tuberculosis. Comparative genomics employing 53 genomes (44 from database and 9 newly sequenced) was performed to identify SNPs, spoligotype, pangenome structure, and region of difference.

RESULTS: M. orygis was isolated from water buffalo and sambar deer, which is the first report of its kind worldwide. Comparative pangenomics of all M. orygis strains worldwide (n = 53) showed a closed pangenome structure, which is also reported for the first time. The pairwise SNP distance between TANUVAS_2, TANUVAS_4, TANUVAS_5, TANUVAS_7 and NIRTAH144 was less than 15, indicating that the same M. orygis strain may be the cause of infection. Region of difference prediction showed absence of RD7, RD8, RD9, RD10, RD12, RD301 and RD315 in all the M. orygis genomes analyzed. SNPs in the virulence gene PE35 were found to be unique to M. orygis and can be used as a marker for identification.

CONCLUSION: The present study provides further supporting evidence that M. orygis is more prevalent among animals in South Asia, and the zoonotic potential of this organism needs to be evaluated.

RevDate: 2024-01-08

Oles RE, Terrazas MC, Loomis LR, et al (2023)

Pangenome comparison of Bacteroides fragilis genomospecies unveil genetic diversity and ecological insights.

bioRxiv : the preprint server for biology pii:2023.12.20.572674.

Bacteroides fragilis is a Gram-negative commensal bacterium commonly found in the human colon that differentiates into two genomospecies termed division I and II. We leverage a comprehensive collection of 694 B. fragilis whole genome sequences and report differential gene abundance to further support the recent proposal that divisions I and II represent separate species. In division I strains, we identify an increased abundance of genes related to complex carbohydrate degradation, colonization, and host niche occupancy, confirming the role of division I strains as gut commensals. In contrast, division II strains display an increased prevalence of plant cell wall degradation genes and exhibit a distinct geographic distribution, primarily originating from Asian countries, suggesting dietary influences. Notably, division II strains have an increased abundance of genes linked to virulence, survival in toxic conditions, and antimicrobial resistance, consistent with a higher incidence of these strains in bloodstream infections. This study provides new evidence supporting a recent proposal for classifying divisions I and II B. fragilis strains as distinct species, and our comparative genomic analysis reveals their niche-specific roles.

RevDate: 2024-01-06

Yu K, Huang Z, Xiao Y, et al (2023)

Global spread characteristics of CTX-M-type extended-spectrum β-lactamases: A genomic epidemiology analysis.

Drug resistance updates : reviews and commentaries in antimicrobial and anticancer chemotherapy, 73:101036 pii:S1368-7646(23)00119-X [Epub ahead of print].

BACKGROUND: Extended-spectrum β-lactamases (ESBLs) producing bacteria have spread worldwide and become a global public health concern. Plasmid-mediated transfer of ESBLs is an important route for resistance acquisition.

METHODS: We collected 1345 complete sequences of plasmids containing CTX-Ms from public databases. The global transmission pattern of plasmids and evolutionary dynamics of CTX-Ms were inferred. We applied pan-genome clustering based on plasmid genomes and evolution analysis to demonstrate the transmission events.

FINDINGS: In total, 48 CTX-M genotypes and 186 plasmid incompatibility types were identified. The geographical distribution of CTX-Ms showed significant differences across countries and continents. CTX-M-14 and CTX-M-55 were found to be the dominant genotypes in Asia, while CTX-M-1 played a leading role in Europe. The plasmids can be divided into 12 lineages, some of which form distinct geographical clusters in Asia and Europe, while others form hybrid populations. The Inc types of plasmids are lineage-specific, with the CTX-M-1_IncI1-I (Alpha) and CTX-M-65_IncFII (pHN7A8)/R being the dominant patterns of cross-host and cross-regional transmission. The IncI-I (Alpha) plasmids, the most numerous, were presumed to form communication groups in Europe-Asia and Asia-America-Oceania, showing a transmission model of global dissemination and regional microevolution. Meanwhile, the main mobile elements of blaCTX-Ms showed genotypic preferences. ISEcp1 and IS26 were most frequently involved in the transfer of CTX-M-14 and CTX-M-65, respectively. IS15 has become a crucial participant in mediating the dissemination of blaCTX-Ms. Interestingly, blaTEM and blaCTX-Ms often coexisted in the same transposable unit. Furthermore, antibiotic resistance genes associated with aminoglycosides, sulfonamides and cephalosporins showed a relatively high frequency of synergistic effects with CTX-Ms.

CONCLUSIONS: We identified the dominant blaCTX-Ms and mainstream plasmids of different continents. The results of this study provide support for a more effective response to the risks associated with the evolution of blaCTX-Ms-bearing plasmids, and lay the foundation for genotype-specific epidemiological surveillance of resistance, both of which have important public health implications.

RevDate: 2024-01-05

Verma N, Sharma T, Bhardwaj A, et al (2024)

Comparative genomics and characterization of a multidrug-resistant Acinetobacter baumannii VRL-M19 isolated from a crowded setting in India.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(23)00147-8 [Epub ahead of print].

A crowded vegetable market serves as a mass gathering, posing a potential risk for infection transmission. In this study, we isolated a multidrug-resistant Acinetobacter baumannii strain, VRL-M19, from the air of such a market and conducted comparative genomics and phenotypic characterization. Antimicrobial susceptibility testing, genome sequencing using Illumina HiSeq X10, and pan-genome analysis with 788 clinical isolates identified core, accessory, and unique drug-resistant determinants. Mutational analysis of drug-resistance genes, virulence factor annotation, in vitro pathogenicity assessment, subsystem analysis, Multilocus sequence typing, and whole genome phylogenetic analysis were performed. VRL-M19 exhibited multidrug resistance with 69 determinants, and analysis across 788 clinical isolates and 350 Indian isolates revealed more accessory genes (52 out of 69) in the Indian isolates. Multiple mutations were observed in drug target modification genes, and the strain was identified as a moderate biofilm-former with 55 virulence factors. Whole genome phylogenetics indicated a close relationship between VRL-M19 and clinical A. baumannii strains. In conclusion, our comprehensive study suggests that VRL-M19 is a multidrug-resistant, potential pathogen with biofilm-forming capabilities, closely associated with clinical A. baumannii strains.

RevDate: 2024-01-04

Domingo-Sananes MR, CJ Meehan (2024)

The population genetics of prokaryotic pangenomes.

Nature ecology & evolution [Epub ahead of print].

RevDate: 2024-01-04

Douglas GM, BJ Shapiro (2024)

Pseudogenes act as a neutral reference for detecting selection in prokaryotic pangenomes.

Nature ecology & evolution [Epub ahead of print].

A long-standing question is to what degree genetic drift and selection drive the divergence in rare accessory gene content between closely related bacteria. Rare genes, including singletons, make up a large proportion of pangenomes (all genes in a set of genomes), but it remains unclear how many such genes are adaptive, deleterious or neutral to their host genome. Estimates of species' effective population sizes (Ne) are positively associated with pangenome size and fluidity, which has independently been interpreted as evidence for both neutral and adaptive pangenome models. We hypothesized that pseudogenes, used as a neutral reference, could be used to distinguish these models. We find that most functional categories are depleted for rare pseudogenes when a genome encodes only a single intact copy of a gene family. In contrast, transposons are enriched in pseudogenes, suggesting they are mostly neutral or deleterious to the host genome. Thus, even if individual rare accessory genes vary in their effects on host fitness, we can confidently reject a model of entirely neutral or deleterious rare genes. We also define the ratio of singleton intact genes to singleton pseudogenes (si/sp) within a pangenome, compare this measure across 668 prokaryotic species and detect a signal consistent with the adaptive value of many rare accessory genes. Taken together, our work demonstrates that comparing with pseudogenes can improve inferences of the evolutionary forces driving pangenome variation.
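The si/sp measure defined in the abstract above (singleton intact genes divided by singleton pseudogenes within a pangenome) can be sketched as a simple count ratio. This is an illustration under stated assumptions, not the authors' pipeline: the record layout, the function name `si_sp_ratio`, and the definition of a singleton as a gene family found in exactly one genome are hypothetical choices, and any normalisation the paper applies is not described in the abstract.

```python
def si_sp_ratio(records):
    """Ratio of singleton intact genes to singleton pseudogenes.

    `records` is an iterable of (family_id, kind, n_genomes) tuples, where
    `kind` is "intact" or "pseudogene" and `n_genomes` is the number of
    genomes in which the family occurs. Returns None when there are no
    singleton pseudogenes, since the ratio is undefined in that case.
    """
    si = 0  # singleton intact genes
    sp = 0  # singleton pseudogenes
    for _family_id, kind, n_genomes in records:
        if n_genomes != 1:
            continue  # assumption: only families seen in exactly one genome count
        if kind == "intact":
            si += 1
        elif kind == "pseudogene":
            sp += 1
    return si / sp if sp else None


# Hypothetical toy pangenome, for illustration only.
toy_records = [
    ("fam1", "intact", 1),
    ("fam2", "intact", 1),
    ("fam3", "intact", 1),
    ("fam4", "pseudogene", 1),
    ("fam5", "pseudogene", 1),
    ("fam6", "intact", 12),      # not a singleton, ignored
    ("fam7", "pseudogene", 4),   # not a singleton, ignored
]
ratio = si_sp_ratio(toy_records)
print(ratio)
```

Following the abstract's reasoning, pseudogenes serve as the neutral reference, so a higher ratio in one species than another would be read as a relative excess of intact rare genes.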

RevDate: 2024-01-04

Sarr M, Alou MT, Padane A, et al (2023)

A review of the literature of Listeria monocytogenes in Africa highlights breast milk as an overlooked human source.

Frontiers in microbiology, 14:1213953.

According to the latest WHO estimates (2015) of the global burden of foodborne diseases, Listeria monocytogenes is responsible for one of the most serious foodborne infections and commonly results in severe clinical outcomes. The 2013 French MONALISA prospective cohort identified that women born in Africa have a 3-fold increase in the risk of maternal neonatal listeriosis. One of the largest L. monocytogenes outbreaks occurred in South Africa in 2017-2018 with over 1,000 cases. Moreover, recent findings identified L. monocytogenes in human breast milk in Mali and Senegal with its relative abundance positively correlated with severe acute malnutrition. These observations suggest that the carriage of L. monocytogenes in Africa should be further explored, starting with the existing literature. For that purpose, we searched the peer-reviewed and grey literature published from 1926 to date using six databases. Ultimately, 225 articles were included in this review. We highlighted that L. monocytogenes is detected in various sample types including environmental samples, food samples as well as animal and human samples. These studies were mostly conducted in five east African countries, four west African countries, four north African countries, and two Southern African countries. Moreover, only ≈ 0.2% of the Listeria monocytogenes genomes available on NCBI were obtained from African samples, which contrasts with its detection. The pangenome resulting from the African Listeria monocytogenes samples revealed three clusters including two from South-African strains as well as one consisting of the strains isolated from breast milk in Mali and Senegal and a vaginal post-miscarriage sample. This suggests there was a clonal complex circulating in Mali and Senegal. As this clone has not been associated with infections, further studies should be conducted to confirm its circulation in the region and explore its association with foodborne infections.
Moreover, it is apparent that more resources should be allocated to the detection of L. monocytogenes as only 15/54 countries have reported its detection in the literature. It seems paramount to map the presence and carriage of L. monocytogenes in all African countries to prevent listeriosis outbreaks and the related miscarriages and confirm its association with severe acute malnutrition.

RevDate: 2024-01-03

Choi DG, Baek JH, Han DM, et al (2024)

Comparative pangenome analysis of Enterococcus faecium and Enterococcus lactis provides new insights into the adaptive evolution by horizontal gene acquisitions.

BMC genomics, 25(1):28.

BACKGROUND: Enterococcus faecium and E. lactis are phylogenetically closely related lactic acid bacteria that are ubiquitous in nature and are known to be beneficial or pathogenic. Despite their considerable industrial and clinical importance, comprehensive studies on their evolutionary relationships and genomic, metabolic, and pathogenic traits are still lacking. Therefore, we conducted comparative pangenome analyses using all available dereplicated genomes of these species.

RESULTS: E. faecium was divided into two subclades: subclade I, comprising strains derived from humans, animals, and food, and the more recent phylogenetic subclade II, consisting exclusively of human-derived strains. In contrast, E. lactis strains, isolated from diverse sources including foods, humans, animals, and the environment, did not display distinct clustering based on their isolation sources. Despite having similar metabolic features, noticeable genomic differences were observed between E. faecium subclades I and II, as well as E. lactis. Notably, E. faecium subclade II strains exhibited significantly larger genome sizes and higher gene counts compared to both E. faecium subclade I and E. lactis strains. Furthermore, they carried a higher abundance of antibiotic resistance, virulence, bacteriocin, and mobile element genes. Phylogenetic analysis of antibiotic resistance and virulence genes suggests that E. faecium subclade II strains likely acquired these genes through horizontal gene transfer, facilitating their effective adaptation in response to antibiotic use in humans.

CONCLUSIONS: Our study offers valuable insights into the adaptive evolution of E. faecium strains, enabling their survival as pathogens in the human environment through horizontal gene acquisitions.

RevDate: 2024-01-03

Lin J, Xiao Y, Liu H, et al (2024)

Combined transcriptomic and pangenomic analyses guide metabolic amelioration to enhance tiancimycins production.

Applied microbiology and biotechnology, 108(1):1-11.

Exploring high-yield mechanisms is important for further titer improvement of valuable antibiotics, but achieving this goal is challenging. Tiancimycins (TNMs) are anthraquinone-fused enediynes with promising drug development potential, but their prospective applications are limited by low titers. This work aimed to explore the intrinsic high-yield mechanism in the previously obtained TNMs high-producing strain Streptomyces sp. CB03234-S for the further titer amelioration of TNMs. First, the typical ribosomal RpsL(K43N) mutation in CB03234-S was validated to be responsible only for the streptomycin resistance and not for the titer improvement of TNMs. Subsequently, the combined transcriptomic, pan-genomic and KEGG analyses revealed that the significant changes in the carbon and amino acid metabolisms could reinforce the metabolic fluxes of key CoA precursors, and thus prompted the overproduction of TNMs in CB03234-S. Moreover, fatty acid metabolism was considered to exert adverse effects on the biosynthesis of TNMs by shunting and reducing the accumulation of CoA precursors. Therefore, different combinations of relevant genes were respectively overexpressed in CB03234-S to strengthen fatty acid degradation. The resulting mutants all showed enhanced production of TNMs. Among them, the overexpression of fadD, a key gene responsible for the first step of fatty acid degradation, achieved the highest titer, 21.7 ± 1.1 mg/L TNMs, a 63.2% improvement. Our studies suggested that comprehensive bioinformatic analyses are effective for exploring metabolic changes and guiding rational metabolic reconstitution for further titer improvement of target products. KEY POINTS: • Comprehensive bioinformatic analyses effectively reveal primary metabolic changes. • Primary metabolic changes cause precursor enrichment to enhance TNMs production. • Strengthening of fatty acid degradation further improves the titer of TNMs.

RevDate: 2024-01-03

Triesch S, Denton AK, Bouvier JW, et al (2024)

Transposable elements contribute to the establishment of the glycine shuttle in Brassicaceae species.

Plant biology (Stuttgart, Germany) [Epub ahead of print].

C3-C4 intermediate photosynthesis has evolved at least five times convergently in the Brassicaceae, despite this family lacking bona fide C4 species. The establishment of this carbon concentrating mechanism is known to require a complex suite of ultrastructural modifications, as well as changes in spatial expression patterns, which are both thought to be underpinned by a reconfiguration of existing gene-regulatory networks. However, to date, the mechanisms which underpin the reconfiguration of these gene networks are largely unknown. In this study, we used a pan-genomic association approach to identify genomic features that could confer differential gene expression towards the C3-C4 intermediate state by analysing eight C3 species and seven C3-C4 species from five independent origins in the Brassicaceae. We found a strong correlation between transposable element (TE) insertions in cis-regulatory regions and C3-C4 intermediacy. Specifically, our study revealed 113 gene models in which the presence of a TE within a gene correlates with C3-C4 intermediate photosynthesis. In this set, genes involved in the photorespiratory glycine shuttle are enriched, including the glycine decarboxylase P-protein whose expression domain undergoes a spatial shift during the transition to C3-C4 photosynthesis. When further interrogating this gene, we discovered independent TE insertions in its upstream region which we conclude to be responsible for causing the spatial shift in GLDP1 gene expression. Our findings hint at a pivotal role of TEs in the evolution of C3-C4 intermediacy, especially in mediating differential spatial gene expression.

RevDate: 2024-01-03

Guo N, Wang S, Wang T, et al (2024)

Graph-based Pan-genome of Brassica oleracea Provides New Insights into Its Domestication and Morphotype Diversification.

Plant communications pii:S2590-3462(23)00349-8 [Epub ahead of print].

The domestication of Brassica oleracea has resulted in diverse morphological types with distinct patterns of organ development. Here we report a graph-based pan-genome of B. oleracea constructed with high-quality genome assemblies of different morphotypes. The pan-genome harbors over 200 structural variant (SV) hotspot regions enriched with auxin and flowering-related genes. Population genomic analyses reveal that early domestication of B. oleracea focused on leaf or stem development. Gene flows resulting from agricultural practices and variety improvement are detected among different morphotypes. Selective sweep and pan-genome analyses identify an auxin-responsive SAUR gene and a CLE family gene as crucial players in the leaf-stem differentiation during the early stage of B. oleracea domestication, and the BoKAN1 gene as instrumental in shaping the leafy heads of cabbage and Brussels sprouts. Our pan-genome and functional analyses further discover that variations in the BoFLC2 gene play key roles in the divergence of vernalization and flowering characteristics among different morphotypes, and variations in the first intron of BoFLC3 are involved in fine-tuning the flowering process in cauliflower. This study provides a comprehensive understanding of the pan-genome of B. oleracea and sheds light on the domestication and differential organ development of this globally important crop species.

RevDate: 2024-01-03

Sirén J, Eskandar P, Ungaro MT, et al (2023)

Personalized Pangenome References.

bioRxiv : the preprint server for biology pii:2023.12.13.571553.

Pangenomes, by including genetic diversity, should reduce reference bias by better representing new samples compared to them. Yet when comparing a new sample to a pangenome, variants in the pangenome that are not part of the sample can be misleading, for example, causing false read mappings. These irrelevant variants are generally rarer in terms of allele frequency, and have previously been dealt with using allele frequency filters. However, this is a blunt heuristic that both fails to remove some irrelevant variants and removes many relevant variants. We propose a new approach, inspired by local ancestry inference methods, that imputes a personalized pangenome subgraph based on sampling local haplotypes according to k-mer counts in the reads. Our approach is tailored for the Giraffe short read aligner, as the indexes it needs for read mapping can be built quickly. We compare the accuracy of our approach to state-of-the-art methods using graphs from the Human Pangenome Reference Consortium. The resulting personalized pangenome pipelines provide faster pangenome read mapping than comparable pipelines that use a linear reference, reduce small variant genotyping errors by 4x relative to the Genome Analysis Toolkit (GATK) best-practice pipeline, and for the first time make short-read structural variant genotyping competitive with long-read discovery methods.

RevDate: 2024-01-03

Qiu X, McGee L, Hammitt LL, et al (2023)

Prediction of post-PCV13 pneumococcal evolution using invasive disease data enhanced by inverse-invasiveness weighting.

medRxiv : the preprint server for health sciences pii:2023.12.10.23299786.

BACKGROUND: After introduction of pneumococcal conjugate vaccines (PCVs), serotype replacement occurred in the population of Streptococcus pneumoniae. Predicting which pneumococcal clones and serotypes will become more common in carriage after vaccination can enhance vaccine design and public health interventions, while also improving our understanding of pneumococcal evolution. We sought to use invasive disease data to assess how well negative frequency-dependent selection (NFDS) models could explain pneumococcal carriage population evolution in the post-PCV13 epoch by weighting invasive data to approximate strain proportions in the carriage population.

METHODS: Invasive pneumococcal isolates were collected and sequenced during 1998-2018 by the Active Bacterial Core surveillance (ABCs) from the Centers for Disease Control and Prevention (CDC). To predict the post-PCV13 population dynamics in the carriage population using a NFDS model, all genomic data were processed under a bioinformatic pipeline of assembly, annotation, and pangenome analysis to define genetically similar sequence clusters (i.e., strains) and a set of accessory genes present in 5% to 95% of the isolates. The NFDS model predicted the strain proportion by calculating the post-vaccine strain composition in the weighted invasive disease population that would best match pre-vaccine accessory gene frequencies. To overcome the biases of invasive disease data, serotype-specific inverse-invasiveness weights were defined as the ratio of the proportion of the serotype in the carriage data to the proportion in the invasive data, using data from 1998-2001 in the United States, before conjugate vaccine introduction. The weights were applied to adjust both the observed strain proportion and the accessory gene frequencies.

RESULTS: Inverse-invasiveness weighting increased the correlation of accessory gene frequencies between invasive and carriage data with reduced residuals in linear or logit scale for pre-vaccine, post-PCV7, and post-PCV13. Similarly, weighting increased the correlation of accessory gene frequencies between different time periods in the invasive data. By weighting the invasive data, we were able to use the NFDS model to predict strain proportions in the carriage population in the post-PCV13 epoch, with the adjusted R-squared between predicted and observed strain proportions increasing from 0.176 to 0.544 after weighting.

CONCLUSIONS: The weighting system adjusted the invasive disease surveillance data to better represent the carriage population of S. pneumoniae. The NFDS mechanism predicted the strain proportions in the projected carriage population as estimated from the weighted invasive disease frequencies in the post-PCV13 epoch. Our methods enrich the value of genomic sequences from invasive disease surveillance, which is readily available, easy to collect, and of direct interest to public health.

IMPORTANCE: Streptococcus pneumoniae, a common colonizer in the human nasopharynx, can cause invasive diseases including pneumonia, bacteremia, and meningitis mostly in children under 5 years or older adults. The PCV7 was introduced in 2000 in the United States within the pediatric population to prevent disease and reduce deaths, followed by PCV13 in 2010, PCV15 in 2022, and PCV20 in 2023. After the removal of vaccine serotypes, the prevalence of carriage remained stable as the vacated pediatric ecological niche was filled with certain non-vaccine serotypes. Predicting which pneumococcal clones, and which serotypes, will be most successful in colonization after vaccination can enhance vaccine design and public health interventions, while also improving our understanding of pneumococcal evolution. While carriage data, which are collected from the pneumococcal population that is competing to colonize and transmit, are most directly relevant to evolutionary studies, invasive disease data are often more plentiful. Previously, evolutionary models based on negative frequency-dependent selection (NFDS) on the accessory genome were shown to predict which non-vaccine strains and serotypes were most successful in colonization following the introduction of PCV7. Here, we show that an inverse-invasiveness weighting system applied to invasive disease surveillance data allows the NFDS model to predict strain proportions in the projected carriage population in the post-PCV13/pre-PCV15 and -PCV20 epoch. The significance of our research lies in using a sample of invasive disease surveillance data to extend the use of NFDS as an evolutionary mechanism to predict post-PCV13 population dynamics. This has shown that we can correct for biased sampling that arises from differences in virulence and can enrich the value of genomic data from disease surveillance and advances our understanding of how NFDS impacts carriage population dynamics after both PCV7 and PCV13 vaccination.
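Editor's illustration (not part of the abstract): the METHODS paragraph above defines a serotype-specific inverse-invasiveness weight as the ratio of a serotype's proportion in carriage data to its proportion in invasive data. The Python sketch below shows that arithmetic on toy counts. It is only a minimal sketch: the function names, the serotype counts, and the renormalisation step are assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter


def serotype_proportions(serotypes):
    """Return {serotype: proportion} for a list of per-isolate serotype labels."""
    counts = Counter(serotypes)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}


def inverse_invasiveness_weights(carriage, invasive):
    """Weight = proportion in carriage data / proportion in invasive data.

    Only serotypes seen in the invasive sample get a weight, because the
    ratio is undefined when the invasive proportion is zero.
    """
    p_car = serotype_proportions(carriage)
    p_inv = serotype_proportions(invasive)
    return {s: p_car.get(s, 0.0) / p for s, p in p_inv.items()}


def weighted_proportions(invasive, weights):
    """Reweight invasive proportions and renormalise them to sum to 1.

    The renormalisation is an assumption of this sketch.
    """
    p_inv = serotype_proportions(invasive)
    raw = {s: p * weights.get(s, 0.0) for s, p in p_inv.items()}
    total = sum(raw.values())
    return {s: v / total for s, v in raw.items()}


# Hypothetical pre-vaccine samples used to estimate the weights.
pre_carriage = ["19A"] * 2 + ["6C"] * 6 + ["15B"] * 2
pre_invasive = ["19A"] * 5 + ["6C"] * 3 + ["15B"] * 2
weights = inverse_invasiveness_weights(pre_carriage, pre_invasive)

# Hypothetical later invasive sample, adjusted to approximate carriage.
later_invasive = ["19A"] * 4 + ["6C"] * 4 + ["15B"] * 2
adjusted = weighted_proportions(later_invasive, weights)
```

In this toy data, serotype 6C is under-represented among invasive isolates relative to carriage, so it receives a weight above 1 and its adjusted proportion rises. The more invasive 19A is down-weighted.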

RevDate: 2024-01-01

Abondio P, Bruno F, Passarino G, et al (2023)

Pangenomics: a new era in the field of neurodegenerative diseases.

Ageing research reviews pii:S1568-1637(23)00339-2 [Epub ahead of print].

A pangenome is composed of all the genetic variability of a group of individuals, and its application to the study of neurodegenerative diseases may provide valuable insights into the underlying aspects of genetic heterogeneity for these complex ailments, including gene expression, epigenetics, and translation mechanisms. Furthermore, a reference pangenome allows for the identification of previously undetected structural commonalities and differences among individuals, which may help in the diagnosis of a disease, support the prediction of what will happen over time (prognosis) and aid in developing novel treatments in the perspective of personalized medicine. Therefore, in the present review, the application of the pangenome concept to the study of neurodegenerative diseases will be discussed and analyzed for its potential to enable an improvement in diagnosis and prognosis for these illnesses, leading to the development of tailored treatments for individual patients from the knowledge of the genomic composition of a whole population.

RevDate: 2023-12-30

Lv Y, Liu C, Li X, et al (2023)

A centromere map based on super pan-genome highlights the structure and function of rice centromeres.

Journal of integrative plant biology [Epub ahead of print].

Rice (Oryza sativa) is a significant crop worldwide with a genome shaped by various evolutionary factors. Rice centromeres are crucial for chromosome segregation, and contain some unreported genes. Due to the diverse and complex centromere region, a comprehensive understanding of rice centromere structure and function at the population level is needed. We constructed a high-quality centromere map based on the rice super pan-genome consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. We showed that rice centromeres have diverse satellite repeat CentO, which vary across chromosomes and subpopulations, reflecting their distinct evolutionary patterns. We also revealed that long terminal repeats (LTRs), especially young Gypsy-type LTRs, are abundant in the peripheral CentO-enriched regions (CoERs) and drive rice centromere expansion and evolution. Furthermore, high-quality genome assembly and a complete T2T reference genome enable us to obtain more centromeric genome information, although the mapping and cloning of centromere genes remain challenging. We investigated the association between structural variations (SVs) and gene expression in the rice centromere. A centromere gene, OsMAB, that positively regulates rice tiller number, was further confirmed by eQTL, haplotype analysis and CRISPR/Cas9 methods. By revealing new insights into the evolutionary patterns and biological roles of rice centromeres, our findings will facilitate future research on centromere biology and crop improvement. This article is protected by copyright. All rights reserved.

RevDate: 2023-12-29

Yu Y, H Chen (2023)

Human pangenome: far-reaching implications in precision medicine.

Frontiers of medicine [Epub ahead of print].

RevDate: 2023-12-26

Beavan A, Domingo-Sananes MR, JO McInerney (2024)

Contingency, repeatability, and predictability in the evolution of a prokaryotic pangenome.

Proceedings of the National Academy of Sciences of the United States of America, 121(1):e2304934120.

Pangenomes exhibit remarkable variability in many prokaryotic species, much of which is maintained through the processes of horizontal gene transfer and gene loss. Repeated acquisitions of near-identical homologs can easily be observed across pangenomes, leading to the question of whether these parallel events potentiate similar evolutionary trajectories, or whether the remarkably different genetic backgrounds of the recipients mean that postacquisition evolutionary trajectories end up being quite different. In this study, we present a machine learning method that predicts the presence or absence of genes in the Escherichia coli pangenome based on complex patterns of the presence or absence of other accessory genes within a genome. Our analysis leverages the repeated transfer of genes through the E. coli pangenome to observe patterns of repeated evolution following similar events. We find that the presence or absence of a substantial set of genes is highly predictable from other genes alone, indicating that selection potentiates and maintains gene-gene co-occurrence and avoidance relationships deterministically over long-term bacterial evolution and is robust to differences in host evolutionary history. We propose that at least part of the pangenome can be understood as a set of genes with relationships that govern their likely cohabitants, analogous to an ecosystem's set of interacting organisms. Our findings indicate that intragenomic gene fitness effects may be key drivers of prokaryotic evolution, influencing the repeated emergence of complex gene-gene relationships across the pangenome.

RevDate: 2023-12-25

Dabbaghie F, Srikakulam SK, Marschall T, et al (2023)

PanPA: generation and alignment of panproteome graphs.

Bioinformatics advances, 3(1):vbad167.

MOTIVATION: Compared to eukaryotes, prokaryote genomes are more diverse through different mechanisms, including a higher mutation rate and horizontal gene transfer. Therefore, using a linear representative reference can cause a reference bias. Graph-based pangenome methods have been developed to tackle this problem. However, comparisons in DNA space are still challenging due to this high diversity. In contrast, amino acid sequences have higher similarity due to evolutionary constraints, whereby a single amino acid may be encoded by several synonymous codons. Coding regions cover the majority of the genome in prokaryotes. Thus, panproteomes present an attractive alternative leveraging the higher sequence similarity while not losing much of the genome in non-coding regions.

RESULTS: We present PanPA, a method that takes a set of multiple sequence alignments of protein sequences, indexes them, and builds a graph for each multiple sequence alignment. In the querying step, it can align DNA or amino acid sequences back to these graphs. We first showcase that PanPA generates correct alignments on a panproteome from 1350 Escherichia coli. To demonstrate that panproteomes allow comparisons at longer phylogenetic distances, we compare DNA and protein alignments from 1073 Salmonella enterica assemblies against E. coli reference genome, pangenome, and panproteome using BWA, GraphAligner, and PanPA, respectively; with PanPA aligning around 22% more sequences. We also aligned a DNA short-reads whole genome sequencing (WGS) sample from S. enterica against the E. coli reference with BWA and the panproteome with PanPA, where PanPA was able to find alignment for 68% of the reads compared to 5% with BWA.

PanPA is available at https://github.com/fawaz-dabbaghieh/PanPA.

RevDate: 2023-12-23

Yin S, Zhao L, Liu J, et al (2023)

Pan-genome Analysis of WOX Gene Family and Function Exploration of CsWOX9 in Cucumber.

International journal of molecular sciences, 24(24): pii:ijms242417568.

Cucumber is an economically important vegetable crop, and the warts (composed of spines and tubercules) of cucumber fruit are an important quality trait that influences its commercial value. WOX transcription factors are known to have pivotal roles in regulating various aspects of plant growth and development, but their studies in cucumber are limited. Here, genome-wide identification of cucumber WOX genes was performed using the pan-genome analysis of 12 cucumber varieties. Our findings revealed diverse CsWOX genes in different cucumber varieties, with variations observed in protein sequences and lengths, gene structure, and conserved protein domains, possibly resulting from the divergent evolution of CsWOX genes as they adapt to diverse cultivation and environmental conditions. Expression profiles of the CsWOX genes demonstrated that CsWOX9 was significantly expressed in unexpanded ovaries, especially in the epidermis. Additionally, analysis of the CsWOX9 promoter revealed two binding sites for the C2H2 zinc finger protein. We successfully executed a yeast one-hybrid assay (Y1H) and a dual-luciferase (LUC) transaction assay to demonstrate that CsWOX9 can be transcriptionally activated by the C2H2 zinc finger protein Tu, which is crucial for fruit tubercule formation in cucumber. Overall, our results indicated that CsWOX9 is a key component of the molecular network that regulates wart formation in cucumber fruits, and provide further insight into the function of CsWOX genes in cucumber.

RevDate: 2023-12-23

Zhang Y, Pan M, Wang Q, et al (2023)

Complete Genome Sequence and Pan-Genome Analysis of Shewanella oncorhynchi Z-P2, a Siderophore Putrebactin-Producing Bacterium.

Microorganisms, 11(12): pii:microorganisms11122961.

In this study, we reported the complete genome sequence of Shewanella oncorhynchi for the first time. S. oncorhynchi Z-P2 is a bacterium that produces the siderophore putrebactin. Its genome consists of a circular chromosome of 5,034,612 bp with a G + C content of 45.4%. A total of 4544 protein-coding genes, 109 tRNAs and 31 rRNAs were annotated by the RAST. Five non-ribosomal peptide synthetase (NRPS) and polyketide synthetase (PKS) gene clusters were identified by the antiSMASH analysis. The pan-genome analysis of Z-P2 and 10 Shewanella putrefaciens revealed 9228 pan-gene clusters and 2681 core gene clusters, with Z-P2 having 618 unique gene clusters. Additionally, the gene cluster involved in putrebactin biosynthesis in Z-P2 was annotated, and the mechanism of putrebactin biosynthesis was analyzed. The putrebactin produced by Z-P2 was detected using UPLC-MS analysis, with an [M + H]+ molecular ion at m/z 373.21. These findings provide valuable support for further research on the genetic engineering of putrebactin biosynthetic genes of Z-P2 and their potential applications.
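Editor's illustration (not part of the abstract): counts such as "pan-gene clusters", "core gene clusters", and "unique gene clusters" follow from simple set operations once each genome is reduced to a set of gene-cluster identifiers. The Python sketch below assumes that representation. The function name, genome labels, and cluster IDs are hypothetical toy values, and the abstract does not say which tool the authors used for clustering.

```python
def summarize_pangenome(genomes):
    """Summarise a pan-genome from per-genome gene-cluster sets.

    genomes: non-empty dict mapping genome name -> set of gene-cluster IDs.
    Returns (pan, core, unique) where
      pan    = clusters present in at least one genome,
      core   = clusters present in every genome,
      unique = {genome name: clusters found in that genome only}.
    """
    all_sets = list(genomes.values())
    pan = set().union(*all_sets)
    core = set.intersection(*all_sets)
    unique = {}
    for name, clusters in genomes.items():
        # Union of the clusters carried by every other genome.
        others = set().union(*(c for n, c in genomes.items() if n != name))
        unique[name] = clusters - others
    return pan, core, unique


# Toy input: three hypothetical genomes with made-up cluster IDs.
genomes = {
    "Z-P2": {"g1", "g2", "g3", "g9"},
    "ref_A": {"g1", "g2", "g4"},
    "ref_B": {"g1", "g2", "g3", "g5"},
}
pan, core, unique = summarize_pangenome(genomes)
print(len(pan), len(core), {name: len(u) for name, u in unique.items()})
```

Clusters that are neither core nor unique to one genome (here `g3`) make up the rest of the accessory or "flex" genome described in the introduction to this bibliography.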
