Originally, PANGOLIN used a maximum-likelihood-based assignment algorithm to assign each SARS-CoV-2 query sequence its most likely lineage. Of the countries that have contributed SARS-CoV-2 data, 30% had genomes of this lineage.
Don't blame pangolins, coronavirus family tree tracing could prove key

The relatively fast evolutionary rate means that it is most appropriate to estimate shallow nodes in the sarbecovirus evolutionary history. We say that this approach is conservative because sequences and subregions generating recombination signals have been removed, and breakpoint-free regions (BFRs) were concatenated only when no phylogenetic incongruence (PI) signals could be detected between them. July 26, 2021. Pangolin allows a user to assign the most likely lineage (Pango lineage) to SARS-CoV-2 query sequences. The Pango dynamic nomenclature is a popular system for classifying and naming genetically distinct lineages of SARS-CoV-2, including variants of concern, and is based on the analysis of complete or near-complete virus genomes.
Pangolins may have incubated the novel coronavirus, gene study shows

The shaded region corresponds to the S protein. All sequence data analysed in this manuscript are available at https://github.com/plemey/SARSCoV2origins. 5 (NRR1) are conservative in the sense that NRR1 is more likely to be non-recombinant than NRR2 or NRA3. As informative rate priors for the analysis of the sarbecovirus datasets, we used two different normal prior distributions: one with a mean of 0.00078 and s.d. Region B is 5,525 nt long. Identifying the origins of an emerging pathogen can be critical during the early stages of an outbreak, because it may allow for containment measures to be precisely targeted at a stage when the number of daily new infections is still low.

Results and discussion

Genomic surveillance has been a hallmark of the COVID-19 pandemic that, in contrast to other pandemics, allows the evolution and spread of the virus to be tracked worldwide almost in real time (4). Sarbecovirus, HCoV-OC43 and SARS-CoV data were assembled from GenBank to be as complete as possible, with sampling year as an inclusion criterion. Pangolin was developed to implement the dynamic nomenclature of SARS-CoV-2 lineages, known as the Pango nomenclature. The Artic Network receives funding from the Wellcome Trust through project no. Because these subclades had different phylogenetic relationships in region D (Supplementary Fig.
The presence of SARS-CoV-2-related viruses in Malayan pangolins, in silico analysis of the ACE2 receptor polymorphism, and sequence similarities between the receptor-binding domain (RBD) of the spike proteins of pangolin and human sarbecoviruses led to the proposal of the pangolin as an intermediary host. In such cases, even moderate rate variation among long, deep phylogenetic branches will substantially impact expected root-to-tip divergences over a sampling time range that represents only a small fraction of the evolutionary history [40]. Smuggled pangolins were carrying viruses closely related to the one sweeping the world, say scientists. The estimated divergence times for the pangolin virus most closely related to the SARS-CoV-2/RaTG13 lineage range from 1851 (1730–1958) to 1877 (1746–1986), indicating that these pangolin lineages were acquired from bat viruses divergent to those that gave rise to SARS-CoV-2. Because the SARS-CoV-2 S protein has been implicated in past recombination events or possibly convergent evolution [12], we specifically investigated several subregions of the S protein: the N-terminal domain of S1, the C-terminal domain of S1, the variable-loop region of the C-terminal domain, and S2. We considered (1) the possibility that BFRs could be combined into larger non-recombinant regions and (2) the possibility of further recombination within each BFR. Nature Microbiology (Nat Microbiol)
We named the length-sorted BFRs as: BFR A (nt positions 13,291–19,628, length = 6,338 nt), BFR B (nt positions 3,625–9,150, length = 5,526 nt), BFR C (nt positions 9,261–11,795, length = 2,535 nt), BFR D (nt positions 27,702–28,843, length = 1,142 nt) and six further regions (E–J). volume 5, pages 1408–1417 (2020). Alternatively, combining 3SEQ-inferred breakpoints, GARD-inferred breakpoints and the necessity of PI signals for inferring recombination, we can use the 9.9-kb region spanning nucleotides 11,885–21,753 (NRR2) as a putative non-recombining region; this approach is breakpoint-conservative because it is conservative in identifying breakpoints but not conservative in identifying non-recombining regions. The presence in pangolins of an RBD very similar to that of SARS-CoV-2 means that we can infer this was also probably in the virus that jumped to humans. The research leading to these results received funding (to A.R. performed S recombination analysis. The new paper finds that the genetic sequences of several strains of coronavirus found in pangolins were between 88.5 percent and 92.4 percent similar to those of the novel coronavirus. performed recombination analysis for non-recombining regions 1 and 2, breakpoint analysis and phylogenetic inference on recombinant segments. We compiled a set of 69 SARS-CoV genomes including 58 sampled from humans and 11 sampled from civets and raccoon dogs.
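As a quick consistency check on the BFR coordinates quoted above, the following minimal sketch recomputes each region's length from its inclusive start and end positions and sorts the regions from longest to shortest. The names `BFRS`, `inclusive_length` and `sort_by_length` are hypothetical and chosen for illustration only; they are not part of any tool mentioned in the text.

```python
# Inclusive nucleotide coordinates of the four longest BFRs, as quoted in the text.
# The dictionary and function names are illustrative assumptions.
BFRS = {
    "A": (13291, 19628),
    "B": (3625, 9150),
    "C": (9261, 11795),
    "D": (27702, 28843),
}


def inclusive_length(start: int, end: int) -> int:
    """Length of a region whose start and end positions are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def sort_by_length(regions: dict) -> list:
    """Return (name, length) pairs ordered from longest to shortest region."""
    lengths = [(name, inclusive_length(s, e)) for name, (s, e) in regions.items()]
    return sorted(lengths, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, length in sort_by_length(BFRS):
        print(f"BFR {name}: {length} nt")
```

With inclusive coordinates the recomputed lengths agree with the stated values (6,338, 5,526, 2,535 and 1,142 nt), and the alphabetical names follow the length ordering.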
While there is involvement of other mammalian species (specifically pangolins for SARS-CoV-2) as a plausible conduit for transmission to humans, there is no evidence that pangolins are facilitating adaptation to humans. But some theories suggest that pangolins may be the source of the novel coronavirus. However, inconsistency in the nomenclature limits uniformity in its epidemiological understanding. In light of these time-dependent evolutionary rate dynamics, a slower rate is appropriate for calibration of the sarbecovirus evolutionary history. Given that these pangolin viruses are ancestral to the progenitor of the RaTG13/SARS-CoV-2 lineage, it is more likely that they are also acquiring viruses from bats. This is notable because the variable-loop region contains the six key contact residues in the RBD that give SARS-CoV-2 its ACE2-binding specificity [27,37].
Phylogenetic Assignment of Named Global Outbreak Lineages

3) to examine the sensitivity of date estimates to this prior specification. Since experts have suggested that pangolins may be the reservoir species for COVID-19, the scaly anteater has been catapulted into headlines, news reports, and conversations, and some are calling COVID-19 "the revenge of the . While pangolins could be acting as intermediate hosts for bat viruses to get into humans (they develop severe respiratory disease [38] and commonly come into contact with people through trafficking), there is no evidence that pangolin infection is a requirement for bat viruses to cross into humans. Posterior means with 95% HPDs are shown in Supplementary Information Table 2.
cov-lineages/pangolin - GitHub
SARS-CoV-2 Variant Classifications and Definitions

In our second stage, we wanted to construct non-recombinant regions where our approach to breakpoint identification was as conservative as possible. Developed by the Centre for Genomic Pathogen Surveillance. Divergence time estimates based on the three regions/alignments where the effects of recombination have been removed. The command line tool is open source software available under the GNU General Public License v3.0. Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628). Grey tips correspond to bat viruses, green to pangolin, blue to SARS-CoV and red to SARS-CoV-2. We focused on these three non-recombining regions/alignments for divergence time estimation; this avoids inappropriate modelling of evolutionary processes with recombination on strictly bifurcating trees, which can result in different artefacts such as homoplasies that inflate branch lengths and lead to apparently longer evolutionary divergence times.
Trafficked pangolins can carry coronaviruses closely related to

To evaluate the performance of the procedure, we confirmed that the recombination masking resulted in (1) a markedly different outcome of the PHI test [64], (2) removal of well-supported (bootstrap value >95%) incompatible splits in Neighbor-Net [65] and (3) a near-complete reduction of mosaic signal as identified by 3SEQ. GitHub - cov-lineages/pangolin: Software package for assigning SARS-CoV-2 genome sequences to global lineages. Specifically, progenitors of the RaTG13/SARS-CoV-2 lineage appear to have recombined with the Hong Kong clade (with inferred breakpoints at 11.9 and 20.8 kb) to form the CoVZXC21/CoVZC45 lineage. The plots are based on maximum likelihood tree reconstructions with a root position that maximises the residual mean squared for the regression of root-to-tip divergence and sampling time. Due to the absence of temporal signal in the sarbecovirus datasets, we used informative prior distributions on the evolutionary rate to estimate divergence dates. Using the most conservative approach to identification of a non-recombinant genomic region (NRR1), SARS-CoV-2 forms a sister lineage with RaTG13, with genetically related cousin lineages of coronavirus sampled in pangolins in Guangdong and Guangxi provinces (Fig.
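The regression of root-to-tip divergence on sampling time mentioned above can be illustrated with an ordinary least-squares fit: the slope is a crude estimate of the evolutionary rate, and the residual mean square measures how tightly the tips follow a clock. The sketch below is a generic illustration, not the procedure used in the study; the function name `root_to_tip_regression` and the example values are hypothetical, and the data are invented purely to show the arithmetic.

```python
def root_to_tip_regression(dates, divergences):
    """Least-squares fit of root-to-tip divergence against sampling time.

    Returns (slope, intercept, residual_mean_square). The slope is in
    substitutions per site per unit of time (years here).
    """
    n = len(dates)
    if n != len(divergences) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_t = sum(dates) / n
    mean_d = sum(divergences) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, divergences))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    # Residual mean square: sum of squared residuals over n - 2 degrees of freedom.
    ss_res = sum((d - (intercept + slope * t)) ** 2
                 for t, d in zip(dates, divergences))
    return slope, intercept, ss_res / (n - 2)


if __name__ == "__main__":
    # Hypothetical tips: sampling years and root-to-tip distances (subs/site).
    years = [2003.0, 2008.0, 2013.0, 2018.0]
    dists = [0.003, 0.008, 0.013, 0.018]
    slope, intercept, rms = root_to_tip_regression(years, dists)
    print(f"rate estimate: {slope:.6f} subs/site/year")
    print(f"residual mean square: {rms:.3e}")
```

When the sampled time range is short relative to the depth of the tree, as the text notes for the sarbecovirus data, the fitted slope is unreliable, which is why informative rate priors were used instead.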
In this approach, we considered a breakpoint as supported only if it had three types of statistical support: from (1) mosaic signals identified by 3SEQ, (2) PI signals identified by building trees around 3SEQ's breakpoints and (3) the GARD algorithm [35], which identifies breakpoints by identifying PI signals across proposed breakpoints. Indeed, the rates reported by these studies are in line with the short-term SARS rates that we estimate (Fig. In the case of the DRAGEN COVID Lineage tool, the minimum accepted alignment score was set to 22 and results with scores <22 were discarded. The difficulty in inferring reliable evolutionary histories for coronaviruses is that their high recombination rate [48,49] violates the assumption of standard phylogenetic approaches because different parts of the genome have different histories. Another similarity between SARS-CoV and SARS-CoV-2 is their divergence time (40–70 years ago) from currently known extant bat virus lineages (Fig.
New COVID-19 Variant Alert: Everything We Know About the IHU Variant

With horseshoe bats currently the most plausible origin of SARS-CoV-2, it is important to consider that sarbecoviruses circulate in a variety of horseshoe bat species with widely overlapping species ranges [57]. Maciej F. Boni, Philippe Lemey, Andrew Rambaut or David L. Robertson. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades. Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig. Temporal signal was tested using a recently developed marginal likelihood estimation procedure [41] (Supplementary Table 1). Now, the two researchers used genomic sequencing to compare the genetic sequence of the new coronavirus in humans with that in animals and found a 99% match with pangolins. In March, when covid cases began spiking around India, Bani Jolly went hunting for answers in the virus's genetic code. In the presence of time-dependent rate variation, a widely observed phenomenon for viruses [43,44,52], slower prior rates appear more appropriate for sarbecoviruses that currently encompass a sampling time range of about 18 years. We thank A. Chan and A. Irving for helpful comments on the manuscript.
Future trajectory of SARS-CoV-2: Constant spillover back and forth

This leaves the insertion of polybasic. Its origin and direct ancestral viruses have not been . Eight other BFRs <500 nt were identified, and the regions were named BFR A–J in order of length.