Introduction to the Retroviridae Family
The Retroviridae family represents a widespread and diverse group of viruses found across a broad spectrum of vertebrate hosts. These viruses, commonly known as retroviruses, are characterized by their unique replication process involving reverse transcription. Retrovirions, typically 80-100 nm in diameter, are enveloped particles distinguished by their surface glycoproteins and an internal core that houses the viral genome and essential enzymes for replication. A key feature of retroviruses is the morphological variability of their inner core, which often serves as a distinguishing characteristic among different genera within the family.
The replication cycle of the Retroviridae family is defined by reverse transcription, a process where the viral positive-sense RNA genome is used as a template to synthesize double-stranded DNA. This DNA is then integrated into the host cell’s chromosomal DNA, forming a provirus. The provirus then acts as a template for the production of new viral genomes. Notably, retroviral integration into germline cells can lead to the creation of heritable proviruses, known as endogenous retroviruses (ERVs). Over evolutionary timescales, this process has resulted in vertebrate genomes being populated with thousands of ERV loci, highlighting the long-standing interaction between retroviruses and their hosts.
Key Characteristics of the Retroviridae Family
The Retroviridae family is defined by a set of shared characteristics that underpin its classification; these are summarized in Table 1.
Table 1. Defining Characteristics of the Retroviridae Family
Characteristic | Description | Details or Example |
---|---|---|
Type Species | Murine leukemia virus | Murine leukemia virus, genus Gammaretrovirus, subfamily Orthoretrovirinae (AF033811) |
Virion Structure | Enveloped, spherical, 80–100 nm diameter | 8 nm glycoprotein spikes on the surface |
Genome Type | Dimeric single-stranded RNA (ssRNA) | 7–13 kb (orthoretroviruses), or ssRNA (partially reverse-transcribed to dsDNA in virion) of 11–12 kb (spumaretroviruses) |
Replication Mechanism | Reverse transcription of RNA to dsDNA, integration into host genome | Host cell RNA polymerase II transcribes proviral DNA to synthesize new viral genomes |
Translation Strategy | Capped and polyadenylated genomic and subgenomic mRNAs | Spliced mRNAs for complex gene expression |
Host Range | Primarily Vertebrates | Wide variety of species across different vertebrate classes |
Taxonomic Classification | Realm Riboviria, Kingdom Pararnavirae, Phylum Artverviricota, Class Revtraviricetes, Order Ortervirales | 2 subfamilies (Orthoretrovirinae, Spumaretrovirinae), 11 genera, 68 species |
Virion Structure and Morphology of Retroviruses
Retrovirions exhibit a complex architecture that is crucial for their infectivity and replication cycle.
Morphology Details
Retroviruses are characterized by their spherical, enveloped virions, measuring between 80 and 100 nm in diameter (Vogt 1997). The surface of the virion is studded with glycoprotein projections, approximately 8 nm in length, which are irregularly spaced and play a critical role in host cell attachment and entry. Internally, the virion contains a capsid core composed of assembled capsid proteins. This core encloses a nucleocapsid complex, which consists of nucleocapsid proteins and the viral nucleic acid genome.
Figure 1. Retroviridae Virion Structure. This figure illustrates the structure of retrovirus particles. (Top) A schematic representation (not to scale) showing the arrangement of key structural components and proteins common to retroviral virions. Key: MA – matrix; CA – capsid; NC – nucleocapsid; PR – protease; RT – reverse transcriptase; IN – integrase; SU – surface subunit; TM – transmembrane subunit. (Bottom) Electron micrographs showing budding and mature virus particles. Panel (A): avian leukosis virus (Alpharetrovirus), exhibiting type “C” morphology. Panel (B): mouse mammary tumor virus (Betaretrovirus), type “B” morphology. Panel (C): murine leukemia virus (Gammaretrovirus). Panel (D): bovine leukemia virus (Deltaretrovirus). Panel (E): human immunodeficiency virus 1 (Lentivirus). Panel (F): simian foamy virus Pan troglodytes schweinfurthii (SFVpsc; formerly PFV or HFV) (Simiispumavirus, Spumaretrovirinae subfamily). (Image courtesy of M. Gonda, adapted with permission from Coffin et al., 1997).
The shape of the nucleocapsid varies among retroviral genera. It is eccentric in Betaretrovirus, concentric in Alpharetrovirus, Gammaretrovirus, and Deltaretrovirus, and rod-shaped or truncated cone-shaped in Lentivirus. Spumaretroviruses, belonging to the subfamily Spumaretrovirinae, possess concentric, spherical nucleocapsids.
Historically, electron microscopy was used to classify retroviruses based on their morphology. Alpharetroviruses and Gammaretroviruses, which assemble immature capsids at the plasma membrane, were termed C-type viruses. In contrast, Betaretroviruses were known to assemble A-type particles (immature capsids) in the cytoplasm, which then budded with either a B-type (like mouse mammary tumor virus, MMTV) or D-type (like Mason-Pfizer monkey virus, MPMV) morphology. While this classification scheme is no longer used for formal taxonomy, these morphological descriptions are still occasionally referenced.
Physicochemical and Physical Properties
Retrovirions have a buoyant density of 1.16–1.18 g/cm³ and a sedimentation coefficient (S₂₀,w) of approximately 600S, both measured in sucrose gradients. They are sensitive to heat, detergents, and formaldehyde, and the surface glycoproteins can be partially removed by proteolytic enzymes or mechanical forces. However, retroviruses are relatively resistant to ultraviolet (UV) light.
Nucleic Acid Composition
Orthoretroviruses typically contain a genome consisting of a homodimer of linear, positive-sense, single-stranded RNA (ssRNA), with each monomer ranging from 7 to 13 kb in size (Goff 2013). RNA constitutes about 2% of the virion’s dry weight, and the two monomers are held together by hydrogen bonds. Each RNA monomer is capped at the 5′-end with a 7-methylguanosine cap structure (type 1) and polyadenylated at the 3′-end. Purified virion RNA is not infectious. Critically, each monomer is associated with a specific tRNA molecule base-paired to a primer binding site (PBS) near the 5′-end, involving about 18 nucleotides at the tRNA’s 3′-end. Other RNA and small DNA fragments of host origin found within virions are considered to be incidental. Spumaretroviruses differ in that their virions contain some double-stranded DNA (dsDNA), because reverse transcription of the viral RNA genome begins within 5–10% of released virus particles. The precise structure of this DNA remains to be fully characterized.
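The pairing between the tRNA primer and the PBS described above can be sketched as a simple complementarity search. This is a minimal illustration only: the sequences, the helper names `reverse_complement` and `find_pbs`, and the restriction to Watson-Crick pairs are assumptions made for the example and do not correspond to any real virus or tRNA.

```python
# Toy illustration of a tRNA 3' tail annealing to a primer binding site (PBS).
# All sequences below are hypothetical and chosen only for demonstration.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_pbs(genome: str, trna: str, pair_len: int = 18) -> int:
    """Return the 0-based genome index where the last `pair_len` nucleotides
    of `trna` can base-pair (Watson-Crick pairs only), or -1 if none exists."""
    primer_tail = trna[-pair_len:]
    return genome.find(reverse_complement(primer_tail))


# Hypothetical 18-nt tRNA 3' tail and a toy 5' region of a genome.
TRNA_TAIL = "GUCAGGCUUACGGAUCCA"
PBS = reverse_complement(TRNA_TAIL)
TOY_GENOME = "GGUCUCUCUG" + PBS + "AAAGCGAAAG"

# The extra "ACGU" stands in for the rest of the tRNA body.
pbs_start = find_pbs(TOY_GENOME, "ACGU" + TRNA_TAIL)
```

The default of 18 paired nucleotides follows the approximate figure given in the text; real primer-template interactions may involve additional contacts that this sketch ignores.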
Protein Components
Proteins are the major component of retrovirions, constituting approximately 60% of their dry weight. A standardized nomenclature for retroviral proteins has been established (Leis et al., 1988). Retroviral envelopes contain two primary glycoproteins: the surface subunit (SU) and the transmembrane subunit (TM), both encoded by the env gene. Some spumaretroviruses have a third envelope protein, the leader peptide (LP). Internally, there are 3–6 non-glycosylated structural proteins encoded by the gag gene. These include, from the amino terminus: matrix (MA), capsid (CA), and nucleocapsid (NC). A fourth Gag protein, implicated in virion budding, varies in its position. The MA protein is often myristoylated. Essential enzymes include protease (PR, from the pro gene), reverse transcriptase (RT, from the pol gene), and integrase (IN, also from the pol gene). Some viruses also contain a dUTPase (DU) of uncertain function. Spumaretroviruses encode a single Gag protein that is cleaved once near the carboxyl terminus in about half of the proteins. Primate lentiviruses uniquely incorporate additional small, virally encoded accessory proteins like p6, Vpr, and Vpx, which function during the early stages of infection.
Lipid and Carbohydrate Content
Lipids comprise about 35% of the virion dry weight and are derived from the host cell’s plasma membrane during budding. Carbohydrates make up approximately 3% of the virion weight, with variations depending on the specific virus. At least the SU envelope protein, and often both SU and TM, are glycosylated. In spumaretroviruses, the Env leader peptide is also glycosylated. Cellular glycolipids and some glycoproteins are also incorporated into the virion envelope.
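As a quick consistency check, the approximate composition figures quoted in this section (protein, lipid, carbohydrate, and RNA) can be tallied. The sketch below assumes that all four values are expressed on the same dry-weight basis, which the text states explicitly only for protein, lipid, and RNA; the variable names are illustrative.

```python
# Approximate virion composition as quoted in the text (percent of dry weight).
# Assumption: the carbohydrate figure is on the same dry-weight basis.
composition_percent = {
    "protein": 60,
    "lipid": 35,
    "carbohydrate": 3,
    "RNA": 2,
}

# Sum of all components and the name of the most abundant one.
total = sum(composition_percent.values())
largest = max(composition_percent, key=composition_percent.get)

print(f"total = {total}%, largest component = {largest}")
```

Because every figure is approximate and varies between viruses, the tally is only a rough sanity check rather than a measured mass balance.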
Genome Organization and Replication Cycle
Retroviruses employ a unique replication strategy centered around reverse transcription and proviral integration.
Reverse Transcription and Provirus Formation
A defining characteristic of retroviruses is the reverse transcription of their RNA genome into double-stranded DNA (dsDNA) (Telesnitsky and Goff 1997). This dsDNA contains long terminal repeats (LTRs) at both ends. Following synthesis in the cytoplasm, this viral DNA is transported to the nucleus and integrated into the host cell’s genome, forming a provirus (Brown 1997). The integrated provirus becomes a permanent part of the host cell’s DNA and serves as the template for all subsequent viral RNA and mRNA synthesis, utilizing the host cell’s RNA polymerase II (Goff 2013).
Figure 2. Retroviridae Reverse Transcription. This diagram illustrates the mechanism of reverse transcription in retroviruses. The positive-sense viral RNA genome is shown in blue, while DNA intermediates and dsDNA are depicted in red. The typical viral RNA genome structure is 5′-R-U5-PBS-gag-pro-pol-env-PPT-U3-R-3′. Reverse transcriptase initiates negative-strand DNA synthesis using a tRNA primer complementary to the primer binding site (PBS). Synthesis proceeds to the 5′-end, followed by a “jump” to the 3′-end of the same or another RNA genome (simplified to one in the figure) mediated by R region complementarity. RNase H digests the RNA template during negative-strand synthesis, leaving a polypurine tract (PPT) fragment that primes positive-strand DNA synthesis. Some viruses also use a central polypurine tract (CPPT) for positive-strand initiation. The tRNA primer also templates PBS regeneration. A second template switch occurs when the elongated positive strand jumps to the 3′-end of the negative-strand template via PBS complementarity. Extension of both strands results in the final dsDNA molecule with complete 5′- and 3′-LTRs (U3-R-U5).
Orthoretroviruses package two copies of their RNA genome into each virion. The typical gene order in infectious retroviruses is 5′- gag–pro–pol–env-3′, encoding proteins essential for virion structure and replication. Some retroviruses also carry additional genes for regulatory or accessory functions, and some incorporate cell-derived oncogenes involved in carcinogenesis (Rosenberg and Jolicoeur 1997). Oncogene incorporation can occur through insertion into a complete retroviral genome or by replacing viral sequences, often rendering the virus replication-defective and requiring helper viruses for propagation. Cell-derived sequences may be expressed as fusion proteins with viral structural components.
Figure 3. Retroviridae Provirus Structures. This figure shows the provirus structures for representative viruses from different genera within the Retroviridae family, including Alpharetrovirus, Betaretrovirus, Gammaretrovirus, Deltaretrovirus, Epsilonretrovirus, Lentivirus, and Simiispumavirus.
Host Cell Entry
Entry into host cells is initiated by the interaction of the virion’s SU glycoprotein with specific receptors on the host cell surface (Greenwood et al., 2018; Overbaugh et al., 2001). This interaction triggers fusion of the viral envelope with the plasma membrane, either directly or following endocytosis. Retrovirus receptors are diverse cell surface proteins. For example, HIV-1 requires both the CD4 protein and a chemokine receptor (CCR5 or CXCR4) for entry. Gammaretroviruses utilize solute carrier (SLC) proteins as receptors. Alpharetroviruses, such as avian leukosis viruses (ALVs), use a range of receptors, including proteins related to LDL receptors, TNF-receptor family proteins, butyrophilins, and the chicken Na+/H+ exchanger protein (Barnard et al., 2006). The intracellular uncoating process after entry is not fully understood; the early steps that follow take place in the cytoplasm within a complex derived from the capsid core.
Replication Steps
Replication in orthoretroviruses begins with reverse transcription of the virion RNA into cDNA by the reverse transcriptase (RT) enzyme. This process uses the 3′-end of the tRNA bound to the PBS as a primer. The initial cDNA product is extended, and template switching occurs via the R regions at the RNA ends. The viral RNA is concurrently digested by the RNase H activity of RT. A PPT fragment primes positive-sense cDNA synthesis. The resulting dsDNA carries an LTR at each end, assembled from the U3, R, and U5 regions of the viral RNA. In spumaretroviruses, reverse transcription occurs during viral assembly or release (Rethwilm 2010), following a similar pathway but with different timing.
Reverse transcription is error-prone and recombination-prone due to template switching, leading to high genetic diversity within retroviral populations, often forming complex “swarms” of genetic variants in vivo.
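The net effect of the two template switches on genome layout can be sketched by deriving the region order of the dsDNA product from the region order of the genomic RNA. This is a minimal model of the outcome only, not of the enzymology; the function name `provirus_layout` and the list representation are assumptions introduced for illustration.

```python
# Region order of the genomic RNA, as given in the Figure 2 caption.
RNA_GENOME = ["R", "U5", "PBS", "gag", "pro", "pol", "env", "PPT", "U3", "R"]


def provirus_layout(rna_regions):
    """Return the region order of the dsDNA product of reverse transcription.

    Only the end result is modelled: U3 (3' end of the RNA) and U5 (5' end)
    are each copied to the opposite end, so that both termini of the DNA
    carry a complete LTR (U3-R-U5).
    """
    if rna_regions[:2] != ["R", "U5"] or rna_regions[-2:] != ["U3", "R"]:
        raise ValueError("expected a genome of the form R-U5-...-U3-R")
    ltr = ["U3", "R", "U5"]
    internal = rna_regions[2:-2]  # PBS through PPT, unchanged in order
    return ltr + internal + ltr


dna = provirus_layout(RNA_GENOME)
print("-".join(dna))
```

The DNA product is therefore longer than the RNA template by one U3 and one U5 region, which is why the proviral LTRs are complete even though neither end of the RNA contains all three elements.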
Integration and Transcription
Retroviral DNA integration into the host chromosome is mediated by the viral integrase (IN) protein. This process involves removal of two bases from the viral DNA ends and duplication of a short host sequence at the integration site. Integration can occur at multiple sites, but once integrated, a provirus cannot transpose further within the same cell. Integration is essential for replication. Integration site preferences vary; HIV-1 favors actively transcribed genes, while murine leukemia viruses prefer regions near gene start sites (Kvaratskhelia et al., 2014).
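The two sequence changes that accompany integration, loss of two bases from each viral DNA end and duplication of a short host sequence, can be illustrated on a single (top) strand. This is a simplified sketch: the sequences are hypothetical, the function name `integrate` is invented for the example, and the duplication length of 5 is an assumed default, since the text says only that the duplicated host sequence is short.

```python
def integrate(host: str, viral_dna: str, site: int,
              tsd_len: int = 5, trimmed: int = 2) -> str:
    """Top-strand sketch of provirus insertion into host DNA.

    host[site:site + tsd_len] ends up on both sides of the provirus
    (the target-site duplication), and `trimmed` bases are lost from
    each end of the viral DNA. tsd_len=5 is an assumed example value.
    """
    if not 0 <= site <= len(host) - tsd_len:
        raise ValueError("integration site out of range")
    if len(viral_dna) <= 2 * trimmed:
        raise ValueError("viral DNA too short")
    provirus = viral_dna[trimmed:len(viral_dna) - trimmed]
    # Left flank ends with the duplicated bases; right flank starts with them.
    return host[:site + tsd_len] + provirus + host[site:]


# Hypothetical toy sequences, not taken from any real genome.
HOST = "ACGTACCGTTAGGCAT"
VIRAL = "AATGTAGTCTTCAGCATT"

integrated = integrate(HOST, VIRAL, site=4)
print(integrated)
```

In this toy case the host bases at positions 4 to 8 appear immediately before and immediately after the inserted provirus, and the insert is four bases shorter than the unintegrated viral DNA.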
The integrated provirus is transcribed by host RNA polymerase II, driven by transcriptional signals in the viral LTRs, producing virion RNA and mRNAs. Complex retroviruses and spumaretroviruses encode non-structural proteins, including transcriptional transactivators and RNA export factors. Some also encode proteins that counteract host defenses.
Translation and Assembly
Retroviruses produce several classes of mRNA. Full-length genomic mRNA is translated to produce the Gag and Gag-Pro-Pol polyproteins from the gag, pro, and pol genes. Spliced mRNAs, often including the env gene, produce envelope protein precursors. Viruses with additional genes generate further spliced mRNAs. Spumaretroviruses uniquely use an internal promoter (IP) in env to transcribe distal accessory genes. Most retroviral proteins are initially synthesized as polyproteins and require proteolytic cleavage for activation. Gag, Pro, and Pol proteins are derived from nested translation products. Pro and Pol translation often involves ribosomal frameshifting or stop codon readthrough. Spumaretroviruses synthesize Pro and Pol from separate spliced mRNAs, not as Gag-Pro-Pol fusions.
Figure 4. Retroviridae Spumaretrovirus Genome. This figure shows the genome organization of spumaretroviruses, exemplified by the 13.2 kbp simian foamy virus Pan troglodytes schweinfurthii (SFVpsc) provirus. It highlights the LTRs, internal promoter (IP), protein-coding regions (gag, pro, pol, env, tas, and bet), and transcripts (solid line arrows with protein names).
Genomic RNA packaging is mediated by a packaging signal (Ψ), typically located in the 5′-end leader and gag regions (Johnson and Telesnitsky 2010). Ψ is crucial for genome encapsidation but is usually absent from subgenomic mRNAs (except in alpharetroviruses). Spumaretroviruses may have Ψ elsewhere in the genome. Ψ activity is defined by RNA structure, not primary sequence.
Capsid assembly occurs at the plasma membrane for most genera, or intracytoplasmically for spumaretroviruses and betaretroviruses. Virions are released by budding (Pornillos and Ganser-Pornillos 2019), often at lipid raft microdomains. Spumaretroviruses and deltaretroviruses are highly cell-associated. In orthoretroviruses, maturation, which involves polyprotein processing, occurs during or after budding.
Biology and Pathogenesis of Retroviridae
Retroviruses are significant biological agents with widespread distribution and diverse impacts on vertebrate hosts.
Distribution and Endogenous Retroviruses (ERVs)
Retroviruses are broadly distributed as exogenous infectious agents in vertebrates (Goff 2013). Additionally, endogenous retroviruses (ERVs), resulting from germline infections, are inherited as Mendelian genes (Johnson 2019; Stoye 2012). ERVs are common in vertebrate genomes, constituting up to 10% of genomic DNA. ERV sequences are incorporated into retroviral phylogenies and cluster with exogenous retroviruses, aiding in understanding Retroviridae evolution (Gifford 2012; Jern et al., 2005). ERV presence across vertebrates indicates a long evolutionary history, dating back hundreds of millions of years (Aiewsakun and Katzourakis 2015, 2017). Most ERVs are inactive due to mutations, but some can be activated, replicating like exogenous viruses or recombining with them. ERVs have also been co-opted for host functions: examples include ERV-encoded proteins such as syncytins, and LTRs that act as regulatory elements (Chuong et al., 2017; Lavialle et al., 2013).
Retroviral Diseases and Transmission
Retroviruses are associated with a range of diseases (Rosenberg and Jolicoeur 1997; Maeda et al., 2008), including malignancies (leukemias, lymphomas, sarcomas, carcinomas), immunodeficiencies (AIDS), autoimmune diseases, motor neuron diseases, and acute tissue damage. Some retroviruses are non-pathogenic. Transmission occurs horizontally (blood, saliva, sexual contact) and vertically (embryo infection, milk, perinatal routes). ERVs are vertically transmitted through germline inheritance.
Antigenicity
Retroviral proteins contain type-specific and group-specific antigenic determinants. Envelope glycoproteins have type-specific determinants involved in neutralization. Group-specific determinants are shared within serogroups and sometimes between genera. Weak cross-reactivity exists between genera. T-cell epitopes are found on many structural proteins. Antigenic properties are not used for Retroviridae classification.
Subfamily and Genus Demarcation within Retroviridae
The Retroviridae family is divided into subfamilies and genera based on distinct molecular and biological criteria.
Subfamily Demarcation
The two subfamilies, Orthoretrovirinae and Spumaretrovirinae, are primarily distinguished by amino acid sequence comparisons of conserved regions within the Pol polyprotein, particularly reverse transcriptase (RT) (Jern et al., 2005; Xiong and Eickbush 1990). Other differentiating factors include gene expression strategies (separate Pol transcript in Spumaretrovirinae vs. Gag-Pro-Pol polyprotein in Orthoretrovirinae), differences in Gag protein domains, and the timing of reverse transcription relative to the replication cycle.
Genus Demarcation
Genera within Retroviridae are delineated using phylogenetic analyses based on conserved RT domains. Rooting phylogenetic trees with RT sequences from viruses outside Retroviridae helps resolve genus relationships. Genus distinction also relies on the presence or absence of specific regulatory or accessory proteins, unique Env protein features, and amino acid sequence comparisons of the Env TM subunit. Parallel RT and TM comparisons can identify recombination events between genera. Spumaretrovirus genus classification is partly based on host association, with phylogenetic trees often mirroring host phylogeny, indicating long-term virus-host co-evolution. ERV sequence incorporation in RT phylogenies suggests the potential for additional genera from extant or extinct retroviruses (Gifford et al., 2018; Hayward et al., 2015).
Derivation of Genus and Subfamily Names
The names within Retroviridae reflect their characteristics or origins:
- Alpharetrovirus, Betaretrovirus, Deltaretrovirus, Epsilonretrovirus, Gammaretrovirus: Named using the first, second, fourth, fifth and third letters of the Greek alphabet respectively (alpha, beta, delta, epsilon, gamma).
- Bovispumavirus, Equispumavirus, Felispumavirus, Prosimiispumavirus, Simiispumavirus: Named based on their association with bovine, equine, feline, prosimian, and simian hosts respectively, combined with spumavirus.
- Lentivirus: From Latin lentus, meaning “slow”, referring to the slow disease progression characteristic of these viruses.
- Orthoretrovirinae: From Greek orthos, meaning “straight”.
- Retroviridae: From Latin retro, meaning “backwards”, referring to reverse transcriptase activity (RNA to DNA).
- Spumaretrovirinae: From Latin spuma, meaning “foam”, referring to the “foamy” cytopathic effect in cell culture.
Phylogenetic Relationships within the Retroviridae Family
Phylogenetic analysis strongly supports the division of Retroviridae into two subfamilies, Orthoretrovirinae and Spumaretrovirinae, based on conserved reverse transcriptase domains (Figure 5). Spumaretroviruses share unique features absent in orthoretroviruses. Endogenous retrovirus (ERV) sequences in vertebrate genomes also support this subfamily division.
Figure 5. Retroviridae Phylogeny. This phylogenetic tree illustrates the relationships among selected retroviruses, based on an amino-acid alignment spanning reverse transcriptase and the NTD and CCD domains of integrase (Xiong and Eickbush 1990; Lesbats et al., 2016). The tree was inferred as an unrooted tree using maximum likelihood (PhyML 3.2.2) (Guindon et al., 2010; Guindon and Gascuel 2003) and is displayed with the root placed between the Orthoretrovirinae and Spumaretrovirinae subfamilies for clarity. Numbers at nodes indicate bootstrap support (100 replicates). Colored circles denote genera within each subfamily (5 in Spumaretrovirinae, 6 in Orthoretrovirinae). The phylogenetic tree and sequence alignment are available from the Resources page.
Evolutionary Relationships with Other Taxa
Retroviruses share deep evolutionary relationships with other families in the order Ortervirales, particularly in their reverse transcriptase (RT) protein (Xiong and Eickbush 1990; Krupovic et al., 2018). Families like Caulimoviridae, Metaviridae, Pseudoviridae, and Belpaoviridae also share characteristics with retroviruses, including capsid (CA), nucleocapsid (NC), protease (PR) domains, and tRNA priming (Krupovic et al., 2018). Metaviridae, Pseudoviridae, and Belpaoviridae also encode integrase (IN) and possess long terminal repeats (LTRs), similar to retroviruses, indicating a shared ancestry and evolutionary history within the Ortervirales order.
References
Aiewsakun and Katzourakis 2015
Aiewsakun and Katzourakis 2017
Barnard et al., 2006
Brown 1997
Chuong et al., 2017
Coffin et al., 1997
Gifford 2012
Gifford et al., 2018
Goff 2013
Greenwood et al., 2018
Guindon and Gascuel 2003
Guindon et al., 2010
Hayward et al., 2015
Jern et al., 2005
Johnson 2019
Johnson and Telesnitsky 2010
Krupovic et al., 2018
Kvaratskhelia et al., 2014
Lavialle et al., 2013
Leis et al., 1988
Lesbats et al., 2016
Maeda et al., 2008
Overbaugh et al., 2001
Pornillos and Ganser-Pornillos 2019
Rethwilm 2010
Rosenberg and Jolicoeur 1997
Stoye 2012
Telesnitsky and Goff 1997
Vogt 1997
Xiong and Eickbush 1990