Note: this is a rough draft that requires references and will likely be expanded. This is also likely a temporary location.
Evolutionary biologists seem to be largely agreed that at some point there was an RNA World. At this point “life” may have been little more than a strand of RNA, perhaps 50 nucleotides in length, similar to a form of Spiegelman’s Monster mentioned by Manfred Eigen in the late 1990s, possibly relying upon naturally forming mineral compartments to concentrate chemical reactions. Incidentally, we also know that under fairly ordinary conditions in the presence of montmorillonite (which is sometimes found near hydrothermal black smokers and, I would presume, white smokers) RNA will naturally ligate to form linear polyribonucleotides of up to roughly 50 nucleotides in length.
Then, although this seems a little more hazy, something arose possessing a promiscuous reverse transcriptase similar to that of the Mauriceville retroplasmid, which would have made reverse transcription into DNA possible for the first time. At that point, Koonin argues, reverse transcription and transcription would have played nearly equal roles in the life cycle of organisms.
Then it is well known that bacterial group I and group II introns (also written type-I and type-II) act as mobile elements capable of lateral transmission; group II introns in particular are mobile retroelements that encode their own reverse transcriptase. Spliceosomal introns found in eukaryotic genomes aren’t mobile, but are instead dependent upon the spliceosome for their removal from transcripts. However, it is widely agreed that they descended from early group II introns.
Based on sequence analysis of the reverse transcriptase it is also clear that long interspersed nuclear elements (LINEs) are closely related to long terminal repeat (LTR) retrotransposons, which are likewise closely related to retroviruses. But LINEs appear to be much more closely related to group II (type-II) introns, as the sequences of their highly conserved reverse transcriptases are virtually identical. Short interspersed nuclear elements (SINEs) are widely thought to be the result of template-switching during LINE reverse transcription.
Both LINEs and SINEs appear to be largely responsible for the widespread repeats in our genome. Beyond a length of ten or so repetitions of their motif, such repeats become subject to hypermutation, with hypermutation becoming more probable as they increase in length. As such, single-nucleotide (e.g., AAA…) and dinucleotide (e.g., GAGAGA…) repeats may result in a certain tunability of regulatory regions and introns.
Likewise, triplet (trinucleotide) repeats appear to result in a tunability of protein-coding regions. Point mutations break up such repeats when their length becomes more deleterious to the life of an organism than beneficial, the benefit being their ability to produce variation in later generations and consequent adaptability to different environments. Likewise, repeats promote recombination, including intrachromosomal and interchromosomal rearrangements. And while this makes cancer more likely, it lends more plasticity and evolvability to life, and even results in copy number variation (CNV) among genes.
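To make the repeat-length idea a little more concrete, here is a minimal sketch that scans a DNA string for runs of single, double and triple motifs of ten or more copies, the rough threshold mentioned above. The function name `find_tandem_repeats` and its parameters are my own hypothetical choices for illustration, not taken from any particular tool, and real microsatellite finders also handle imperfect repeats, which this does not.

```python
import re

def find_tandem_repeats(seq, max_unit=3, min_copies=10):
    """Return (start, unit, copies) for each perfect tandem repeat.

    Hypothetical helper for illustration only: it looks for motifs of
    1..max_unit bases repeated at least min_copies times in a row.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A motif of unit_len bases followed by at least min_copies - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are just a shorter motif repeated (e.g. "AA", "AAA"),
            # since those runs are already reported at the shorter unit length.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

# Made-up example sequence: an A run, a GA run and a CAG run.
example = "CCGT" + "A" * 12 + "TTC" + "GA" * 10 + "C" + "CAG" * 11 + "T"
for start, unit, copies in find_tandem_repeats(example):
    print(start, unit, copies)
```

Each reported tuple gives the zero-based start position, the motif, and the number of copies; raising or lowering `min_copies` changes which repeats count as long enough to matter.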
However, the general consensus is that LTR-retrotransposons are derived from LINEs, and in fact they would appear to be a chimeric product of a LINE and DNA transposon fusion. And the difference between LTR-retrotransposons and retroviruses is essentially the presence of an envelope (ENV) gene. Actually several lines of evidence point toward this, including similarities in the harpoon-like structure and function of ENV genes shared between highly disparate viruses.
The envelope gene would make possible the lateral transmission of exogenous retroviruses. Furthermore, the decay of the ENV gene would appear to result in the formation of endogenous retroviruses, with approximately 30,000 endogenous retroviruses from approximately 200 different families existing per haploid genome in humans, and more broadly with retroelements being responsible for 49% of our genome. Endogenous retroviruses would then be the result of exogenous retroviruses entering the genome of the germline cells, being passed vertically from generation to generation and taking on a neutral or even symbiotic role as their ENV genes degenerate.
Interestingly, the Major Histocompatibility Complex (MHC), which is central to the adaptive immune system, is dense in retroelements: the remnants of endogenous retroviruses, LINEs and SINEs. And I remember reading at one point that a region of the V(D)J domain that is responsible for the somatic hypermutation that makes it possible for B and T cells to recognize new antigens includes the fusion of a LINE and a SINE. I don’t know much about this, however.
In placental mammals endogenous retroviruses appear to play an integral role by being responsible for creating a barrier to the mother’s immune system. Because the developing embryo is only half related to the mother, her immune system would, in the absence of some agent that suppresses this response, attack it in the same way as it attacks organ transplants. As such, placental mammals wouldn’t have been possible without the presence of such endogenous retroviruses, and mammals would still be laying eggs.
Three such endogenous retroviruses play this role in our own species. However, it is my understanding that the strains of endogenous retroviruses that play this role in different species oftentimes show a great deal of variation. It appears that new strains of exogenous retroviruses become endogenized, then essentially push out the more established, domesticated endogenous retroviruses and take over this role.
Likewise, it appears that both L1s and endogenous retroviruses play an integral role in the regulation of early embryonic development. It is during this time that we see a burst of activity where genes in three endogenous retroviruses become expressed in at least a dozen different organs. And in fact both endogenous retroviruses and L1s appear to play a role during peri-implantation in humans, mice and sheep, and likely in nearly all placental mammals.