De Novo Annotation Approaches},
+ year = {2011},
+ month = {01},
+ volume = {6},
+ url = {http://dx.doi.org/10.1371%2Fjournal.pone.0016526},
+ pages = {e16526},
+ abstract = {
+ Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species.
+ },
+ number = {1},
+ doi = {10.1371/journal.pone.0016526}
+ }]]>
+ Arabidopsis thaliana Junk DNA Reveals a Continuum between Repetitive Elements and Genomic Dark Matter},
+ year = {2014},
+ month = {04},
+ volume = {9},
+ url = {http://dx.doi.org/10.1371%2Fjournal.pone.0094101},
+ pages = {e94101},
+ abstract = {Eukaryotic genomes contain highly variable amounts of DNA with no apparent function. This so-called junk DNA is composed of two components: repeated and repeat-derived sequences (together referred to as the repeatome), and non-annotated sequences also known as genomic dark matter. Because of their high duplication rates as compared to other genomic features, transposable elements are predominant contributors to the repeatome and the products of their decay is thought to be a major source of genomic dark matter. Determining the origin and composition of junk DNA is thus important to help understanding genome evolution as well as host biology. In this study, we have used a combination of tools enabling to show that the repeatome from the small and reducing A. thaliana genome is significantly larger than previously thought. Furthermore, we present the concepts and results from a series of innovative approaches suggesting that a significant amount of the A. thaliana dark matter is of repetitive origin. As a tentative standard for the community, we propose a deep compendium annotation of the A. thaliana repeatome that may help addressing farther genome evolution as well as transcriptional and epigenetic regulation in this model plant.
},
+ number = {4},
+ doi = {10.1371/journal.pone.0094101}
+ }]]>
+