Date of Award
Master of Science
Genetics, Development and Cell Biology
Bioinformatics and Computational Biology
Erik W. Vollbrecht
Transposons can integrate into new positions in the genome and thereby disrupt a gene's function, which has made them useful tools for genome mutagenesis. Improving the efficiency of such applications requires elucidating the patterns and preferences of insertion site selection. Here we focus on understanding target site selection of the Ac/Ds transposon, one of the best-characterized transposon systems in plants, by exploring various DNA features and predicting insertion sites.
A package named DnaFVP (DNA Feature Calculation, Visualization and Vector Preparation) was first developed for the calculation, visualization and analysis of various DNA features, including nucleotide sequence features and a broad list of structural/physical properties. In addition, the package supports data preparation prior to feature calculation and the preparation of feature vectors for machine learning. It is intended as the basis of a semi-automatic pipeline for exploring various DNA features of any collection of genomic DNA sequences of interest and for preparing feature vectors for subsequent machine learning.
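As an illustration of what preparing a feature vector from a DNA sequence can involve, the following minimal Python sketch combines nucleotide composition features with one averaged dinucleotide property. It is not the DnaFVP interface: the function names (`kmer_frequencies`, `mean_dinucleotide_property`, `feature_vector`) and the `scale` argument are hypothetical, and any structural property table must be supplied by the user from a published source.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k k-mers over A/C/G/T, in lexicographic order.

    Windows containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def mean_dinucleotide_property(seq, scale):
    """Average a per-dinucleotide property over all overlapping steps in seq.

    `scale` maps dinucleotides to numeric values (assumed to be supplied by
    the caller); steps missing from `scale` are skipped.
    """
    seq = seq.upper()
    steps = (seq[i:i + 2] for i in range(len(seq) - 1))
    values = [scale[s] for s in steps if s in scale]
    return sum(values) / len(values) if values else 0.0


def feature_vector(seq, scale=None):
    """Concatenate mono- and dinucleotide frequencies (4 + 16 values),
    optionally followed by one averaged structural property."""
    vec = kmer_frequencies(seq, 1) + kmer_frequencies(seq, 2)
    if scale is not None:
        vec.append(mean_dinucleotide_property(seq, scale))
    return vec
```

Calling `feature_vector(window)` on a fixed-length window around each candidate coordinate would yield one row of a training matrix; the choice of window length and of which properties to include is left open here.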
By combining nucleotide and structural features computed with the DnaFVP package, we prepared various feature vectors for machine learning and predicted Ds insertion sites. Training datasets included well-evidenced Ds insertion events (1605 events in maize and 2078 events in Arabidopsis) as positive datasets and 2000 randomly sampled genomic coordinates in genic regions from maize and Arabidopsis as negative datasets. We achieved an area under the ROC (Receiver Operating Characteristic) curve of 0.77 in maize, 0.85 in Arabidopsis, and 0.82 in a combined maize and Arabidopsis dataset. An initial test on one maize dataset shows interesting results. Our predictions may provide further insight into the Ac/Ds transposition mechanism and facilitate targeted mutagenesis and gene delivery mediated by transposons.
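The ROC figures reported above are most naturally read as areas under the ROC curve (AUC). Assuming that reading, the sketch below shows how such a value can be computed from classifier scores for positive (insertion) and negative (random genic) sites. The function name `roc_auc` is hypothetical, and this is a plain pairwise implementation, not the procedure used in the thesis.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    `labels` holds 1 for positive sites and 0 for negative sites.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of insertion sites from background sites.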
Kuang, Xianyan, "Computational prediction of Ds transposon insertion sites in plants using DNA structural features" (2011). Graduate Theses and Dissertations. 10452.