Thesis

Computational Intelligence in the estimation of CRISPR-Cas9 cleavage sites


2019
Sanjeev Kumar
ICAR-Indian Agricultural Statistics Research Institute, ICAR-Indian Agricultural Research Institute, New Delhi
M.Sc

The CRISPR-Cas9 system is one of the most widely used genome editing techniques in recent times. Despite its high potential to modify specific target genes and regions of the genome that are complementary to the designed guide RNA (sgRNA), it still suffers from off-target effects. In this study, an attempt has been made to develop models based on three machine learning techniques (Artificial Neural Network, Support Vector Machine and Random Forest) for estimating the CRISPR-Cas9 cleavage sites that will be cleaved by a given sgRNA. All of these models were developed exclusively on plant data. The models were trained on 70 percent of the collected on-target and off-target dataset from different plant species, and their performance was evaluated on the remaining 30 percent using the following statistics: specificity, sensitivity, accuracy, precision and AUC. Altogether, eleven models were trained using the above machine learning techniques. Comparative evaluation of these models shows that the model based on the Random Forest technique performs best; its area under the ROC curve (AUC) was found to be 99.0%. In total, six models based on the ANN technique (ANN1-Logistic, ANN1-Tanh, ANN1-ReLU, ANN2-Logistic, ANN2-Tanh and ANN2-ReLU) and four SVM models (SVM-Linear, SVM-Polynomial, SVM-Gaussian and SVM-Sigmoid) were trained. ANN1-ReLU and SVM-Linear were found to be the best performing among the ANN-based and SVM-based models, respectively. The best-performing models were compared with another available off-target prediction tool (CRISTA) exclusively on the plant dataset, and our models were found to outperform it.

Keywords: CRISPR, Cas9, sgRNA, genome editing, off-target, Artificial Neural Network, Support Vector Machine, Random Forest, CRISTA.
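The evaluation protocol described in the abstract (a 70/30 split followed by specificity, sensitivity, accuracy, precision and AUC) can be illustrated with a minimal sketch. This is not the thesis code: the function names `split_70_30`, `classification_metrics` and `auc` are hypothetical, the labels are assumed to be 1 for a cleaved site and 0 for a non-cleaved site, and the AUC is computed with the standard pairwise-ranking definition, counting ties as one half.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle the records and split them into 70% training and 30% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Compute the four threshold-based statistics from binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": _ratio(tp, tp + fp),
    }


def auc(y_true, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example receives the higher score (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In use, a model would be fitted on the first part returned by `split_70_30`, and its predicted labels and scores on the second part would be passed to `classification_metrics` and `auc`. The pairwise AUC is quadratic in the number of examples, which is acceptable for a sketch but not for large datasets.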

T-10260
