Developing Pattern Recognition and Interpretable Convolutional Neural Network based Frameworks for Identifying Drug resistant and Pan cancer miRNAs from Expression Data

Date

2026-01-14

Publisher

Indian Statistical Institute, Kolkata

Abstract

Micro ribonucleic acids (miRNAs) are short (∼24 nucleotides) non-coding RNAs that are considered key biomarkers in cancer diagnosis and treatment. They play a vital role in distinguishing cancer patients from normal individuals and drug-resistant patients from control patients, where control patients are those who have not received any drug for cancer treatment. The objective is to identify a subset of miRNAs that help classify patients using expression data. The thesis comprises four contributory chapters in addition to an introduction and a conclusion. In the first two contributory chapters (Chapters 2 and 3), computational methods for ranking and selecting miRNAs associated with drug resistance in cancer are introduced. In Chapters 4 and 5, deep learning-based methods are presented for identifying miRNAs for various cancer classes in pan-cancer data. The contributory chapters are as follows: (i) selecting drug-resistant miRNAs in cancer using Euclidean distance with a fold change-based score; (ii) integrating fuzzy rough set-based entropies for identifying drug-resistant miRNAs and classifying cancer patients; (iii) an interpretable convolutional neural network for selecting miRNAs from multiple cancer classes and cancer subtypes through pan-cancer analysis; and (iv) a set-theoretic explainable AI-based attribution score for identifying miRNAs in pan-cancer data. Chapter 1 provides an introduction to the related problems, a literature review, the motivation, and the organization of the thesis. In Chapter 2, two methods to predict the miRNAs associated with drug resistance in cancer are presented. The first method develops a score using the Euclidean distance with weighted fold change (EDWFC); the second introduces histogram-based clustering and Euclidean distance with fold change-based ranking (HCEDFCR).
The EDWFC provides a ranked list of miRNAs for classifying control and drug-resistant patients, and the HCEDFCR returns a group of miRNAs associated with drug resistance. The methods are trained with the help of existing biological knowledge. In Chapter 3, two new z-score-based fuzzy rough relevance and redundancy entropies are developed, and a weighted framework is then introduced to integrate the entropies for ranking and selecting miRNAs. The selected miRNAs are used for classifying the control and drug-resistant patients. In Chapter 4, an interpretable one-dimensional convolutional neural network model (ICNNM) is developed, and its hyperparameters are optimized for identifying classes of patients among multiple cancer classes in pan-cancer data. An attribution score is also introduced using SHapley Additive exPlanations (SHAP) values for interpreting the miRNAs and selecting important miRNAs for each cancer class. In Chapter 5, a multi-objective framework for optimizing the hyperparameters of a 1D CNN, called MOHCNN, and a set-theoretic explainable AI-based attribution score (STEAAS) for miRNA selection are developed. The objectives for optimization are training error, validation error, and the number of trainable parameters. The STEAAS is used for identifying miRNAs in various cancers. The score of a miRNA is represented by an ordered pair, where the first element is the class score of the miRNA and the second element is the reliability score of that miRNA for belonging to the class. The miRNAs with high class scores and high reliability scores in a class are selected. All the developed methods are compared with related miRNA and gene selection techniques and popular classifiers. Data from public repositories such as the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) are used.
The biological significance of the miRNAs selected by the developed methods is established using publicly available web-based bioinformatics tools and existing literature.
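
As an illustration of the ordered-pair selection rule described in the abstract, the following is a minimal Python sketch, not the thesis's STEAAS implementation. The abstract does not state how the class and reliability scores are computed or what counts as "high", so the miRNA names, the score values, the function `select_mirnas`, and both thresholds are hypothetical assumptions chosen only to show the shape of the rule.

```python
from typing import Dict, List, Tuple

# Hypothetical (class_score, reliability_score) pairs for one cancer class.
# The miRNA names and every number here are made up for illustration.
scores: Dict[str, Tuple[float, float]] = {
    "miR-A": (0.91, 0.88),
    "miR-B": (0.85, 0.40),
    "miR-C": (0.30, 0.95),
    "miR-D": (0.78, 0.81),
}


def select_mirnas(
    pairs: Dict[str, Tuple[float, float]],
    class_threshold: float = 0.75,        # assumed cutoff, not from the thesis
    reliability_threshold: float = 0.75,  # assumed cutoff, not from the thesis
) -> List[str]:
    """Keep miRNAs whose class score AND reliability score both meet the cutoffs."""
    kept = [
        name
        for name, (class_score, reliability) in pairs.items()
        if class_score >= class_threshold and reliability >= reliability_threshold
    ]
    # Order the survivors by (class_score, reliability_score), highest first.
    return sorted(kept, key=lambda name: pairs[name], reverse=True)


selected = select_mirnas(scores)
print(selected)
```

With these made-up values, a miRNA with a high class score but low reliability (miR-B) is rejected, as is one with high reliability but a low class score (miR-C); requiring both elements of the pair to be high is the only behaviour taken from the abstract.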

Description

This thesis was carried out under the supervision of Prof. Shubhra Sankar Ray.

Keywords

Pattern Recognition, Explainable AI, miRNA Expression, Drug Resistance, Pan-cancer

Citation

165p.
