Faster Markov blanket with Tabu search for efficient feature selection of microarray cancer datasets

Date
2019-12-23
Abstract
In medical science, and particularly for diseases with a genetic cause, accurate classification of genes is a prerequisite for prescribing treatment. For cancer, genes must be classified according to the characteristics that influence the disease. Feature selection methods are recognized as important in this domain, and they matter most when a dataset contains a large number of variables (genes). This thesis combines the Tabu search technique with the Markov blanket algorithm for feature selection and classification of microarray gene expression data. The HITON implementation of the Markov blanket algorithm was implemented and compared with the HITON plus Tabu search method on microarray gene expression datasets. We propose HITON with Tabu search as a feature selection algorithm that obtains high classification performance on high-dimensional microarray cancer datasets. Applying Tabu search with HITON gave higher accuracy than HITON alone when tested with three classifiers: KNN, SVM, and NN. On the Prostate dataset, HITON + Tabu with SVM achieved 99.40% classification accuracy, whereas HITON with SVM gave 98.04%, an increase of 1.36 percentage points. On the Leukemia dataset, HITON + Tabu with KNN achieved 99.83% classification accuracy, whereas HITON with KNN gave 98.81%, an increase of 1.02 percentage points. On the Lung Cancer dataset, HITON + Tabu with SVM achieved 92.36% classification accuracy, whereas HITON with SVM gave 89%, an increase of 3.36 percentage points. In addition, the proposed algorithm can be generalized to solve various other optimization problems.
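The abstract does not spell out how Tabu search is coupled to HITON, so the following is only a minimal sketch of a generic Tabu search over feature subsets, not the thesis's implementation. It assumes the candidate features would come from a Markov blanket step such as HITON and that the score would be a classifier's cross-validated accuracy. The names `tabu_search`, `toy_score`, `tenure`, and `n_iter` are hypothetical, and `toy_score` is a stand-in so the example runs without any dataset.

```python
import random
from collections import deque


def tabu_search(candidates, score, n_iter=50, tenure=5, seed=0):
    """Search subsets of `candidates` for a high `score` using single-feature flips."""
    rng = random.Random(seed)
    candidates = list(candidates)
    # Start from a random subset; fall back to the first feature if it is empty.
    current = frozenset(f for f in candidates if rng.random() < 0.5)
    current = current or frozenset(candidates[:1])
    best, best_score = current, score(current)
    tabu = deque(maxlen=tenure)  # features flipped in the last `tenure` moves

    for _ in range(n_iter):
        moves = []
        for f in candidates:
            neighbour = current ^ {f}  # add f if absent, drop it if present
            if not neighbour:
                continue  # never evaluate the empty subset
            s = score(neighbour)
            # Aspiration: a tabu move is still allowed if it beats the best so far.
            if f not in tabu or s > best_score:
                moves.append((s, f, neighbour))
        if not moves:
            break
        # Take the best admissible move even if it lowers the score,
        # which is what lets the search leave local optima.
        s, f, current = max(moves, key=lambda m: m[0])
        tabu.append(f)
        if s > best_score:
            best, best_score = current, s
    return best, best_score


# Stand-in score (assumption): rewards three "informative" features and
# lightly penalises the rest. A real run would use classifier accuracy.
INFORMATIVE = frozenset({2, 5, 7})


def toy_score(subset):
    return len(subset & INFORMATIVE) - 0.1 * len(subset - INFORMATIVE)


best, best_score = tabu_search(range(10), toy_score)
print(sorted(best), best_score)
```

In a setup like the one the abstract describes, `score` would be replaced by a function that trains KNN, SVM, or NN on the selected genes and returns held-out accuracy; the tabu list then keeps the search from immediately undoing a recent inclusion or exclusion.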
Keywords
Feature Selection, Microarray Data, Markov Blankets, Wrapper Methods, HITON, Tabu Search, Fitness Function, Crossover, Mutation, Cancer Classification, Support Vector Machine, Neural Network, Gene Selection