A stochastic Markov-blanket framework strategy for microarray data

Date

2020-09-14

Abstract

Microarray technologies make it possible to examine the expression levels of thousands of genes under various experimental conditions, and they have provided a new way to perform biological classification on a genome-wide scale. Predictive accuracy, however, is degraded by the thousands of genes that are noisy or irrelevant from a classification point of view. A key issue in classifying such data is therefore to identify the smallest possible set of genes that still achieves good predictive accuracy. We applied the Stochastic Multiple Markov Blanket (SMMB) algorithm, which combines a stochastic ensemble strategy inspired by random forests with Bayesian Markov blanket-based methods. The classifiers used in this research are K-Nearest Neighbour (KNN), Support Vector Machine (SVM), and Naïve Bayes (NB), applied to five cancer microarray datasets: Cell Lymphomas, Prostate Cancer, Leukemia Cancer, Brain Tumor, and Lung Cancer. The algorithm was run multiple times on these datasets to find a subset of genes that supports statistically meaningful conclusions. The experiments and algorithms were implemented in RStudio. We compared SMMB with the HITON algorithms using both simulated and real datasets.
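
The stochastic ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the SMMB algorithm and not the authors' R implementation: it is written in Python, it draws random feature subsets as an ensemble method would, and it replaces the Markov blanket conditional-independence tests with a simple marginal filter (a scaled difference of class means). The function names `mean_difference_score` and `ensemble_select`, the parameter values, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def mean_difference_score(values, labels):
    """Absolute difference between the two class means, scaled by the value range.

    Assumption: a stand-in for a proper independence test, for binary labels 0/1.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = (max(values) - min(values)) or 1.0
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg)) / spread


def ensemble_select(X, y, n_iter=200, subset_size=3, threshold=0.4, seed=0):
    """Count how often each feature passes the filter across random subsets."""
    rng = random.Random(seed)
    n_features = len(X[0])
    votes = Counter()
    for _ in range(n_iter):
        # Each iteration looks only at a small random subset of features.
        subset = rng.sample(range(n_features), subset_size)
        for j in subset:
            column = [row[j] for row in X]
            if mean_difference_score(column, y) >= threshold:
                votes[j] += 1
    return votes


# Hypothetical toy data: feature 0 tracks the label, the other nine are noise.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(9)] for label in y]

votes = ensemble_select(X, y)
selected = [j for j, _ in votes.most_common(1)]
print(selected)
```

In a fuller pipeline, the features with the most votes would be passed to a classifier such as KNN, SVM, or Naïve Bayes, and accuracy would be estimated on held-out samples.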

Keywords

feature selection, microarray data, Markov blankets, fitness function, cancer classification, support vector machine, K-nearest neighbour, Bayesian network, naïve Bayes, ensemble modelling, gene selection
