An Optimized Feature Selection Method for High Dimensional Data

J. Priyadharshini; C. Kanimozhi

Authors

J. Priyadharshini, PG Scholar, Department of Computer Science and Engineering, Anna University, BIT-Campus, Tiruchirappalli, India
C. Kanimozhi, Assistant Professor, Department of Computer Science and Engineering, Anna University, BIT-Campus, Tiruchirappalli, India

Keywords:

ACO, Accuracy, Feature selection, GA, Subset

Abstract

High dimensional datasets consist of a large number of both relevant and irrelevant features, which increases the computational and prediction time needed to process them. Feature selection (FS) extracts the most relevant features, known as subsets, for prediction, so that the computational time can be reduced. The dataset used here is taken from the National Center for Biotechnology Information (NCBI) and is a widely used microarray gene expression benchmark for feature selection. In gene expression data analysis, the problems of cancer classification and gene selection are closely related: selecting informative genes is essential for classification performance. However, a high dimensional dataset causes high computational cost and overfitting during classification, so it is necessary to reduce the dimension of the data by feature selection. In this paper, a mean based Genetic Algorithm (GA) is proposed to select the optimal subsets from the raw dataset based on the mean value of the features. The accuracy of each subset is evaluated using a Support Vector Machine (SVM) classifier, and the selection reduces the complexity of the model in terms of computational cost and size. The proposed method is compared with the ant colony optimization (ACO) algorithm, and the results show that the proposed method achieves a better accuracy rate.
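The abstract does not spell out how the mean value of the features enters the GA, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that the initial population is seeded with the features whose mean is at or above the average feature mean, and that each subset is scored by classifier accuracy minus a small size penalty. To stay dependency-free, a leave-one-out nearest-centroid classifier stands in for the SVM; the names `feature_means`, `centroid_accuracy`, `mean_seeded_ga`, and all parameter values are hypothetical.

```python
import random

def feature_means(X):
    """Mean value of every feature (column) in the dataset X."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the
    features selected by mask. This is a stand-in for the SVM named in
    the abstract, chosen only to keep the sketch free of dependencies."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += row[j]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label):
            # Squared distance from sample i to the class centroid.
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=sq_dist) == y[i]
    return correct / len(X)

def mean_seeded_ga(X, y, fitness, pop_size=20, generations=30,
                   mutation_rate=0.05, size_penalty=0.01, seed=0):
    """GA over boolean feature masks. Assumption: the population is seeded
    from the features whose mean is at or above the average feature mean.
    Requires at least 2 features and pop_size >= 4."""
    rng = random.Random(seed)
    means = feature_means(X)
    threshold = sum(means) / len(means)
    seed_mask = [m >= threshold for m in means]

    def mutate(mask):
        # Flip each bit independently with probability mutation_rate.
        return [(not b) if rng.random() < mutation_rate else b for b in mask]

    def score(mask):
        # Accuracy minus a small penalty that favours smaller subsets.
        return fitness(X, y, mask) - size_penalty * sum(mask) / len(mask)

    population = [seed_mask] + [mutate(seed_mask) for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))  # single-point crossover
            children.append(mutate(a[:cut] + b[cut:]))
        population = parents + children
    return max(population, key=score)
```

Calling `mean_seeded_ga(X, y, centroid_accuracy)` with `X` as a list of numeric rows and `y` as the class labels returns a boolean mask over the features. Because `fitness` is passed in as a function, an SVM-based scorer with the same `(X, y, mask)` signature could replace the stand-in to match the evaluation step the abstract describes.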
