Adaptive Multiclassification With Lung Cancer Types Using High‐Dimensional Discriminant Analysis and Machine Learning Methods

Publisher

International Journal of Mathematics and Mathematical Sciences

Abstract

This research investigates adaptive multiclassification methods for classifying lung cancer types using both high‐dimensional discriminant analysis (HDDA) and widely used machine learning (ML) approaches under challenging data conditions: high dimensionality, multicollinearity, outliers, and imbalanced classes. The dataset consists of 1000 gene expression variables as explanatory variables and a single response variable with four lung cancer types, which places the problem in the high‐dimensional, imbalanced setting. HDDA introduces a statistically principled parametrization of the covariance matrix tailored to high‐dimensional data, whereas ML methods such as Naïve Bayes, K‐Nearest Neighbors, Support Vector Machine, Artificial Neural Network, and Random Forest offer flexible, data‐driven alternatives. Previous studies have investigated either discriminant analysis or ML separately, but comparative studies that evaluate both side by side under such complex conditions are lacking. This study addresses that gap by systematically analyzing both approaches on balanced and imbalanced gene expression data. Its central hypothesis is that HDDA, despite being a classical statistical technique, can achieve performance comparable or complementary to that of ML methods when applied to gene expression data. Experiments on both the original and the balanced datasets, across varying subsets of explanatory variables, show that data balancing consistently improves accuracy, precision, and recall. Among the ML methods, Random Forest achieves the highest predictive performance on balanced data, while HDDA provides competitive and interpretable results across scenarios.
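
The abstract does not say which balancing technique or metric averaging the study uses. As an illustration only, the sketch below assumes random oversampling (duplicating minority-class rows until every class matches the largest one) and macro-averaged precision and recall. The class labels "A" to "D", the toy data, and the function names `random_oversample` and `macro_precision_recall` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes
    have as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_bal.append(row)
            y_bal.append(label)
    return X_bal, y_bal


def macro_precision_recall(y_true, y_pred):
    """Unweighted mean of per-class precision and recall."""
    labels = sorted(set(y_true))
    precisions, recalls = [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


# Hypothetical toy data: four imbalanced classes, one feature per row.
y = ["A"] * 6 + ["B"] * 3 + ["C"] * 2 + ["D"] * 1
X = [[float(i)] for i in range(len(y))]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y), "->", Counter(y_bal))
```

In a real evaluation, oversampling would normally be applied to the training split only, so that duplicated rows cannot leak into the test data. Macro averaging weights each class equally, which keeps a rare class from being hidden by the majority classes.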
