Research
High-dimensional datasets frequently arise in areas such as social media analysis, natural language processing, image recognition, and bioinformatics. These datasets may contain tens of thousands of features but only a limited number of samples, sometimes a few hundred or fewer. Such high dimensionality can degrade classifier performance by increasing the risk of overfitting, and it significantly raises computational cost.
Feature selection plays a crucial role in many machine learning applications involving high-dimensional, small-sample data. Identifying the most informative features not only improves predictive performance but also enhances interpretability and supports knowledge discovery across various domains.
Our research focuses on the development of novel feature selection methods as well as the comprehensive evaluation of existing approaches. We assess methods from multiple perspectives, including stability, the ability to correctly identify relevant features, and their impact on classification accuracy.
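As a rough illustration of two of these evaluation perspectives, the sketch below measures selection stability (mean pairwise Jaccard similarity of the feature subsets chosen on random subsamples) and the fraction of truly relevant features recovered. It is not our evaluation protocol or the WkNN-FS method: the synthetic data generator, the toy mean-difference filter, and all function names and parameter values are assumptions made for the example, and the impact on classification accuracy is left out.

```python
import random
from itertools import combinations
from statistics import mean


def make_data(n_samples=60, n_features=200, n_relevant=5, seed=0):
    """Hypothetical two-class data; only the first n_relevant features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so the classes are balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_relevant):
            row[j] += 2.0 * label  # shift relevant features for class 1
        X.append(row)
        y.append(label)
    return X, y


def select_top_k(X, y, k):
    """Toy univariate filter: rank features by absolute difference of class means."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        m0 = mean(row[j] for row, lab in zip(X, y) if lab == 0)
        m1 = mean(row[j] for row, lab in zip(X, y) if lab == 1)
        scores.append(abs(m1 - m0))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two non-empty feature-index sets."""
    return len(a & b) / len(a | b)


def evaluate(X, y, k, relevant, n_runs=10, frac=0.8, seed=1):
    """Return (stability, recall) of the filter over random subsamples."""
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    for _ in range(n_runs):
        idx = rng.sample(range(n), int(frac * n))  # subsample without replacement
        subsets.append(select_top_k([X[i] for i in idx], [y[i] for i in idx], k))
    # Stability: average similarity between every pair of selected subsets.
    stability = mean(jaccard(a, b) for a, b in combinations(subsets, 2))
    # Recall: average share of the known relevant features that were selected.
    recall = mean(len(s & relevant) / len(relevant) for s in subsets)
    return stability, recall


if __name__ == "__main__":
    X, y = make_data()
    stability, recall = evaluate(X, y, k=5, relevant=set(range(5)))
    print(f"stability={stability:.2f}, relevant features recovered={recall:.2f}")
```

A stability value near 1 means the filter picks almost the same features regardless of which samples it sees. On real small-sample data the ground-truth relevant set is usually unknown, which is why recovery of relevant features is typically assessed on synthetic data like this.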
We also invite you to explore our WkNN-FS feature selection approach.