CMF-AGAwER

This repository implements the paper "Feature selection using a classification error impurity algorithm and an adaptive genetic algorithm improved with an external repository". The proposed feature selection method (CMF-AGAwER) is a hybrid of an ensemble stage and a wrapper stage. The ensemble stage combines the top 50 features ranked by Classification Error Impurity (CEI), a newly proposed frequency-based method, with the top 50 features ranked by Mutual Information (MI) and the top 50 features ranked by Fisher Ratio (FR). The wrapper stage uses the Adaptive Genetic Algorithm with External Repository (AGAwER). AGAwER's main contribution is its external repository, coupled with an adaptive strategy that adjusts the crossover and mutation rates to improve the exploration capability of the genetic algorithm (GA).
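
As a rough illustration of the ensemble stage, the sketch below takes three score vectors, keeps the 50 best-ranked feature indices from each, and merges them into one sorted list of unique indices. The random scores and the helper top_k are placeholders introduced here, not code from CMF-AGAwER.py; the sort direction is left as a parameter because it depends on how each ranker stores its scores.

```python
import numpy as np

def top_k(scores, k=50, descending=True):
    """Return the indices of the k best-ranked features (hypothetical helper)."""
    order = np.argsort(scores)
    if descending:
        order = order[::-1]
    return order[:k]

# Placeholder scores; in practice these come from the CEI, MI and FR rankers.
rng = np.random.default_rng(0)
n_features = 2000
cei_scores = rng.random(n_features)
mi_scores = rng.random(n_features)
fr_scores = rng.random(n_features)

top_CEI = top_k(cei_scores)
top_MI = top_k(mi_scores)
top_FR = top_k(fr_scores)

# np.unique drops features chosen by more than one ranker and sorts the indices.
features = np.unique(np.concatenate((top_CEI, top_FR, top_MI)))
print(len(features))  # between 50 and 150, depending on the overlap
```

Because the three rankers may agree on some features, the merged list can be shorter than 150 entries.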

Classification Error Impurity (CEI) is a frequency-based filter ranking method, which belongs to a series of frequency-based rankers:

1- Mutual Congestion (MC). Publication year: 2019

2- Sorted Label Interference (SLI). Publication year: 2021

3- Sorted Label Interference-gamma (SLI-gamma). Publication year: 2022

4- Extended Mutual Congestion (EMC). Publication year: 2022

5- Maximum Pattern Recognition (MPR). Publication year: 2024

6- Distance-based Mutual Congestion (DMC). Publication year: 2024 (in preprint)

7- Classification Error Impurity (CEI). Publication year: 2024

##################### Instructions #########################

After loading the corresponding dataset from your local drive:

1- Run lines 63-79 to calculate the summation of samples per label

2- Run lines 130-158 to define the Classification Error Impurity (CEI) function

Steps 3 and 4 below must be repeated for each of CEI, MI, and FR. Then concatenate the resulting lists of top-ranked features using features = np.unique(np.concatenate((top_CEI, top_FR, top_MI))). For MI and FR, make sure you use alpha.argsort() instead of alpha.argsort()[::-1]. Alternatively, take the features.npy file for your dataset directly from the 'The top 50 features...' folder, which makes steps 3 and 4 unnecessary. In step 7, this features.npy is used to create the search space of AGAwER.

3- Run lines 163-175 to compute alpha.npy, which assigns a weight to each feature based on CEI

4- Run lines 182-185 to sort the features of the dataset by the corresponding alpha

5- Run lines 200-228 to evaluate the metrics with 5-fold stratified cross-validation before applying CMF-AGAwER

6- Run lines 268-509 for the corresponding functions of AGAwER

7- Run line 513 and create the search space of AGAwER using the corresponding features.npy

8- Run lines 522-925 for the main body of AGAwER
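
To make steps 5 and 7 more concrete, here is a minimal NumPy-only sketch of scoring one candidate feature subset with 5-fold stratified cross-validation. Everything in it is an assumption made for illustration: the function names, the nearest-centroid classifier, the toy data, and the use of a boolean mask over the candidate features. The classifier and metrics actually used in CMF-AGAwER.py may differ.

```python
import numpy as np

def stratified_folds(y, n_splits=5, seed=0):
    """Assign a fold id to every sample, spreading each class evenly over folds."""
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        fold_of[idx] = np.arange(len(idx)) % n_splits
    return fold_of

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier: predict the class whose mean vector is closest."""
    labels = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    d = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[d.argmin(axis=1)]

def cv_accuracy(X, y, mask, n_splits=5, seed=0):
    """Mean accuracy over stratified folds, using only the columns where mask is True."""
    if not mask.any():
        return 0.0  # an empty subset cannot classify anything
    Xs = X[:, mask]
    fold_of = stratified_folds(y, n_splits, seed)
    scores = []
    for k in range(n_splits):
        test = fold_of == k
        pred = nearest_centroid_predict(Xs[~test], y[~test], Xs[test])
        scores.append((pred == y[test]).mean())
    return float(np.mean(scores))

# Toy data: 50 samples, 20 features, only feature 0 is informative.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 25)
X = rng.normal(size=(50, 20))
X[:, 0] += 6.0 * y

features = np.arange(20)                 # stand-in for np.load("features.npy")
mask = np.zeros(len(features), dtype=bool)
mask[0] = True                           # candidate subset: first candidate only
score = cv_accuracy(X[:, features], y, mask)
print(score)
```

A wrapper such as a genetic algorithm would call a function like cv_accuracy once per candidate mask, so the search space has one bit per entry of features.npy.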

The code after line 928 computes the average prediction accuracy of the existing filters compared in the paper, such as SHAP, Random Forest, ReliefF, etc., using the top N features (where N = 10, 20, 30, 40, 50). It also generates the illustrations and figures for the paper.
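
The top-N comparison can be pictured with the following sketch, which measures accuracy on the first N features of a ranking for N = 10, 20, 30, 40, 50. The leave-one-out 1-nearest-neighbour evaluation, the toy ranking by class-mean difference, and the function names are all assumptions chosen to keep the example short; they are not the protocol or the filters used in the paper.

```python
import numpy as np

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (stand-in metric)."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float((y[d.argmin(axis=1)] == y).mean())

def accuracy_by_top_n(X, y, ranking, sizes=(10, 20, 30, 40, 50)):
    """Accuracy obtained with the first n features of the ranking, for each n."""
    return {n: loo_1nn_accuracy(X[:, ranking[:n]], y) for n in sizes}

# Toy data: 60 samples, 100 features, the first 5 features separate the classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 100))
X[:, :5] += 3.0 * y[:, None]

# Toy filter: rank features by the absolute difference of the class means.
score = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
ranking = score.argsort()[::-1]

results = accuracy_by_top_n(X, y, ranking)
for n, acc in results.items():
    print(n, round(acc, 3))
```

Swapping in another filter only requires replacing the ranking array; the loop over N stays the same.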

Cite this article

Hossein Nematzadeh, José García-Nieto, Ismael Navas-Delgado, José F. Aldana-Montes, Feature selection using a classification error impurity algorithm and an adaptive genetic algorithm improved with an external repository, Knowledge-Based Systems, Volume 301, 2024, 112345. https://doi.org/10.1016/j.knosys.2024.112345.

