Diagnosis of Obesity Level based on Bagging Ensemble Classifier and Feature Selection Methods

Asaad Alzayed, Waheeda Almayyan and Ahmed Al-Hunaiyyan

doi:10.5121/ijaia.2022.13203

Volume 13, Number 2

Authors

Asaad Alzayed, Waheeda Almayyan and Ahmed Al-Hunaiyyan, College of Business Studies, PAAET, Kuwait

Abstract

In the current era, the amount of data generated from various device sources and business transactions is rising exponentially, and current machine learning techniques are not feasible for handling such a massive volume of data. Two commonly adopted schemes exist to address this issue: scaling up the data mining algorithms and data reduction. Scaling up the data mining algorithms is not the best way, but data reduction is feasible. There are two approaches to reducing a dataset: selecting an optimal subset of features from the initial dataset, or eliminating the features that contribute the least information. Overweight and obesity are increasing worldwide, and forecasting future overweight or obesity could help guide intervention. Our primary objective is to find the optimal subset of features for diagnosing obesity. This article proposes adapting a bagging algorithm based on filter-based feature selection to improve the prediction accuracy of obesity with a minimal feature subset. We used several machine learning algorithms to classify the obesity classes and several filter feature selection methods to maximize classifier accuracy. Based on the experimental results, the Pairwise Consistency and Pairwise Correlation techniques are shown to be promising tools for feature selection with respect to the quality of the obtained feature subset and computational efficiency. Analysis of the results obtained from the original and modified datasets shows improved classification accuracy and establishes a relationship between obesity/overweight and common risk factors such as weight, age, and physical activity patterns.

Keywords

Data mining, Obesity, Feature reduction, Filter feature selection, Bagging algorithm.
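
To make the pipeline summarized in the abstract concrete, the following is a minimal sketch of filter-based feature selection followed by bagging. It is not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic three-class data, the use of absolute Pearson correlation with the class label as the filter score (a simple stand-in, not the Pairwise Consistency or Pairwise Correlation methods named in the abstract), the nearest-centroid base learner, and all function names.

```python
import random
from collections import Counter
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def select_features(X, y, k):
    """Filter step (illustrative): keep the k columns whose absolute
    correlation with the numeric class label is highest."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fit_centroids(X, y):
    """Base learner (assumed): one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {lab: [mean(col) for col in zip(*rows)] for lab, rows in groups.items()}


def predict_centroids(centroids, row):
    """Return the label of the closest class centroid (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: train each base learner on a bootstrap sample
    (n rows drawn with replacement from the n training rows)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroids(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic stand-in data: columns 0 and 2 depend on the class, column 1 is noise.
data_rng = random.Random(42)
X, y = [], []
for label in (0, 1, 2):
    for _ in range(30):
        X.append([label * 10 + data_rng.gauss(0, 1),
                  data_rng.gauss(0, 1),
                  -label * 5 + data_rng.gauss(0, 1)])
        y.append(label)

selected = select_features(X, y, k=2)            # indices of retained columns
X_reduced = [[row[j] for j in selected] for row in X]
models = fit_bagging(X_reduced, y)
predictions = [predict_bagging(models, row) for row in X_reduced]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print("selected columns:", selected, "training accuracy:", accuracy)
```

The sketch scores accuracy on the training rows only to keep it short; a real evaluation would use held-out data or cross-validation, and the filter step would be fitted on the training portion alone to avoid leaking label information.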