Volume 16, Number 1
"Predictive Modelling of Air Quality Index (AQI) Across Diverse Cities and States of India using Machine Learning: Investigating the Influence of Punjab's Stubble Burning on AQI Variability"
Authors
Kamaljeet Kaur Sidhu¹, Habeeb Balogun¹ and Dr. Kazeem Oluwakemi Oseni²; ¹University of Westminster, United Kingdom; ²University of Bedfordshire, United Kingdom
Abstract
Air pollution is a widespread and serious problem that cannot be ignored because of its harmful effects on human health. To address it proactively, people need to be aware of the quality of the environment in which they live. With this motivation, this research predicts the Air Quality Index (AQI) from the concentrations of different air pollutants in the atmosphere. The dataset was taken from the official website of the Central Pollution Control Board (CPCB) and contains air pollutant concentrations from 22 monitoring stations in different cities of Delhi, Haryana, and Punjab. The data is checked for null values and outliers; the key point is that such values must be correctly understood and imputed rather than ignored or imputed incorrectly. Because the data is a time series, it is tested for stationarity using the Dickey-Fuller test. Several machine learning models (CatBoost, XGBoost, Random Forest, and an SVM regressor), the time series model SARIMAX, and the deep learning model LSTM are then used to predict AQI. The models are evaluated using MSE, RMSE, MAE, and R². Random Forest performed better than the other models.
Keywords
Air Quality Index, Stubble Burning, Air Pollution, Random Forest Regression, SARIMAX, RMSE, MAE, Time Interpolation, Dickey-Fuller Test, Mean Absolute Percentage Error, Imputation, Missing Values, and Outliers
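As an illustration of the evaluation metrics named in the abstract (MSE, RMSE, MAE, and R²), the following is a minimal sketch in plain Python. The function name `regression_metrics` and the AQI values are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code, and the authors may well have used a library implementation instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error and its square root
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Mean absolute error
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Made-up observed and predicted AQI values (assumption, not real data)
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

metrics = regression_metrics(observed, predicted)
print(metrics)
```

MSE, RMSE, and MAE are reported in units tied to the AQI scale, so lower is better, while R² measures the share of variance in the observed AQI that the predictions explain, so values closer to 1 are better.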