Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case

Bhattacharjee, Biplab, Unni, Kavya and Pratap, Maheshwar (2024) Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case. Journal of Systems and Information Technology. ISSN 1328-7265 (In Press)

10-1108_JSIT-06-2020-0120.pdf - Published Version
Restricted to Repository staff only

Abstract

Purpose
Product returns are a major challenge for e-businesses because they involve substantial logistical and operational costs, so predicting returns in advance is crucial. This study aims to evaluate different types of classifiers for predicting the likelihood of product return and to further optimize the best-performing model.

Design/methodology/approach
An e-commerce data set with categorical attributes is used for this study. Chi-square-based feature selection yields a reduced feature set that serves as input for model building. Predictive models are built using individual classifiers, ensemble models and deep neural networks. For performance evaluation, a 75:25 train/test split and 10-fold cross-validation are used. To improve the predictive performance of the best-performing classifier, hyperparameter tuning is performed using several optimization methods: random search, grid search, the Bayesian approach and evolutionary methods (genetic algorithm, differential evolution and particle swarm optimization).
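The chi-square feature scoring mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the helper `chi_square_score`, the attribute names `payment` and `colour`, and the toy values are all hypothetical, and the record does not say which implementation the study used. The sketch computes Pearson's chi-square statistic between one categorical attribute and the return label, then ranks attributes by that score.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between a categorical feature and a class label.

    Hypothetical helper for illustration only.
    """
    n = len(feature)
    observed = Counter(zip(feature, labels))  # counts per (category, class) cell
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    stat = 0.0
    for category, category_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Expected cell count if feature and label were independent
            expected = category_count * label_count / n
            stat += (observed[(category, label)] - expected) ** 2 / expected
    return stat


# Made-up toy data (assumption): 1 = order returned, 0 = order kept
returned = [1, 1, 1, 0, 0, 0, 1, 0]
payment = ["cod", "cod", "cod", "card", "card", "card", "cod", "card"]
colour = ["red", "blue", "red", "blue", "red", "blue", "blue", "red"]

scores = {
    "payment": chi_square_score(payment, returned),
    "colour": chi_square_score(colour, returned),
}
# Higher score = stronger association with the label
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

In this toy case `payment` is perfectly associated with the label and `colour` is independent of it, so a score-based selection would keep `payment` first. How many attributes to keep is a separate choice that the abstract does not specify.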

Findings
A comparison of F1-scores revealed that the Bayesian approach outperformed all other optimization approaches. The predictive performance of the Bayesian-optimized model is further compared experimentally with that of other classifiers. The Bayesian-optimized XGBoost model achieved superior performance, with accuracies of 77.80% and 70.35% for the holdout and 10-fold cross-validation methods, respectively.
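Because the findings cite both F1-scores and accuracies, a short sketch may help show how the two metrics are computed and why they can differ. The labels and predictions below are made up for illustration and are unrelated to the study's data; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up example (assumption): 3 returned orders out of 10, only 1 detected
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
f1_value = f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1_value:.2f}")
```

Here the classifier misses two of the three returns, so F1 for the returned class is much lower than accuracy. This is one reason a study might compare tuning methods on F1 while also reporting accuracy.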

Research limitations/implications
Given the anonymized data, the effects of individual attributes on outcomes could not be investigated in detail. The Bayesian-optimized predictive model may be used in decision support systems, enabling real-time prediction of returns and the implementation of preventive measures.

Originality/value
There are very few reported studies on predicting the chance of order return in e-businesses. To the best of the authors’ knowledge, this study is the first to compare different optimization methods and classifiers, demonstrating the superiority of the Bayesian-optimized XGBoost classification model for returns prediction.

Item Type:	Article
Keywords:	Machine learning | Classification model | XGBoost | Product return | Bayesian optimisation
Subjects:	Social Sciences and humanities > Business, Management and Accounting > Business and International Management; Social Sciences and humanities > Business, Management and Accounting > Marketing; Social Sciences and humanities > Social Sciences > Social Sciences (General)
JGU School/Centre:	Jindal Global Business School
Depositing User:	Mr. Abid Fakhre Alam
Date Deposited:	05 Sep 2024 16:21
Last Modified:	05 Sep 2024 16:21
Official URL:	https://doi.org/10.1108/JSIT-06-2020-0120
URI:	https://pure.jgu.edu.in/id/eprint/8426