1/2025 - 8
An Improved Multi-Imputation Technique Based on Chained Equations and Decision Trees: Application to Wind Energy Conversion Systems
JAFFEL, I.
Author keywords
data preprocessing, decision trees, multidimensional signal processing, statistical analysis, wind energy
About this article
Date of Publication: 2025-02-28
Volume 25, Issue 1, Year 2025, On page(s): 71 - 78
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2025.01008
SCOPUS ID: 86000349532
Abstract
Missing data (MD) is a prevalent issue for researchers and data scientists. It can significantly degrade the quality of the analyzed data, and with it the relevance of the interpreted results and of the conclusions drawn from them. To address this challenge, a novel multi-imputation technique, MICE-DT, is proposed that combines Multivariate Imputation by Chained Equations (MICE) with Decision Trees (DT). The method was evaluated against several established imputation techniques, namely K-Nearest Neighbors (KNN), K-Means clustering, DT, and MICE, under the Missing at Random (MAR) assumption. The performance of the MICE-DT algorithm and the comparative analysis of the studied techniques were demonstrated on a Wind Energy Conversion System (WEC), yielding satisfactory results.
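The abstract names the ingredients (chained equations, a decision tree as the per-variable model, MAR missingness) but not the algorithm itself. The sketch below is therefore only an illustration of the general idea, not the authors' MICE-DT method. It performs a single chained-equations pass rather than multiple imputations, and every name in it (`fit_tree`, `predict_tree`, `mice_dt`, `demo`, `rmse`), the synthetic three-column data, and the missingness rates are assumptions made for the example.

```python
import random
from statistics import fmean


def fit_tree(rows, targets, depth=0, max_depth=3, min_leaf=3):
    """Fit a small regression tree; a node is (feature, threshold, left, right) or a float leaf."""
    if depth >= max_depth or len(set(targets)) == 1:
        return fmean(targets)
    best = None  # (sum of squared errors, feature index, threshold)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = fmean(left), fmean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, j, t)
    if best is None:  # no split leaves enough samples on both sides
        return fmean(targets)
    _, j, t = best
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth, min_leaf),
            fit_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth, min_leaf))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's mean."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node


def without(row, j):
    """Predictor vector: every column of the row except column j."""
    return row[:j] + row[j + 1:]


def mice_dt(data, n_iter=5, max_depth=3):
    """Chained-equations imputation with one regression tree per incomplete column.

    `data` is a list of rows; None marks a missing entry. Each column is assumed
    to have at least one observed value. Returns a completed copy.
    """
    n_cols = len(data[0])
    missing = [[v is None for v in row] for row in data]
    # Initialise each gap with the mean of the observed values in its column.
    means = [fmean(r[j] for r in data if r[j] is not None) for j in range(n_cols)]
    filled = [[means[j] if v is None else v for j, v in enumerate(row)] for row in data]
    for _ in range(n_iter):
        for j in range(n_cols):
            gaps = [i for i in range(len(data)) if missing[i][j]]
            if not gaps:
                continue
            obs = [i for i in range(len(data)) if not missing[i][j]]
            # Train where column j was observed, using the current fills of the other columns.
            tree = fit_tree([without(filled[i], j) for i in obs],
                            [filled[i][j] for i in obs], max_depth=max_depth)
            for i in gaps:
                filled[i][j] = predict_tree(tree, without(filled[i], j))
    return filled


def demo(seed=0, n=200):
    """Synthetic data (an assumption): x0 is always observed and drives the gaps in x1, x2 (MAR)."""
    rng = random.Random(seed)
    truth = []
    for _ in range(n):
        x0 = rng.uniform(0, 10)
        truth.append([x0, 2 * x0 + rng.gauss(0, 0.5), -x0 + rng.gauss(0, 0.5)])
    data = [row[:] for row in truth]
    for row in data:
        p = 0.4 if row[0] > 5 else 0.1  # missingness depends only on the observed x0
        for j in (1, 2):
            if rng.random() < p:
                row[j] = None
    return truth, data, mice_dt(data)


def rmse(truth, estimate, data):
    """Root-mean-square error over the entries that were missing in `data`."""
    errs = [(truth[i][j] - estimate[i][j]) ** 2
            for i, row in enumerate(data) for j, v in enumerate(row) if v is None]
    return (sum(errs) / len(errs)) ** 0.5


if __name__ == "__main__":
    truth, data, imputed = demo()
    print("RMSE on imputed entries:", round(rmse(truth, imputed, data), 3))
```

Calling `mice_dt(data, n_iter=0)` returns the plain column-mean fill, which gives a simple baseline to compare against. A multiple-imputation variant would repeat the procedure on several randomised runs (for example, bootstrap resamples of the training rows) and pool the results; that step is omitted here.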
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.