3/2015 - 5
Application of Machine Learning Algorithms for the Query Performance Prediction
MILICEVIC, M., BARANOVIC, M., ZUBRINIC, K.
Author keywords
machine learning, prediction algorithms, query processing, transaction databases
About this article
Date of Publication: 2015-08-31
Volume 15, Issue 3, Year 2015, On page(s): 33 - 44
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2015.03005
Web of Science Accession Number: 000360171500005
SCOPUS ID: 84940732050
Abstract
This paper analyzes the relationship between system load/throughput and query response time in a real online transaction processing (OLTP) system environment. Although OLTP systems are characterized by short transactions, which normally require high availability and consistently short response times, the need for operational reporting may jeopardize these objectives. We propose a new approach to performance prediction for concurrent database workloads, based on a system state vector consisting of 36 attributes. No attribute is assumed in advance to be more important than another; instead, machine learning methods are used to determine which attributes best describe the behavior of a particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, selected to represent frequent business processes. Accurate response time prediction may serve as a foundation for automated decision-making in database (DB) query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS) management, and real-time dynamic query scheduling (e.g., estimating the optimal moment to execute a complex query).
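The workflow outlined in the abstract (record a system state vector alongside the response time of a reference query, let the data indicate which attributes matter, then predict response time for a new state) can be illustrated with a minimal sketch. The abstract does not name the learning algorithms or the 36 attributes, so everything below is an assumption made for illustration: Pearson correlation stands in for attribute selection, k-nearest-neighbour averaging stands in for the regression model, and the data is synthetic. All function names are hypothetical.

```python
import math
import random

N_ATTRIBUTES = 36  # size of the system state vector, as stated in the abstract


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_attributes(states, times):
    """Return attribute indices ordered by |correlation| with response time, strongest first.

    Assumption: a simple filter-style ranking, not the paper's selection method.
    """
    n_attrs = len(states[0])
    scores = [abs(pearson([s[i] for s in states], times)) for i in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda i: scores[i], reverse=True)


def knn_predict(states, times, query_state, attrs, k=5):
    """Predict response time as the mean over the k training states nearest to query_state.

    Distance is Euclidean, computed only over the selected attribute indices.
    """
    def dist(s):
        return math.sqrt(sum((s[i] - query_state[i]) ** 2 for i in attrs))

    nearest = sorted(range(len(states)), key=lambda j: dist(states[j]))[:k]
    return sum(times[j] for j in nearest) / len(nearest)


def make_synthetic_profile(n_samples, seed=0):
    """Hypothetical learning-phase data: response time depends on attributes 0 and 1 only."""
    rng = random.Random(seed)
    states, times = [], []
    for _ in range(n_samples):
        s = [rng.random() for _ in range(N_ATTRIBUTES)]
        t = 0.1 + 2.0 * s[0] + 1.0 * s[1] + rng.gauss(0, 0.01)
        states.append(s)
        times.append(t)
    return states, times


if __name__ == "__main__":
    states, times = make_synthetic_profile(500)
    selected = rank_attributes(states, times)[:2]
    current_state = [0.5] * N_ATTRIBUTES
    predicted = knn_predict(states, times, current_state, selected)
    print("selected attributes:", selected)
    print("predicted response time:", round(predicted, 3))
```

In a scheduling setting, a prediction of this kind could be compared against a response-time threshold to decide whether to run a complex query now or defer it; that decision rule is likewise an assumption and not something the abstract specifies.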
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.
Permission for other use: The copyright owner's consent does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific written permission must be obtained from the Editor for such copying. Direct linking to files hosted on this website is strictly prohibited.
Disclaimer: Whilst every effort is made by the publishers and editorial board to see that no inaccurate or misleading data, opinions or statements appear in this journal, they wish to make it clear that all information and opinions formulated in the articles, as well as linguistic accuracy, are the sole responsibility of the author.