Deep Reinforcement Learning-Based UAV Path Planning Algorithm in Agricultural Time-Constrained Data Collection
CAI, M.
Author keywords
adaptive exploration, deep reinforcement learning, Markov decision process, path planning, reward function
References keywords
data(10), learning(9), internet(9), communications(9), collection(9), time(8), control(8), system(7), reinforcement(7), networks(6)
About this article
Date of Publication: 2023-05-31
Volume 23, Issue 2, Year 2023, On page(s): 101 - 108
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2023.02012
Web of Science Accession Number: 001009953400012
SCOPUS ID: 85164343239
Abstract
In the Agricultural Internet of Things (AgIoT), Unmanned Aerial Vehicles (UAVs) can be used to collect sensor data. Because the sensors are deployed at different positions and generate time-constrained data, the UAV must plan an appropriate data collection path. Therefore, this paper proposes a UAV path planning algorithm based on Deep Reinforcement Learning (DRL), which jointly optimizes location, energy, and time deadline to maximize the data-energy ratio. The path planning process is modeled as a Markov Decision Process (MDP), and a Prioritized Experience Replay Double Deep Q Network (PER-DDQN) model is then used to compute the optimal solution. Furthermore, a time-constrained reward function and an improved adaptive Upper Confidence Bound (UCB) exploration function are proposed to balance exploration and exploitation in the DRL algorithm, allowing the developed algorithm to converge quickly and smoothly. Simulations demonstrate that, compared with traditional methods, the proposed algorithm selects better paths during data collection, has a lower execution time, and achieves a higher data-energy ratio. Our algorithm promotes the use of UAVs in AgIoT.
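To make the abstract's main ideas concrete, the sketch below (Python, not the authors' implementation) shows how a count-based UCB exploration bonus could be added to Q-values and how a time-constrained reward built on the data-energy ratio could be shaped. The class and function names, the constants, and the deadline penalty are illustrative assumptions; the random Q-values merely stand in for a PER-DDQN's output.

```python
# A minimal, illustrative sketch (not the authors' implementation): it combines a
# count-based UCB exploration bonus with Q-values (here random numbers standing in
# for a PER-DDQN's output) and a time-constrained reward built on the data-energy
# ratio. All class/function names, constants, and the deadline penalty are assumptions.
import numpy as np


class UCBActionSelector:
    """Select actions by Q-value plus an upper-confidence exploration bonus."""

    def __init__(self, n_actions, c=1.0):
        self.n_actions = n_actions
        self.c = c          # exploration weight (assumed hyperparameter)
        self.counts = {}    # visit counts per (state, action) pair

    def select(self, state_key, q_values):
        counts = np.array(
            [self.counts.get((state_key, a), 0) for a in range(self.n_actions)],
            dtype=float,
        )
        total_visits = counts.sum() + 1.0
        # The bonus shrinks for frequently tried actions, so exploration fades over time.
        bonus = self.c * np.sqrt(np.log(total_visits + 1.0) / (counts + 1.0))
        action = int(np.argmax(np.asarray(q_values) + bonus))
        self.counts[(state_key, action)] = self.counts.get((state_key, action), 0) + 1
        return action


def time_constrained_reward(data_bits, energy_joules, time_left_s, miss_penalty=10.0):
    """Reward the data-energy ratio; subtract a penalty once the deadline has passed."""
    ratio = data_bits / max(energy_joules, 1e-9)
    return ratio if time_left_s >= 0.0 else ratio - miss_penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    selector = UCBActionSelector(n_actions=4, c=1.0)
    q = rng.normal(size=4)  # placeholder Q-values for one UAV grid state
    for step in range(5):
        a = selector.select(state_key=(3, 7), q_values=q)
        r = time_constrained_reward(
            data_bits=2.0e6, energy_joules=150.0, time_left_s=12.0 - 5.0 * step
        )
        print(f"step={step} action={a} reward={r:.1f}")
```

The decaying bonus is one common way to shift from exploration to exploitation as visit counts grow; the paper's adaptive UCB variant and the full PER-DDQN training loop are not reproduced here.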
References
[1] P. Tokekar, J. V. Hook, D. Mulla, V. Isler, "Sensor planning for a symbiotic UAV and UGV system for precision agriculture," IEEE Transactions on Robotics, 2016, 32(6): 1498-1511.
[2] P. Kaur, R. Kumar, M. Kumar, "A healthcare monitoring system using random forest and internet of things (IoT)," Multimedia Tools and Applications, 2019, 78: 19905-19916.
[3] F. Ouyang, H. Cheng, Y. Lan, "Automatic delivery and recovery system of Wireless Sensor Networks (WSN) nodes based on UAV for agricultural applications," Computers and Electronics in Agriculture, 2019, 162: 31-43.
[4] M. Mozaffari, W. Saad, M. Bennis, M. Debbah, "Mobile unmanned aerial vehicles (UAVs) for energy-efficient Internet of Things communications," IEEE Transactions on Wireless Communications, 2017, 16(11): 7574-7589.
[5] Z. Wei, M. Zhu, N. Zhang, L. Wang, Y. Zou, Z. Meng, Z. Feng, "UAV-assisted data collection for internet of things: A survey," IEEE Internet of Things Journal, 2022, 9(17): 15460-15483.
[6] X. Li, J. Tan, A. Liu, P. Vijayakumar, "A novel UAV-enabled data collection scheme for intelligent transportation system through UAV speed control," IEEE Transactions on Intelligent Transportation Systems, 2020, 22(4): 2100-2110.
[7] A. Sungheetha, R. Sharma, "Real time monitoring and fire detection using internet of things and cloud based drones," Journal of Soft Computing Paradigm (JSCP), 2020, 2(3): 168-174.
[8] K. Li, W. Ni, E. Tovar, M. Guizani, "Joint flight cruise control and data collection in UAV-aided internet of things: An onboard deep reinforcement learning approach," IEEE Internet of Things Journal, 2020, 8(12): 9787-9799.
[9] J. Liu, X. Wang, B. Bai, H. Dai, "Age-optimal trajectory planning for UAV-assisted data collection," IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2018, pp. 553-558.
[10] S. Aggarwal, N. Kumar, "Path planning techniques for unmanned aerial vehicles: A review, solutions, and challenges," Computer Communications, 2020, 149: 270-299.
[11] G. Rigatos, P. Siano, D. Selisteanu, R. E. Precup, "Nonlinear optimal control of oxygen and carbon dioxide levels in blood," Intelligent Industrial Systems, 2017, 3: 61-75.
[12] H. Ucgun, I. Okten, U. Yuzgec, M. Kesler, "Test platform and graphical user interface design for vertical take-off and landing drones," Science and Technology, 2022, 25(3): 350-367.
[13] R. E. Precup, S. Preitl, J. K. Tar, M. L. Tomescu, M. Takács, P. Korondi, P. Baranyi, "Fuzzy control system performance enhancement by iterative learning control," IEEE Transactions on Industrial Electronics, 2008, 55(9): 3461-3475.
[14] I. A. Zamfirache, R. E. Precup, R. C. Roman, E. M. Petriu, "Neural Network-based control using Actor-Critic Reinforcement Learning and Grey Wolf Optimizer with experimental servo system validation," Expert Systems with Applications, 2023, 225: 120112.
[15] Z. Yang, C. Pan, K. Wang, M. Shikh-Bahaei, "Energy efficient resource allocation in UAV-enabled mobile edge computing networks," IEEE Transactions on Wireless Communications, 2019, 18(9): 4576-4589.
[16] J. Zhang, L. Zhou, F. Zhou, B. C. Seet, H. Zhang, Z. Cai, J. Wei, "Computation-efficient offloading and trajectory scheduling for multi-UAV assisted mobile edge computing," IEEE Transactions on Vehicular Technology, 2019, 69(2): 2114-2125.
[17] O. Ghdiri, W. Jaafar, S. Alfattani, J. B. Abderrazak, H. Yanikomeroglu, "Offline and online UAV-enabled data collection in time-constrained IoT networks," IEEE Transactions on Green Communications and Networking, 2021, 5(4): 1918-1933.
[18] M. Samir, S. Sharafeddine, C. M. Assi, T. M. Nguyen, A. Ghrayeb, "UAV trajectory planning for data collection from time-constrained IoT devices," IEEE Transactions on Wireless Communications, 2019, 19(1): 34-46.
[19] O. Ghdiri, W. Jaafar, S. Alfattani, J. B. Abderrazak, H. Yanikomeroglu, "Energy-efficient multi-UAV data collection for IoT networks with time deadlines," IEEE Global Communications Conference, 2020, pp. 1-6.
[20] S. Shen, K. Yang, K. Wang, G. Zhang, H. Mei, "Number and operation time minimization for multi-UAV-enabled data collection system with time windows," IEEE Internet of Things Journal, 2021, 9(12): 10149-10161.
[21] K. Liu, J. Zheng, "UAV trajectory optimization for time-constrained data collection in UAV-enabled environmental monitoring systems," IEEE Internet of Things Journal, 2022, 9(23): 24300-24314.
[22] K. Arulkumaran, M. P. Deisenroth, M. Brundage, A. A. Bharath, "Deep reinforcement learning: A brief survey," IEEE Signal Processing Magazine, 2017, 34(6): 26-38.
[23] J. Chen, Q. Wu, Y. Xu, N. Qi, X. Guan, Y. Zhang, Z. Xue, "Joint task assignment and spectrum allocation in heterogeneous UAV communication networks: A coalition formation game-theoretic approach," IEEE Transactions on Wireless Communications, 2020, 20(1): 440-452.
[24] J. Chen, F. Ye, Y. Li, "Travelling salesman problem for UAV path planning with two parallel optimization algorithms," 2017 Progress in Electromagnetics Research Symposium - Fall (PIERS-FALL), 2017, pp. 832-837.
[25] L. P. Kaelbling, M. L. Littman, A. W. Moore, "Reinforcement learning: A survey," Journal of Artificial Intelligence Research, 1996, 4: 237-285.
[26] B. Kiumarsi, F. L. Lewis, H. Modares, A. Karimpour, M. B. Naghibi-Sistani, "Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics," Automatica, 2014, 50(4): 1167-1175.
[27] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, D. Hassabis, "Human-level control through deep reinforcement learning," Nature, 2015, 518(7540): 529-533.
[28] H. van Hasselt, A. Guez, D. Silver, "Deep reinforcement learning with double Q-learning," Proceedings of the AAAI Conference on Artificial Intelligence, 2016, 30(1).
[29] R. Y. Chen, S. Sidor, P. Abbeel, J. Schulman, "UCB exploration via Q-ensembles," arXiv preprint arXiv:1706.01502, 2017.
[30] T. Schaul, J. Quan, I. Antonoglou, D. Silver, "Prioritized experience replay," arXiv preprint arXiv:1511.05952, 2015.
[31] P. Kumar, T. Amgoth, C. S. R. Annavarapu, "ACO-based mobile sink path determination for wireless sensor networks under non-uniform data constraints," Applied Soft Computing, 2018, 69: 528-540.