At present, Wireless Sensor Networks (WSNs) face a number of critical challenges, such as energy efficiency, throughput, network lifetime, localization, security and data dissemination. Machine learning offers various supervised and unsupervised techniques to address these issues. Reinforcement learning, a branch of machine learning, also offers simple and robust approaches to current WSN issues in order to improve overall network performance. Reinforcement learning has several components, namely state, action, agent and reward, and deals with unknown dynamic environments to find optimum solutions through learning. This paper presents an extensive survey, covering more than ten years of work at the intersection of RL and WSNs, to highlight opportunities for significant research on the latest WSN challenges.
Keywords: Reinforcement Learning, Wireless Sensor Network, Q-Learning, Agent, State, Action, Reward.

1 Introduction

The Wireless Sensor Network is a key enabler for various existing and upcoming technologies in critical research domains and thrust areas. WSNs operate in dynamic environments, which makes them suitable for critical applications, owing to their vital ability to sense several environmental variables and transmit them towards a destination. However, WSNs still face many challenges and constraints, such as energy efficiency, data routing, network lifetime, packet delivery, network congestion and end-to-end delay. There is a growing need for algorithms that can improve network performance based on specific and changing requirements. It is therefore necessary to address these challenges by developing efficient protocols, smart algorithms and learning techniques. Reinforcement learning algorithms have recently emerged as a solution for optimizing resource utilization and minimizing energy consumption in wireless sensor nodes.
These algorithms are used to predict the next state from the present state while interacting with an unknown environment, together with the probability of the state transition and the reward associated with it. The combination of WSN and reinforcement learning algorithms is used to find optimal actions for various WSN challenges. This survey presents the various available reinforcement learning techniques for overcoming existing issues in wireless sensor networks.
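To make states, transition probabilities and rewards concrete, they can be sketched as plain dictionaries. This is a toy illustrative model only: the node states, actions and numeric values below are invented for illustration and do not come from the surveyed works.

```python
# A toy model of a sensor node: states, transition probabilities
# and rewards. All names and values are illustrative assumptions.
states = ["sleep", "sense", "transmit"]

# transitions[state][action] -> list of (next_state, probability, reward)
transitions = {
    "sleep":    {"wake": [("sense", 1.0, -0.1)]},
    "sense":    {"send": [("transmit", 0.9, 1.0),    # successful send
                          ("sleep",    0.1, -1.0)],  # radio failure
                 "idle": [("sleep", 1.0, 0.0)]},
    "transmit": {"done": [("sleep", 1.0, 0.5)]},
}

def expected_reward(state, action):
    """Expected immediate reward of taking `action` in `state`."""
    return sum(p * r for _, p, r in transitions[state][action])

# Expected reward of "send" in "sense": 0.9*1.0 + 0.1*(-1.0) = 0.8
print(expected_reward("sense", "send"))
```

The agent never sees these tables directly; it only observes sampled transitions and rewards, which is why learning is needed in the first place.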
The rest of this paper is organized as follows. Section 2 introduces reinforcement learning; Section 3 surveys reinforcement learning techniques for WSN; Section 4 lists open issues; Section 5 presents the conclusion and future work, followed by the references.

2 Reinforcement Learning

Reinforcement learning is unsupervised in nature: it does not rely on labelled training examples. It enables machines and software agents to act smartly in order to improve their performance. A feedback system that delivers the reinforcement signal, or reward, is essential for the learning agent to perform at its best. In general, the learning agent decides the best action to take in its present state.
When this interaction is repeated over time, the problem is called a Markov Decision Process. Reinforcement learning thus combines several components, namely state, action, reward and a discount rate in [0, 1], to reach the best solution to WSN issues.

2.1 Q-Learning

In Q-learning, the learning agent observes the state and event and subsequently receives a reward from the dynamic environment. After sensing an event, the agent generates an action, which leads to a change of state. Figure 01 represents the overall working of the learning agent at time t. The Q-table maintains the rewards, which are used to update the Q-value function and thereby generate the Q-policy.
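The Q-table bookkeeping described above can be sketched in a few lines. The state and action names and the ε-greedy exploration rule are illustrative assumptions, not details taken from the paper:

```python
import random

# Hypothetical state and action sets for a sensor node (illustrative).
STATES = ["low_battery", "high_battery"]
ACTIONS = ["sleep", "sense", "transmit"]

# Q-table: maps (state, action) pairs to learned values, all zero at start.
q_table = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def choose_action(state, epsilon=0.1):
    """Epsilon-greedy policy: explore a random action with probability
    epsilon, otherwise exploit the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

# With an all-zero table and epsilon=0, max() returns the first action.
print(choose_action("low_battery", epsilon=0.0))  # -> 'sleep'
```

As rewards are observed and the Q-table is updated, the greedy branch gradually shifts towards the actions that proved most valuable in each state.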
Figure 01: RL Agent in its Environment

2.2 Q-Learning: A Mathematical Approach

Here state, event, action and reward are denoted by s, e, a and r, the learning rate by α and the discount factor by γ. A negative reward denotes a cost: the higher the reward, the lower the cost, so cost is represented through the reward. The Q-value of a chosen action for a state–event pair at time t is updated in the following way:

Q_{t+1}(s_t, a_t) = (1 - α) Q_t(s_t, a_t) + α [r_t + γ max_a Q_t(s_{t+1}, a)]   (01)

where 0 ≤ α ≤ 1 and 0 ≤ γ ≤ 1. With each update the agent discards its previous Q-value and replaces it with the new one, and the maximum expected future reward keeps the Q-value up to date.
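The update rule of Equation (01) can be checked with a short numeric sketch; the state names, reward and parameter values below are invented for illustration:

```python
# Q-learning update, Eq. (01):
#   Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma * max_a' Q(s',a'))
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] = ((1 - alpha) * q[(state, action)]
                          + alpha * (reward + gamma * best_next))

actions = ["sleep", "transmit"]
q = {(s, a): 0.0 for s in ["s0", "s1"] for a in actions}
q[("s1", "transmit")] = 2.0  # pretend the agent already values this action

q_update(q, "s0", "transmit", reward=1.0, next_state="s1", actions=actions)
# (1-0.5)*0 + 0.5*(1.0 + 0.9*2.0) = 0.5*2.8 = 1.4
print(round(q[("s0", "transmit")], 3))  # -> 1.4
```

Note how the value of the next state s1 propagates backwards into the estimate for s0, which is exactly how Q-learning spreads reward information through the state space.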
When γ < 1, the discounted future reward counts for less than the immediate reward. RL yields an optimal policy π, which represents the reward or value function, by selecting the highest Q-value in the following manner:

π(s) = argmax_a Q(s, a)   (02)

3 Reinforcement Learning Techniques for Wireless Sensor Network

Traditionally, the reinforcement learning approach has been used for various purposes in WSNs to extend network lifetime and enhance performance. A brief overview of reinforcement learning algorithms in the context of WSN is presented in Table 01.

Table 01: Reinforcement Learning Techniques used for WSN

(01) Christopher J. C. H. Watkins: A basic review of Q-learning.
(02) Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore: Basics of the RL model, RL behavior, learning automata, delayed rewards, policy iteration and model-free methods.
(03) Yu-Han Chang, Tracey Ho, Leslie Pack Kaelbling: Presents a rich new domain for multi-agent reinforcement learning and establishes several first results in this area.
(04) Jamal N. Al-Karaki (The Hashemite University), Ahmed E. Kamal (Iowa State University): Presents routing challenges, design issues and routing protocols for WSNs.
(05) Aram Galstyan, Bhaskar Krishnamachari, Kristina Lerman: Presents an efficient mechanism for emergent coordination between autonomous nodes based on game dynamics.
(06) Z. Liu, I. Elhanany: Introduces RL-MAC, a MAC protocol for WSNs built on an RL framework.
(07) Niklas Wirström: A Master's thesis exploring how the nodes of a WSN can use policies for self-configuration.
(08) Ping Wang, Ting Wang: Presents AdaR, a novel routing scheme that adaptively learns an optimal routing strategy with respect to multiple optimization goals.
(09) Vladimir Dyo, Cecilia Mascolo: Proposes an algorithm for energy-efficient node discovery in sparsely connected mobile WSNs; the duty cycle is exploited to take advantage of temporal patterns.
(10) Anna Egorova-Forster, Amy L. Murphy: Focuses on creating low-cost routes that conserve available resources through the exchange of information among nodes.
(11) Anna Forster (University of Lugano, Switzerland): Presents the link between machine learning, MANETs and WSNs.
(12) Anna Forster, Amy L. Murphy: Presents efficient routing to multiple available mobile sinks.
(13) Anna Forster, Amy L. Murphy: A study of applying reinforcement learning to balance energy expenditure in WSNs.
(14) Kunal Shah, Mohan Kumar (Sensor Logic Inc.): Presents Distributed Independent Reinforcement Learning (DIRL), a Q-learning-based framework enabling autonomous self-learning.
(15) Mihail Mihaylov, Karl Tuyls, Ann Nowé: Presents a reinforcement learning algorithm to extend WSN lifetime and reduce reported latency.
(16) Anna Forster, Amy L. Murphy: Presents CLIQUE, which lets each sensor node decide whether to become a cluster head based on a machine learning approach.
(17) Somayeh Kianpisheh, Nasrolah Moghadam Charkari: Addresses energy reduction in networks of small nodes with limited battery life and computational ability, using dynamic power management to conserve energy and extend network lifetime.
(18) Raghavendra V. Kulkarni, Anna Förster, Ganesh Kumar Venayagamoorthy: An extensive survey of computational intelligence (CI) applications to various WSN problems, with a discussion of the advantages and disadvantages of CI algorithms over traditional WSN solutions.
(19) Kok-Lim Alvin Yau, Peter Komisarczuk, Paul D. Teal: Advocates the use of reinforcement learning to achieve context awareness and intelligence.
(20) Anju Arya, Amita Malik: A brief overview of routing protocols using the reinforcement learning approach for WSNs.
(21) Varun K. Sharma, Shiv Shankar Prasad Shukla, Varun Singh: Proposes a tailored Q-learning algorithm for routing in WSNs, an updated form of the available Q-learning technique that targets the convergence problem.
(22) Kok-Lim Alvin Yau, Hock Guan Goh, David Chieng, Kae Hsiang Kwong: Presents the basics of RL and WSN, leading to the available RL algorithms, with mathematical theorems, for better solving WSN issues.
(23) Wenjing Guo, Cairong Yan, Yanglan Gan, Ting Lu: Proposes RLLO, an intelligent routing algorithm that exploits RL and considers residual energy and hop count in the reward function, in order to distribute energy uniformly and enhance network lifetime.
(24) Ankit B. Patel, Hitesh B. Shah: Implements and compares routing strategies using Q-routing algorithms from an energy-efficiency perspective, aiming at an energy-efficient shortest-path Q-routing algorithm based on reinforcement learning.
(25) Ibrahim Mustapha, Borhanuddin Mohd Ali, Mohd Fadlee A. Rasid, Aduwati Sali, Hafizal Mohamad: Proposes a reinforcement-learning-based, energy-optimized solution for sensor nodes built on clustering algorithms.
(26) Feng-Cheng Chang, Hsiang-Cheh Huang: Reviews work related to intelligent sensor networks.
(27) Mohammad Abdulaziz Alwadi: Presents a novel integrated framework for achieving energy efficiency through three stages of modeling from data.
(28) Thien T. T. Le, Sangman Moh: Presents a reinforcement-learning-based communication range control (RL-CRC) algorithm that adaptively adjusts the communication range at each sensor node while ensuring network connectivity in dynamic WSNs.
(29) Maksura Mahjabeen: Treats sensor nodes as intelligent agents that adapt their next task to observed application behavior using cooperative reinforcement learning, with a reward function capturing energy efficiency and performance.
(30) Gabriel Martins Dias, Maddalena Nurchis, Boris Bellalta: Proposes a dynamic sampling rate adaptation scheme based on reinforcement learning, able to tune a sensor's sampling interval on the fly according to environmental conditions and application requirements.

4 Open Issues

This section lists various open issues that are available for research studies:
· Message exchange overhead.
· Routing in WSN using reinforcement learning.
· Security concerns of reinforcement learning.
· Agent learning cost.
· Robustness of reinforcement learning.
· Energy consumption.
· Balance between convergence and power.
· Efficient self-learning algorithms for WSN.

5 Conclusions and Future Scope

Reinforcement learning is easy to deploy in wireless sensor networks to improve overall network lifetime.
The components of reinforcement learning, namely state, action, agent and reward, must be explored and well defined in advance. From the perspective of WSN, this survey has reviewed various works on reinforcement learning features and algorithms and the enhancements they bring. The surveyed reinforcement learning algorithms show how they can be most suitable and useful in WSNs. Future work lies in making reinforcement learning still more useful by applying its techniques efficiently to learn unknown environments from the WSN point of view.