An Efficient Framework For Load Balancing In Software-Defined Networks Using Twin Delayed Deep Deterministic Policy Gradient (TD3) Algorithm
DOI: https://doi.org/10.64252/gg39vw08

Keywords: Load balancing, Reinforcement learning, TD3, SDN.

Abstract
In software-defined networks (SDNs), load balancers are indispensable for optimizing resource utilization, decreasing latency, and improving quality of service. Conventional heuristic and rule-based approaches have struggled to adapt to the increasingly complex and dynamic nature of network traffic. This paper proposes the Twin Delayed Deep Deterministic Policy Gradient (TD3), an advanced reinforcement learning technique, to address the load-balancing problem in SDNs. The SDN environment is modeled as a continuous-state, continuous-action Markov Decision Process (MDP), in which the agent learns optimal flow-allocation policies through interaction with the network. TD3's twin Q-networks, delayed policy updates, and target policy smoothing provide superior stability and sample efficiency compared with conventional deep Q-learning techniques. To guide learning, the reward function accounts for link utilization, flow delay, and load fairness. Simulation analysis of congestion, average latency, and throughput demonstrates the superior performance of the proposed TD3-based load balancing for SDN.
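The abstract's two key ingredients, a composite reward over link utilization, delay, and fairness, and TD3's clipped double-Q target, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the weight parameters `w_util`, `w_delay`, `w_fair`, the use of Jain's fairness index, and the helper names are all assumptions chosen for clarity.

```python
import numpy as np

def jain_fairness(loads):
    """Jain's fairness index: equals 1.0 when all links carry equal load,
    and approaches 1/n as load concentrates on a single link."""
    loads = np.asarray(loads, dtype=float)
    return loads.sum() ** 2 / (len(loads) * (loads ** 2).sum())

def reward(link_utilizations, flow_delay, w_util=1.0, w_delay=1.0, w_fair=1.0):
    """Hypothetical composite reward: penalize the most-loaded link and the
    flow delay, and reward an even spread of load across links.
    The weights are illustrative assumptions, not values from the paper."""
    u = np.asarray(link_utilizations, dtype=float)
    return -w_util * u.max() - w_delay * flow_delay + w_fair * jain_fairness(u)

def td3_target(r, q1_next, q2_next, gamma=0.99):
    """TD3's clipped double-Q target: use the minimum of the two target
    critics to curb the overestimation bias of single-critic methods."""
    return r + gamma * min(q1_next, q2_next)
```

A balanced allocation should score higher than a skewed one at equal delay, e.g. `reward([0.5, 0.5], 0.1)` exceeds `reward([0.9, 0.1], 0.1)`, which is the gradient signal that pushes the agent toward fairer flow allocations.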