TY - GEN
T1 - An Adaptive Online Parameter Control Algorithm for Particle Swarm Optimization Based on Reinforcement Learning
AU - Liu, Yaxian
AU - Lu, Hui
AU - Cheng, Shi
AU - Shi, Yuhui
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - Parameter control is critical to the performance of any evolutionary algorithm (EA). In this paper, we propose a Q-Learning-based Particle Swarm Optimization (QLPSO) algorithm, which uses Reinforcement Learning (RL) to train the parameters of the Particle Swarm Optimization (PSO) algorithm. The core of the QLPSO algorithm is a three-dimensional Q table consisting of a state plane and an action axis. The state plane captures the state of the particles in both the decision space and the objective space. The action axis controls the exploration and exploitation of particles by setting different parameters. The Q table helps particles select actions according to their states. In addition, the Q table is updated by a reward function designed according to the performance change of particles and the number of iterations. The main difference between the QLPSO algorithms for single-objective and multi-objective optimization lies in the evaluation of solution performance. In single-objective optimization, we only compare the fitness values of solutions, while in multi-objective optimization, we consider the dominance relationship between solutions with the help of the Pareto front. The performance of QLPSO is tested on 6 single-objective and 5 multi-objective benchmark functions. The experimental results reveal the competitive performance of QLPSO compared with other algorithms.
AB - Parameter control is critical to the performance of any evolutionary algorithm (EA). In this paper, we propose a Q-Learning-based Particle Swarm Optimization (QLPSO) algorithm, which uses Reinforcement Learning (RL) to train the parameters of the Particle Swarm Optimization (PSO) algorithm. The core of the QLPSO algorithm is a three-dimensional Q table consisting of a state plane and an action axis. The state plane captures the state of the particles in both the decision space and the objective space. The action axis controls the exploration and exploitation of particles by setting different parameters. The Q table helps particles select actions according to their states. In addition, the Q table is updated by a reward function designed according to the performance change of particles and the number of iterations. The main difference between the QLPSO algorithms for single-objective and multi-objective optimization lies in the evaluation of solution performance. In single-objective optimization, we only compare the fitness values of solutions, while in multi-objective optimization, we consider the dominance relationship between solutions with the help of the Pareto front. The performance of QLPSO is tested on 6 single-objective and 5 multi-objective benchmark functions. The experimental results reveal the competitive performance of QLPSO compared with other algorithms.
KW - optimization problem
KW - parameter control
KW - particle swarm optimization
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85071308630
U2 - 10.1109/CEC.2019.8790035
DO - 10.1109/CEC.2019.8790035
M3 - Conference contribution
AN - SCOPUS:85071308630
T3 - 2019 IEEE Congress on Evolutionary Computation, CEC 2019 - Proceedings
SP - 815
EP - 822
BT - 2019 IEEE Congress on Evolutionary Computation, CEC 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Congress on Evolutionary Computation, CEC 2019
Y2 - 10 June 2019 through 13 June 2019
ER -