Campus Units
Electrical and Computer Engineering
Document Type
Conference Proceeding
Conference
11th International Conference on Wireless Communications and Signal Processing (WCSP)
Publication Version
Accepted Manuscript
Link to Published Version
https://doi.org/10.1109/WCSP.2019.8928124
Publication Date
2019
Journal or Book Title
2019 11th International Conference on Wireless Communications and Signal Processing (WCSP)
DOI
10.1109/WCSP.2019.8928124
Conference Title
11th International Conference on Wireless Communications and Signal Processing (WCSP)
Conference Date
October 23-25, 2019
City
Xi'an, China
Abstract
This work presents two reinforcement learning (RL) architectures that mimic rational humans in the way they analyze available information and make decisions. The proposed algorithms are called selector-actor-critic (SAC) and tuner-actor-critic (TAC). They are obtained by modifying the well-known actor-critic (AC) algorithm. SAC is equipped with an actor, a critic, and a selector. The role of the selector is to determine the most promising action at the current state based on the last estimate from the critic. TAC is model-based, and consists of a tuner, a model-learner, an actor, and a critic. After receiving the approximated value of the current state-action pair from the critic and the learned model from the model-learner, the tuner uses the Bellman equation to tune the value of the current state-action pair. This tuned value is then used by the actor to optimize the policy. We investigate the performance of the proposed algorithms and compare them with the AC algorithm using numerical simulations to demonstrate their advantages.
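To make the two mechanisms described above concrete, the following minimal Python sketch illustrates the SAC selector step (greedy action choice from the critic's latest estimates) and the TAC tuner step (a Bellman backup over the learned model). It assumes a simple tabular setting; all names (Q, model, gamma) are illustrative assumptions, not the paper's notation or implementation.

```python
import numpy as np

gamma = 0.99  # assumed discount factor


def select_action(Q, state):
    """SAC selector (sketch): pick the most promising action at the
    current state based on the critic's last value estimates."""
    return int(np.argmax(Q[state]))


def tune_value(Q, model, state, action, gamma=gamma):
    """TAC tuner (sketch): refine the critic's estimate of the current
    state-action pair with a Bellman backup over the learned model.

    `model[state][action]` is assumed to hold a list of
    (probability, next_state, reward) transitions supplied by the
    model-learner; the tuned value is then passed to the actor.
    """
    return sum(
        p * (r + gamma * np.max(Q[s_next]))
        for p, s_next, r in model[state][action]
    )
```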
Rights
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Copyright Owner
IEEE
Copyright Date
2019
Language
en
File Format
application/pdf
Recommended Citation
Masadeh, Ala'eddin; Wang, Zhengdao; and Kamal, Ahmed, "Selector-Actor-Critic and Tuner-Actor-Critic Algorithms for Reinforcement Learning" (2019). Electrical and Computer Engineering Conference Papers, Posters and Presentations. 104.
https://lib.dr.iastate.edu/ece_conf/104
Comments
This is a manuscript of a proceeding published as Masadeh, Ala'eddin, Zhengdao Wang, and Ahmed E. Kamal. "Selector-Actor-Critic and Tuner-Actor-Critic Algorithms for Reinforcement Learning." In 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP). DOI: 10.1109/WCSP.2019.8928124. Posted with permission.