Learning Rational Expectations via Policy Gradient: An Economic Analysis
44 Pages Posted: 8 May 2019 Last revised: 29 Jul 2020
Date Written: April 5, 2019
This paper conducts marginal analysis of adaptive learning models (e.g., Erev & Roth 1998) and refines their updating rules using approaches from computer science. We propose Policy Gradient Reinforcement Learning (PGRL) to simulate the equilibration process of a decentralized market economy, in which utility maximizers use only received payoffs to coordinate actions towards rational expectations. In each learning experience, the adaptive rules not only reinforce the chosen action by the marginal gain of learning, but also adjust each foregone choice by its marginal opportunity cost. Consequently, players exhibit risk-seeking behavior during early exploration but become risk-averse later, owing to the diminishing marginal utility of learning. The effectiveness of the refined rules is demonstrated through a call market simulation that generates diverse and complex dynamics.
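The update rule sketched in the abstract resembles a standard softmax policy-gradient (REINFORCE-style) step, in which the chosen action's preference rises in proportion to the payoff while every foregone action's preference falls by its selection probability times the payoff. The sketch below is an illustrative approximation under that assumption, not the paper's exact PGRL rules; all function names and the learning rate are hypothetical.

```python
import math

def softmax(prefs):
    """Convert action preferences into choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def pg_update(prefs, chosen, reward, lr=0.1):
    """One REINFORCE-style update on action preferences.

    The chosen action's preference rises by lr * reward * (1 - pi_chosen);
    each foregone action's preference falls by lr * reward * pi_other,
    the policy-gradient analogue of a marginal opportunity cost.
    """
    probs = softmax(prefs)
    return [
        p + lr * reward * ((1.0 if a == chosen else 0.0) - probs[a])
        for a, p in enumerate(prefs)
    ]

# Three actions, initially equiprobable; action 1 is chosen and pays off.
prefs = [0.0, 0.0, 0.0]
new_prefs = pg_update(prefs, chosen=1, reward=1.0)
```

After a positive payoff, the chosen action's probability increases while the foregone actions' probabilities decrease, matching the abstract's description of reinforcing the chosen action and charging opportunity cost to the rest.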
Keywords: Rational Expectations; Game Theory; Reinforcement Learning; Policy Gradient Theorem; Stochastic Gradient Ascent; Call Market
JEL Classification: B53, C73, D81, D83