Learning Rational Expectations via Policy Gradient: An Economic Analysis

44 Pages Posted: 8 May 2019 Last revised: 29 Jul 2020

Zhongzhi Lawrence He

Brock University, Goodman School of Business

Date Written: April 5, 2019


This paper conducts marginal analysis and refines the updating rules for the adaptive learning models (e.g., Erev & Roth 1998) based on approaches in computer science. We propose Policy Gradient Reinforcement Learning (PGRL) to simulate the equilibration process of a decentralized market economy, where utility maximizers use only received payoffs to coordinate actions towards rational expectations. For each learning experience, the adaptive rules not only reinforce the chosen action by the marginal gain of learning, but also adjust each foregone choice by its marginal opportunity cost. Consequently, players exhibit risk-seeking behavior during earlier exploration but are risk averse later due to diminishing marginal utility of learning. The effectiveness of the refined rules is demonstrated through a call market simulation that generates diverse and complex dynamics.
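As a rough illustration of the kind of update the abstract describes, the sketch below uses a textbook softmax policy-gradient (REINFORCE-style) step on action propensities. It is an assumption made for illustration and not necessarily the exact rule derived in the paper: the names `softmax`, `policy_gradient_update`, `propensities`, and `step_size` are hypothetical. In this form the chosen action is reinforced in proportion to payoff × (1 − its choice probability), while each foregone action is adjusted by −payoff × its choice probability.

```python
import math


def softmax(propensities):
    """Map action propensities to choice probabilities (numerically stable)."""
    peak = max(propensities)
    weights = [math.exp(x - peak) for x in propensities]
    total = sum(weights)
    return [w / total for w in weights]


def policy_gradient_update(propensities, action, payoff, step_size):
    """One stochastic gradient ascent step for a softmax policy.

    Assumption: standard REINFORCE gradient of log pi(action), scaled by payoff.
    The chosen action moves by  step_size * payoff * (1 - p[action]);
    every foregone action moves by  -step_size * payoff * p[j].
    """
    probs = softmax(propensities)
    updated = []
    for j, (theta, p) in enumerate(zip(propensities, probs)):
        indicator = 1.0 if j == action else 0.0
        updated.append(theta + step_size * payoff * (indicator - p))
    return updated


# Usage: three equally likely actions, action 1 is chosen and pays 1.0.
start = [0.0, 0.0, 0.0]
after = policy_gradient_update(start, action=1, payoff=1.0, step_size=0.5)
print(after)
```

Because the adjustment to the chosen action shrinks as its choice probability approaches one, the size of each learning step falls as the policy concentrates, which is one simple way to read the abstract's reference to diminishing marginal gains from learning.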

Keywords: Rational expectations; Game Theory; Reinforcement Learning; Policy Gradient Theorem; Stochastic Gradient Ascent; Call Market

JEL Classification: B53, C73, D81, D83

Suggested Citation

He, Zhongzhi Lawrence, Learning Rational Expectations via Policy Gradient: An Economic Analysis (April 5, 2019). Available at SSRN: https://ssrn.com/abstract=3366768 or http://dx.doi.org/10.2139/ssrn.3366768

Zhongzhi Lawrence He (Contact Author)

Brock University, Goodman School of Business

500 Glenridge Avenue
St. Catharines, Ontario L2S 3A1
