Learning the Minimal Representation of a Dynamic System from Transition Data

44 Pages Posted: 18 Feb 2021 Last revised: 23 Apr 2021

Mohammed Amine Bennouna

Massachusetts Institute of Technology (MIT) - Operations Research Center

Dessislava Pachamanova

Babson College

Georgia Perakis

Massachusetts Institute of Technology (MIT) - Sloan School of Management

Omar Skali Lami

Massachusetts Institute of Technology (MIT) - Operations Research Center

Date Written: January 10, 2021

Abstract

This paper proposes a novel framework for learning a concise MDP model of a continuous-state-space dynamic system from observed transition data. Most existing methods in offline reinforcement learning construct functional approximations of the value function or of the transition and reward functions, which requires complex and often uninterpretable function approximators. Our approach instead partitions the system's feature space into regions that constitute the states of a finite deterministic MDP representing the system. We discuss what is theoretically the minimal MDP representation that preserves the values, and therefore the optimal policy, of the dynamic system. We formally define the problem of learning such a concise representation from transition data without exploration. To solve this problem, we introduce an in-sample property of partitions of the feature space, which we call coherence, and show that if the class of possible partitions has finite VC dimension, any partition that is coherent with the transition data converges to the minimal representation of the system, with provable finite-sample PAC convergence guarantees. This theoretical insight motivates our Minimal Representation Learning (MRL) algorithm, which constructs from transition data an MDP representation that approximates the minimal representation of the system. We illustrate the effectiveness of the proposed framework through numerical experiments.
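
The idea of turning a partition of the feature space into a finite deterministic MDP can be illustrated with a small sketch. It is not the paper's MRL algorithm, and the agreement check below is only an assumed, informal stand-in for an in-sample consistency property; the abstract does not give the formal definition of coherence. The names induced_mdp, region_of, two_regions, and one_region are hypothetical, and the three transitions are made-up toy data.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

Features = Tuple[float, ...]
# One transition sample: (features, action, reward, next_features).
Transition = Tuple[Features, Hashable, float, Features]
Table = Dict[Tuple[Hashable, Hashable], Tuple[float, Hashable]]


def induced_mdp(
    data: Iterable[Transition],
    region_of: Callable[[Features], Hashable],
) -> Optional[Table]:
    """Map each (region, action) pair seen in the data to one
    (reward, next_region) outcome, or return None if two samples disagree.

    Assumption: "agreement" is an illustrative consistency check only,
    not the paper's formal notion of coherence.
    """
    table: Table = {}
    for x, a, r, x_next in data:
        key = (region_of(x), a)
        outcome = (r, region_of(x_next))
        # setdefault stores the first outcome seen for this key and
        # returns the stored one on later visits.
        if table.setdefault(key, outcome) != outcome:
            return None  # same region and action, different outcome
    return table


# Toy data on a one-dimensional feature space (made up for illustration).
data = [
    ((0.1,), "go", 0.0, (0.6,)),
    ((0.3,), "go", 0.0, (0.8,)),
    ((0.7,), "go", 1.0, (0.9,)),
]


def two_regions(x: Features) -> int:
    """Hypothetical partition: split the feature axis at 0.5."""
    return 0 if x[0] < 0.5 else 1


def one_region(x: Features) -> int:
    """Hypothetical partition: lump every point into a single region."""
    return 0


print(induced_mdp(data, two_regions))
print(induced_mdp(data, one_region))
```

On this toy data the two-region partition yields a deterministic table with one outcome per (region, action) pair, while the single-region partition is rejected because samples in the same region receive different rewards. A coarser partition is more concise but can fail to preserve the system's values, which is the trade-off the abstract describes.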

Keywords: offline reinforcement learning, statistical learning, off-policy evaluation, data-driven decision making, state representation learning, MDP state aggregation

Suggested Citation

Bennouna, Mohammed Amine and Pachamanova, Dessislava and Perakis, Georgia and Skali Lami, Omar, Learning the Minimal Representation of a Dynamic System from Transition Data (January 10, 2021). Available at SSRN: https://ssrn.com/abstract=3785547 or http://dx.doi.org/10.2139/ssrn.3785547

Mohammed Amine Bennouna (Contact Author)

Massachusetts Institute of Technology (MIT) - Operations Research Center

77 Massachusetts Avenue
Bldg. E 40-149
Cambridge, MA 02139
United States

HOME PAGE: https://www.mit.edu/~amineben/

Dessislava Pachamanova

Babson College

Babson Park, MA 02157
United States
781-235-1200 (Phone)
781-239-6414 (Fax)

Georgia Perakis

Massachusetts Institute of Technology (MIT) - Sloan School of Management

100 Main Street
E62-565
Cambridge, MA 02142
United States

Omar Skali Lami

Massachusetts Institute of Technology (MIT) - Operations Research Center

77 Massachusetts Avenue
Bldg. E 40-149
Cambridge, MA 02139
United States
