Reward Reports for Reinforcement Learning: Towards Documentation and Understanding of Dynamic Machine Learning Systems
‼️ HIRING! We are hiring interns to help build out Reward Reports! If you are interested in ethical AI, documentation, or the sociotechnical considerations of RL systems, please contact thomaskrendlgilbert at gmail dot com.
Building on the documentation frameworks for “model cards” and “datasheets” proposed by Mitchell et al. and Gebru et al., we argue that AI systems need Reward Reports. In a whitepaper recently published by the Center for Long-Term Cybersecurity, we introduced Reward Reports as living documents for proposed RL deployments that demarcate design choices. However, many questions remain about the applicability of this framework to different RL applications, roadblocks to system interpretability, and the resonances between deployed supervised machine learning systems and the sequential decision-making used in RL. At a minimum, Reward Reports are an opportunity for RL practitioners to deliberate on these questions and begin the work of deciding how to resolve them in practice.
To learn more about Reward Reports, see the Reward Reports paper, the CLTC RL Risks Whitepaper, or the GitHub repo with a template, contribution guide, and more!