Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
Citation
Everitt, T, Hutter, M, Kumar, R et al. 2021, 'Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective', Synthese, vol. 198, pp. 1-33.