Alignment and Reward Hacking
Examine the technical specifics of reinforcement learning, specification gaming, and the risks of agents maximizing rewards in unintended ways.