Personal Universes: Yampolskiy's Strangest Answer to the AI Alignment Problem
First, the thing this is all in service of. The AI alignment problem is the challenge of making a powerful AI system reliably pursue what we actually want it to pursue - getting its goals, values, and behaviour to line up with human intentions, and to stay lined up even as the system becomes more capable than the people supervising it. It sounds simple and is not: we struggle to state our own values precisely, those values conflict between people, and an AI optimising hard for a slightly wrong objective can produce outcomes nobody asked for. The multi-agent version - aligning one system with all of humanity at once, rather than with a single person - is harder still, and it is the specific version Personal Universes is trying to dodge. ...