Meta: Over the past few months, we've held a seminar series on the Simulators theory by janus. As the theory is actively under development, the purpose of the series is to discover central structures and open problems. Our aim with this sequence is to share some of our discussion with a broader audience, and to encourage new research on the questions we uncover. Below, we outline the broader rationale and shared assumptions of the participants of the seminar.

Aligning AI is a crucial task that needs to be addressed as AI systems rapidly become more capable. A core part of the alignment problem involves "deconfusing": conceptual work alongside engineering, identifying unknown unknowns, and transitioning from philosophy to mathematics to algorithms to implementation. The problem is complex because we have to reason about something that doesn't yet exist. However, this does not mean that we should ignore evidence as it emerges. It is essential to carefully consider the GPT paradigm as it is being developed and deployed. One feasible-seeming approach is "accelerating alignment," which involves leveraging AI as it is developed to help solve the challenging problems of alignment. This is not a novel idea: it has been previously suggested in concepts such as seed AI, nanny AI, and iterated distillation and amplification (IDA).
