Next steps


What now? What can you do to contribute and work on alignment?

Below are a few ways to start contributing to alignment work.

Current research directions and future work

One high-value option is to take one of the ideas listed in a "future work" section on Research Directions and start working on it. Many current methods in alignment research are still in their infancy, and there is a lot of low-hanging fruit that would improve them and indicate which ones are worth pursuing further. That page also links relevant literature and sources to help you get started.

In particular, we view weak-to-strong generalization (W2SG) as a worthwhile path to pursue, and we highly recommend investigating ways to contribute to this research direction.

Novel alignment methods

Alignment is still in its early stages, and it’s unclear whether our current methods (or slight modifications thereof) will be sufficient to align AGI. Hence, spending time developing entirely new paradigms for alignment is also extremely high-value.

If you have specific ideas here, it is worth writing them up clearly and getting feedback from others (e.g., by posting to places like the Alignment Forum).

Measurements and benchmarks

There is currently no clear measure of “success” or even “progress” for AI alignment, and it is incredibly difficult to understand how much meaningful progress we are making.

A good proposal for such a metric that covers the engineering, ethical, and/or governmental difficulties of superalignment could be very useful in:

Ensuring we work on what’s important, and
Increasing the field’s motivation and tractability.

For example, a high-quality “MMLU for Alignment” would greatly contribute to alignment research.

Dipping your toes in mechanistic interpretability
