What is AI Alignment?

The research field of ensuring AI behavior aligns with human intentions and values. Misaligned AI might ‘follow instructions but do the wrong thing’ — like being asked to increase website traffic and launching a DDoS attack. RLHF and Constitutional AI are current mainstream alignment methods.