What is the Attention Mechanism?
The attention mechanism lets a model "pay attention" to the most relevant words in the input when processing each word. For example, when reading "bank", the model uses surrounding context to decide whether it means "riverbank" or "financial institution". Concretely, each word's representation is updated as a weighted sum of the other words' representations, where the weights reflect relevance. This is the core operation behind Transformers.
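A minimal NumPy sketch of scaled dot-product attention, the standard form used in Transformers: each query is compared against all keys, the similarities are turned into weights via softmax, and the output is a weighted sum of values. The function name and toy data below are illustrative, not from any specific library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over keys: each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy self-attention over 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)  # self-attention: Q = K = V = X
print(out.shape)  # each token gets a context-aware vector of the same size
```

In self-attention (as above), queries, keys, and values all come from the same sequence, which is how a token like "bank" can draw on its neighbors to disambiguate itself.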