Attention Guidance: Guiding Attention for Self-Supervised Learning with Transformers

Date:

Presented our work on attention guidance, in which we use intuitive priors to shape the self-attention heads of Transformers, yielding faster convergence and better performance in self-supervised learning. [Slides]
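
For a concrete feel for the idea, here is a minimal sketch of one way such guidance can be implemented: an auxiliary loss that nudges each head's attention map toward a simple prior pattern (here, attending to nearby tokens). The prior pattern, the KL-based loss, and all names below are illustrative assumptions, not details taken from the talk or slides.

```python
# Minimal sketch of one possible attention-guidance mechanism (assumed form:
# an auxiliary loss pulling attention maps toward a simple local prior).
import torch
import torch.nn.functional as F


def local_prior(seq_len: int, window: int = 2) -> torch.Tensor:
    """Target pattern: each token attends uniformly to its +/- `window` neighbors."""
    idx = torch.arange(seq_len)
    mask = (idx[None, :] - idx[:, None]).abs() <= window
    prior = mask.float()
    return prior / prior.sum(dim=-1, keepdim=True)  # row-normalize to a distribution


def guidance_loss(attn: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
    """KL divergence between each head's attention map and the prior pattern.

    attn: (batch, heads, seq_len, seq_len) softmax-normalized attention weights.
    """
    prior = prior.to(attn.device).expand_as(attn)
    return F.kl_div(attn.clamp_min(1e-9).log(), prior, reduction="batchmean")


# Usage: add the guidance term to the self-supervised objective, typically
# weighted and annealed toward zero as training progresses (an assumption here).
batch, heads, seq_len = 8, 12, 64
attn = torch.softmax(torch.randn(batch, heads, seq_len, seq_len), dim=-1)
loss = guidance_loss(attn, local_prior(seq_len))
print(loss.item())
```

In this sketch the guidance acts only as a soft regularizer on the attention distributions, so the heads remain free to deviate from the prior once the main training signal takes over.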