Attention Guidance: Guiding Attention for Self-Supervised Learning with Transformers
Date:
Presented our work on attention guidance, in which we apply intuitive priors to modify self-attention heads in Transformers, yielding faster convergence and better downstream performance. [Slides]
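
To make the idea concrete, here is a minimal PyTorch sketch, not the exact formulation from the talk: it assumes one "intuitive prior" is a local-window attention pattern and that guidance is applied as an auxiliary KL penalty pulling a head's attention maps toward that prior. Function names (`local_window_prior`, `attention_guidance_loss`) and the prior choice are illustrative.

```python
import torch
import torch.nn.functional as F

def local_window_prior(seq_len: int, window: int = 3) -> torch.Tensor:
    """Build a row-normalized 'local' prior: each token attends to its +/- window neighbors."""
    idx = torch.arange(seq_len)
    mask = (idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= window   # (seq_len, seq_len)
    prior = mask.float()
    return prior / prior.sum(dim=-1, keepdim=True)

def attention_guidance_loss(attn: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
    """
    attn:  (batch, heads, seq_len, seq_len) softmax attention maps from one layer.
    prior: (seq_len, seq_len) target pattern for the guided heads.
    Returns a KL-style penalty nudging the attention maps toward the prior.
    """
    eps = 1e-8
    target = prior.to(attn.device).expand_as(attn)
    return F.kl_div((attn + eps).log(), target, reduction="batchmean")

# Hypothetical usage: add the guidance term to the main self-supervised objective,
# typically with a weight annealed toward zero as training progresses.
# loss = pretraining_loss + guidance_weight * attention_guidance_loss(attn_maps, prior)
```

In this sketch the prior only shapes the attention maps through an auxiliary loss, so the heads remain free to deviate from it once the main objective dominates.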