Rethinking Transformers Through Duality Principles


Financing: Internal Funding KU Leuven

Project reference no.: C14/24/103
Start: 2024-10-01
End: 2028-09-30

Description:

The AI renaissance has been propelled by groundbreaking deep learning models, with the transformer architecture and its attention mechanism at the forefront. Transformers are the foundation of well-known models such as ChatGPT and BERT, and have also achieved less publicized successes in computer vision, robotics, and other fields. Our goal is to develop a framework based on generalized duality to understand and improve the performance and efficiency of transformer architectures and their associated training procedures. This framework will improve current transformer architectures, apply to other deep learning models, and may even motivate entirely new architectures. We aim to unlock the full potential of transformers by making them more accessible to a wider range of users and applications, thereby democratizing the benefits of this technology, reducing dependence on immense computational resources and data, and fostering broader innovation across the many domains of AI research.
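For reference, the attention mechanism at the core of the transformer is, in its standard form, softmax(Q K^T / sqrt(d_k)) V (Vaswani et al., 2017). The minimal NumPy sketch below illustrates that textbook operation only; it does not reflect the project's duality-based formulation, and the names Q, K, V are simply the conventional query, key, and value matrices.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Standard attention: softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
        scores -= scores.max(axis=-1, keepdims=True)    # subtract row max for numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V                              # attention-weighted combination of values

    # Example: 4 query tokens, 6 key/value tokens, 8-dimensional embeddings
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
    out = scaled_dot_product_attention(Q, K, V)         # shape (4, 8)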

SMC people involved in the project: