JerryYin777 / Cross-Layer-Attention

Self-reproduction code for the paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL).
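For context, the paper's core idea is to let some decoder layers reuse the key/value activations computed by an earlier layer, so only a subset of layers writes to the KV cache. Below is a minimal, hypothetical PyTorch sketch of that cross-layer KV sharing; it is not the repository's actual code, and names such as SharedKVAttention and owns_kv are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKVAttention(nn.Module):
    """Attention layer that either produces its own K/V or reuses
    K/V from an earlier layer (the cross-layer attention idea).
    Illustrative sketch only, not the repo's implementation."""

    def __init__(self, d_model: int, n_heads: int, owns_kv: bool):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        # Only KV-producing layers carry K/V projections; reusing
        # layers skip them, which is what shrinks the KV cache.
        self.kv_proj = nn.Linear(d_model, 2 * d_model, bias=False) if owns_kv else None
        self.out_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, shared_kv=None):
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        if self.kv_proj is not None:
            k, v = self.kv_proj(x).chunk(2, dim=-1)
            k = k.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
            v = v.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
            shared_kv = (k, v)  # cached once, handed to the next layer
        else:
            k, v = shared_kv    # cross-layer reuse: no new KV cache entry
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(y), shared_kv
```

In the paper's CLA-2 configuration, each KV-producing layer is paired with one reusing layer, roughly halving KV-cache memory relative to a baseline where every layer stores its own keys and values.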

Alternatives and similar repositories for Cross-Layer-Attention:

Users interested in Cross-Layer-Attention are comparing it to the libraries listed below.