jaketae / alibi

PyTorch implementation of "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation"
25 · Updated 2 years ago

Related projects: