NVlabs / hymba
☆58 · Updated this week
Alternatives and similar repositories for hymba:
Users interested in hymba are comparing it to the libraries listed below.
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆93 · Updated last month
- PyTorch implementation of models from the Zamba2 series. ☆159 · Updated this week
- Collection of autoregressive model implementations ☆67 · Updated this week
- PyTorch implementation of the PEER block from the paper "Mixture of a Million Experts" by Xu Owen He at DeepMind ☆112 · Updated 3 months ago
- ☆51 · Updated last month
- Implementation of Infini-Transformer in PyTorch ☆104 · Updated last month
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding" (ACL 2024) ☆233 · Updated this week
- ☆39 · Updated 10 months ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆85 · Updated 3 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code) ☆135 · Updated last month
- Official implementation of Phi-Mamba, a MOHAWK-distilled model ("Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models") ☆79 · Updated 2 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU Clusters ☆104 · Updated 2 months ago