kyegomez / Python-Package-Template
An easy, reliable, fluid template for Python packages, complete with docs, testing suites, READMEs, GitHub workflows, linting, and much more
☆141 · Updated this week
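As a rough illustration of what such a template typically scaffolds, a generated project pairs a small source module with a matching pytest test. The sketch below is an assumption for illustration only; the file names and contents are not taken from this repository.

```python
# Illustrative sketch only: names and layout are assumptions, not copied from
# Python-Package-Template. A scaffolded project typically pairs a small source
# module (e.g. src/<package>/core.py) with a pytest test (tests/test_core.py).

def add(a: int, b: int) -> int:
    """Tiny example function that the generated test suite would cover."""
    return a + b


def test_add() -> None:
    # With a typical testing setup, running `pytest` from the project root
    # would discover and execute this test.
    assert add(2, 3) == 5
```

In a template like this, the same `pytest` and lint commands are usually wired into the GitHub workflows so CI mirrors the local checks.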
Related projects
Alternatives and complementary repositories for Python-Package-Template
- Implementation of Infini-Transformer in Pytorch ☆104 · Updated last month
- LoRA and DoRA from Scratch Implementations ☆188 · Updated 8 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆134 · Updated this week
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆49 · Updated 7 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆104 · Updated last month
- Implementation of the Llama architecture with RLHF + Q-learning ☆156 · Updated 10 months ago
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆234 · Updated this week
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze… ☆83 · Updated this week
- Griffin MQA + Hawk Linear RNN Hybrid ☆85 · Updated 6 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code) ☆133 · Updated last month
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆141 · Updated this week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆474 · Updated 2 weeks ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆167 · Updated this week
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆117 · Updated 3 months ago
- Repository for StripedHyena, a state-of-the-art beyond Transformer architecture ☆267 · Updated 8 months ago
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind ☆111 · Updated 2 months ago
- Understand and test language model architectures on synthetic tasks. ☆161 · Updated 6 months ago
- Build high-performance AI models with modular building blocks ☆412 · Updated this week
- Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆169 · Updated this week
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆120 · Updated 2 weeks ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 · Updated 6 months ago
- Awesome list of papers that extend Mamba to various applications. ☆127 · Updated last month
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆212 · Updated 2 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆202 · Updated last week
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆108 · Updated last week
- Code for Adam-mini: Use Fewer Learning Rates To Gain More (https://arxiv.org/abs/2406.16793) ☆322 · Updated last week
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in Pytorch ☆291 · Updated 4 months ago