kyegomez / Python-Package-Template
An easy, reliable, fluid template for Python packages, complete with docs, testing suites, READMEs, GitHub workflows, linting, and much more
☆178Updated 2 months ago
Alternatives and similar repositories for Python-Package-Template
Users that are interested in Python-Package-Template are comparing it to the libraries listed below
- LoRA and DoRA from Scratch Implementations☆204Updated last year
- Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".☆127Updated this week
- ☆178Updated 6 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆55Updated last year
- An extension of the nanoGPT repository for training small MOE models.☆152Updated 3 months ago
- Open-source AlphaEvolve☆64Updated last month
- Efficient LLM Inference over Long Sequences☆378Updated 3 weeks ago
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch☆521Updated last month
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI☆285Updated 3 weeks ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆171Updated 2 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"☆554Updated 6 months ago
- PyTorch implementation of models from the Zamba2 series.☆182Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆198Updated 11 months ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public☆85Updated last week
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind☆127Updated 10 months ago
- ☆174Updated 5 months ago
- Working implementation of DeepSeek MLA☆42Updated 5 months ago
- Get down and dirty with FlashAttention 2.0 in PyTorch: plug and play, no complex CUDA kernels☆106Updated last year
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models☆317Updated 4 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"☆239Updated 4 months ago
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793☆422Updated last month
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆158Updated 2 months ago
- minimal GRPO implementation from scratch☆90Updated 3 months ago
- Awesome list of papers that extend Mamba to various applications.☆133Updated 2 weeks ago
- Collection of autoregressive model implementations☆85Updated 2 months ago
- ☆190Updated last week
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆168Updated last month
- ☆286Updated 2 months ago
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆98Updated 8 months ago
- Data preparation code for Amber 7B LLM☆91Updated last year