cg123 / bitnet

Modeling code for a BitNet b1.58 Llama-style model.
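For context, BitNet b1.58 constrains each weight to the ternary set {-1, 0, +1} using absmean scaling: divide by the mean absolute value of the weight tensor, then round and clip. A minimal sketch of that quantization step, following the b1.58 formulation (the function name and NumPy framing are illustrative, not taken from this repo):

```python
import numpy as np

def absmean_quantize(w: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Quantize a weight tensor to ternary values, BitNet b1.58 style.

    Each output element is one of {-gamma, 0, +gamma}, where gamma is the
    per-tensor absmean scale, so the underlying codes are {-1, 0, +1}.
    """
    gamma = np.abs(w).mean()                   # per-tensor absmean scale
    codes = np.clip(np.round(w / (gamma + eps)), -1, 1)  # ternary codes
    return codes * gamma                       # dequantized ternary weights
```

In training, this step is typically paired with a straight-through estimator so gradients flow through the rounding; the sketch shows only the forward quantization.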
Related projects:
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
- A set of scripts to finetune LLMs.
- Low-rank adapter extraction for fine-tuned transformer models.
- Repository for the paper "Stream of Search: Learning to Search in Language".
- Tree Attention: topology-aware decoding for long-context attention on GPU clusters.
- Spherical merging of PyTorch/HF-format language models with minimal feature loss.
- An implementation of Self-Extend, expanding the context window via grouped attention.
- Parameter-Efficient Sparsity Crafting: from dense to mixture-of-experts for instruction tuning on general tasks.
- Official repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models".
- A repository for research on medium-sized language models.
- Small and efficient mathematical reasoning LLMs.
- Prune transformer layers.
- A single repo with all scripts and utils to train/fine-tune the Mamba model, with or without FIM.
- Unofficial implementation of Evolutionary Model Merging.
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.
- Just a bunch of benchmark logs for different LLMs.
- QuIP quantization.
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention".
- GPT-2 small trained on phi-like data.