kyegomez / BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch

☆1,880

Alternatives and similar repositories for BitNet

Users who are interested in BitNet are comparing it to the libraries listed below.

Beomi / BitNet-Transformers
0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" i…
☆307 Updated last year
SakanaAI / evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
☆1,368 Updated 10 months ago
jiaweizzhao / GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
☆1,610 Updated 11 months ago
Vahe1994 / AQLM
Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p…
☆1,296 Updated 2 months ago
AnswerDotAI / fsdp_qlora
Training LLMs with QLoRA + FSDP
☆1,527 Updated 11 months ago
huggingface / optimum-nvidia
☆1,004 Updated 8 months ago
myshell-ai / JetMoE
Reaching LLaMA2 Performance with 0.1M Dollars
☆985 Updated last year
redotvideo / mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
☆932 Updated last year
mobiusml / hqq
Official implementation of Half-Quadratic Quantization (HQQ)
☆881 Updated last month
pytorch / ao
PyTorch native quantization and sparsity for training and inference
☆2,392 Updated this week
intel / intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl…
☆2,166 Updated last year
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆915 Updated 5 months ago
likejazz / llama3.np
llama3.np is a pure NumPy implementation for Llama 3 model.
☆990 Updated 5 months ago
MDK8888 / GPTFast
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
☆686 Updated last year
ridgerchu / matmulfreellm
Implementation for MatMul-free LM.
☆3,032 Updated 2 months ago
mit-han-lab / llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
☆3,289 Updated 2 months ago
meta-pytorch / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,112 Updated last month
AI-Hypercomputer / maxtext
A simple, performant and scalable Jax LLM!
☆1,923 Updated this week
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆4,504 Updated this week
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,399 Updated last year
Lightning-AI / lightning-thunder
PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily wri…
☆1,413 Updated this week
meta-pytorch / torchtune
PyTorch native post-training library
☆5,523 Updated this week
microsoft / VPTQ
VPTQ, A Flexible and Extreme low-bit quantization algorithm
☆657 Updated 5 months ago
huggingface / optimum-quanto
A pytorch quantization backend for optimum
☆991 Updated last month
OpenGVLab / OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
☆853 Updated 4 months ago
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,352 Updated 3 weeks ago
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,217 Updated 4 months ago
casper-hansen / AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
☆2,254 Updated 5 months ago
mit-han-lab / TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
☆897 Updated last year
Cornell-RelaxML / quip-sharp
☆558 Updated 11 months ago