nkotak / 1.58BitNet

Experimental BitNet Implementation

☆65

Alternatives and similar repositories for 1.58BitNet

Users who are interested in 1.58BitNet are comparing it to the libraries listed below.

rafacelente / bllama
1.58-bit LLaMa model
☆81 Updated last year
astramind-ai / BitMat
An efficient implementation of the method proposed in "The Era of 1-bit LLMs"
☆153 Updated 7 months ago
pranavjad / tinyllama-bitnet
Train your own small bitnet model
☆71 Updated 7 months ago
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆199 Updated 10 months ago
cognitivecomputations / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆238 Updated last year
arcee-ai / PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
☆236 Updated last year
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆171 Updated last year
thooton / muse
Let's create synthetic textbooks together :)
☆75 Updated last year
Entropy-xcy / bitnet158
☆69 Updated last year
qwopqwop200 / gptqlora
GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ
☆103 Updated 2 years ago
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
keeeeenw / MicroLlama
Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget
☆150 Updated last year
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆152 Updated last year
OpenGVLab / EfficientQAT
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
☆270 Updated last week
joey00072 / ohara
Collection of autoregressive model implementations
☆85 Updated last month
AlpinDale / QuIP-for-Llama
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models
☆35 Updated last year
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆66 Updated last year
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆275 Updated last year
cognitivecomputations / grokadamw
☆130 Updated 9 months ago
FasterDecoding / BitDelta
☆197 Updated 5 months ago
Digitous / LLM-SLERP-Merge
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
☆123 Updated last year
hydrallm / llama-moe-v1
☆95 Updated last year
kroggen / mamba.c
Inference of Mamba models in pure C
☆186 Updated last year
JD-P / minihf
MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…
☆172 Updated last week
cognitivecomputations / spectrum
☆121 Updated last month
chu-tianxiang / QuIP-for-all
QuIP quantization
☆52 Updated last year
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆189 Updated 11 months ago
flawedmatrix / mamba-ssm
Implementation of Mamba in Rust
☆85 Updated last year
Cornell-RelaxML / quip-sharp
☆536 Updated 7 months ago
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆99 Updated last year