cg123 / bitnet

Modeling code for a BitNet b1.58 Llama-style model.

☆25

Alternatives and similar repositories for bitnet

Users who are interested in bitnet are comparing it to the libraries listed below.

OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆103 Updated 3 months ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆94 Updated 8 months ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31 Updated last year
arcee-ai / DAM
☆53 Updated 8 months ago
RWKV / ZeroCoT
https://x.com/BlinkDL_AI/status/1884768989743882276
☆28 Updated 3 months ago
QuixiAI / grokadamw
☆134 Updated 11 months ago
euclaise / supertrainer2000
☆49 Updated last year
Zyphra / Zyda_processing
☆37 Updated last year
joey00072 / ohara
Collection of autoregressive model implementation
☆86 Updated 3 months ago
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46 Updated last year
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78 Updated last year
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆38 Updated last month
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆140 Updated 5 months ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆127 Updated 8 months ago
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆68 Updated 3 months ago
chu-tianxiang / QuIP-for-all
QuIP quantization
☆54 Updated last year
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.
☆33 Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆198 Updated last year
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆63 Updated 2 years ago
samchaineau / llm_slerp_generation
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆36 Updated 2 weeks ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173 Updated 6 months ago
RobertCsordas / moeut
☆83 Updated 11 months ago
minyoungg / LTE
☆68 Updated last year
LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆56 Updated 11 months ago
nanowell / Q-Sparse-LLM
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆33 Updated 11 months ago
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98 Updated 10 months ago
tval2 / contextual-pruning
Library to facilitate pruning of LLMs based on context
☆32 Updated last year
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆26 Updated 9 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 5 months ago