Experimental BitNet Implementation
☆74 · Nov 27, 2025 · Updated 3 months ago
Alternatives and similar repositories for 1.58BitNet
Users who are interested in 1.58BitNet are comparing it to the libraries listed below.
- ☆70 · Mar 1, 2024 · Updated 2 years ago
- Train your own small bitnet model ☆78 · Oct 20, 2024 · Updated last year
- A simple and minimal open source implementation of "Introducing LFM2: The Fastest On-Device Foundation Models on the Market" from Liquid … ☆23 · Feb 9, 2026 · Updated 3 weeks ago
- ☆17 · Feb 29, 2024 · Updated 2 years ago
- 0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" i… ☆312 · Mar 17, 2024 · Updated last year
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch ☆1,896 · Feb 6, 2026 · Updated 3 weeks ago
- ☆29 · Feb 27, 2024 · Updated 2 years ago
- Fork of Flame repo for training of some new stuff in development ☆19 · Updated this week
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Oct 15, 2024 · Updated last year
- Fast and memory-efficient exact attention ☆29 · Dec 2, 2024 · Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆26 · Feb 9, 2026 · Updated 3 weeks ago
- Implementation for robust ViT and scaled attention ☆21 · Apr 4, 2025 · Updated 11 months ago
- Simple LLM inference server ☆20 · Jun 13, 2024 · Updated last year
- LLM checkpointing for DeepSpeed/Megatron ☆25 · Nov 30, 2025 · Updated 3 months ago
- ☆24 · Sep 25, 2024 · Updated last year
- ☆27 · May 3, 2024 · Updated last year
- ☆27 · Jul 11, 2024 · Updated last year
- Parallel LSTM from the paper "Were RNNs All We Needed?" ☆29 · Oct 11, 2024 · Updated last year
- quick playground to animate pippin ☆14 · Nov 11, 2024 · Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆58 · Feb 9, 2026 · Updated 3 weeks ago
- Universal Neurons in GPT2 Language Models ☆30 · May 28, 2024 · Updated last year
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335) ☆29 · Mar 1, 2024 · Updated 2 years ago
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity" ☆32 · May 28, 2025 · Updated 9 months ago
- ☆34 · Sep 10, 2024 · Updated last year
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 · Jul 8, 2024 · Updated last year
- Let's create synthetic textbooks together :) ☆76 · Jan 29, 2024 · Updated 2 years ago
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆88 · Sep 12, 2025 · Updated 5 months ago
- Best Movie App with Ionic 4 using The Movie DB API ☆16 · May 24, 2019 · Updated 6 years ago
- Scrape xnxx[dot]com with python. ☆10 · Dec 30, 2020 · Updated 5 years ago
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆36 · Jun 7, 2024 · Updated last year
- Spring 2021 Short Course materials ☆12 · Dec 2, 2024 · Updated last year
- Convolutional Channel-wise Competitive Learning for the Forward-Forward Algorithm. AAAI 2024 ☆11 · Jun 27, 2024 · Updated last year
- A blueprint for next-gen AI. Project Infinity uses a token-efficient, Codified Agent Protocol to create specialized, secure, and imaginat… ☆25 · Oct 2, 2025 · Updated 5 months ago
- Multi-Layer Key-Value sharing experiments on Pythia models ☆34 · Jun 14, 2024 · Updated last year
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆43 · Nov 9, 2023 · Updated 2 years ago
- Collection of autoregressive model implementation ☆85 · Feb 23, 2026 · Updated last week
- Stop messing around with finicky sampling parameters and just use DRµGS! ☆360 · Jun 1, 2024 · Updated last year
- Extract streaming data from text using prefix completion. ☆10 · Oct 6, 2024 · Updated last year
- Discord Docsbot, Built on bgent ☆11 · Jun 17, 2024 · Updated last year