LiqunMa / FBI-LLM
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
☆51 · Aug 24, 2025 · Updated 5 months ago
Alternatives and similar repositories for FBI-LLM
Users who are interested in FBI-LLM are comparing it to the repositories listed below:
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆38 · Sep 12, 2024 · Updated last year
- Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration ☆29 · Nov 22, 2025 · Updated 2 months ago
- ☆119 · Jan 8, 2026 · Updated last month
- Information Bottleneck in DNN with PyTorch ☆15 · Jul 6, 2023 · Updated 2 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Oct 11, 2024 · Updated last year
- ☆11 · Apr 3, 2023 · Updated 2 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 29, 2025 · Updated 10 months ago
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆327 · Nov 26, 2025 · Updated 2 months ago
- ☆12 · May 22, 2022 · Updated 3 years ago
- ☆16 · Dec 9, 2023 · Updated 2 years ago
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark ☆34 · Jun 24, 2025 · Updated 7 months ago
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs ☆15 · Jul 18, 2024 · Updated last year
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆60 · Oct 31, 2024 · Updated last year
- ☆67 · Mar 30, 2025 · Updated 10 months ago
- ☆35 · Dec 22, 2025 · Updated last month
- Structured Binary Neural Networks for Image Recognition ☆18 · Nov 18, 2021 · Updated 4 years ago
- Official Pytorch Implementation of Paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models" ☆20 · Feb 21, 2025 · Updated 11 months ago
- BESA is a differentiable weight pruning technique for large language models. ☆17 · Mar 4, 2024 · Updated last year
- GRadient-INformed MoE ☆264 · Sep 25, 2024 · Updated last year
- ☆49 · Mar 14, 2025 · Updated 11 months ago
- The predecessor of CiteLab. ☆18 · Feb 3, 2026 · Updated last week
- ☆15 · Jun 4, 2024 · Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆67 · Mar 27, 2025 · Updated 10 months ago
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs ☆228 · Jan 11, 2025 · Updated last year
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24) ☆25 · Jul 4, 2024 · Updated last year
- ☆20 · Mar 6, 2022 · Updated 3 years ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆134 · May 16, 2024 · Updated last year
- [ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks ☆39 · Feb 4, 2025 · Updated last year
- Low-Rank Llama Custom Training ☆23 · Mar 27, 2024 · Updated last year
- Generative Modeling with Bayesian Sample Inference ☆24 · May 17, 2025 · Updated 8 months ago
- [COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models" ☆67 · Jul 8, 2025 · Updated 7 months ago
- A simple pytorch implementation of Differentiable Architecture Search (DARTS) ☆22 · Aug 27, 2019 · Updated 6 years ago
- [CVPR 2024] Official implementation for "A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network" ☆24 · Dec 5, 2025 · Updated 2 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 · Jul 24, 2025 · Updated 6 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆51 · Oct 31, 2024 · Updated last year
- This repo contains the code for studying the interplay between quantization and sparsity methods ☆26 · Feb 26, 2025 · Updated 11 months ago
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models ☆28 · Aug 5, 2025 · Updated 6 months ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration ☆20 · Jun 27, 2025 · Updated 7 months ago
- Official Implementation for "In-Context Reinforcement Learning for Variable Action Spaces" ☆91 · Feb 11, 2024 · Updated 2 years ago