FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
☆51 · Aug 24, 2025 · Updated 6 months ago
Alternatives and similar repositories for FBI-LLM
Users who are interested in FBI-LLM are comparing it to the repositories listed below.
- The official implementation of Bi-Mamba ☆14 · Oct 22, 2025 · Updated 4 months ago
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆39 · Sep 12, 2024 · Updated last year
- ☆120 · Jan 8, 2026 · Updated last month
- Information Bottleneck in DNN with PyTorch ☆15 · Jul 6, 2023 · Updated 2 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 29, 2025 · Updated 11 months ago
- ☆11 · Apr 3, 2023 · Updated 2 years ago
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆328 · Nov 26, 2025 · Updated 3 months ago
- World's Smallest Vision-Language Model ☆33 · Apr 7, 2024 · Updated last year
- ☆12 · May 22, 2022 · Updated 3 years ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆33 · May 9, 2024 · Updated last year
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs ☆15 · Jul 18, 2024 · Updated last year
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark ☆34 · Jun 24, 2025 · Updated 8 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆60 · Oct 31, 2024 · Updated last year
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆56 · Mar 4, 2024 · Updated 2 years ago
- ☆67 · Mar 30, 2025 · Updated 11 months ago
- ☆35 · Dec 22, 2025 · Updated 2 months ago
- Structured Binary Neural Networks for Image Recognition ☆18 · Nov 18, 2021 · Updated 4 years ago
- BESA is a differentiable weight pruning technique for large language models. ☆17 · Mar 4, 2024 · Updated 2 years ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆37 · Oct 9, 2025 · Updated 4 months ago
- GRadient-INformed MoE ☆264 · Sep 25, 2024 · Updated last year
- BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models ☆37 · Feb 4, 2024 · Updated 2 years ago
- Some tools for neuronal image analysis ☆41 · Updated this week
- ☆14 · Jun 4, 2024 · Updated last year
- Official PyTorch Implementation of Paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models" ☆20 · Feb 21, 2025 · Updated last year
- The predecessor of CiteLab. ☆18 · Feb 3, 2026 · Updated last month
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆67 · Mar 27, 2025 · Updated 11 months ago
- The official implementation of BiViT: Extremely Compressed Binary Vision Transformers ☆16 · Jun 18, 2023 · Updated 2 years ago
- ☆21 · Mar 7, 2024 · Updated last year
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs ☆228 · Jan 11, 2025 · Updated last year
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24) ☆31 · Jul 4, 2024 · Updated last year
- ☆20 · Mar 6, 2022 · Updated 4 years ago
- ☆52 · Jul 18, 2024 · Updated last year
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆122 · Jul 4, 2025 · Updated 8 months ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆133 · May 16, 2024 · Updated last year
- [ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks ☆39 · Feb 4, 2025 · Updated last year
- ☆43 · Jul 10, 2024 · Updated last year
- Low-Rank Llama Custom Training ☆23 · Mar 27, 2024 · Updated last year
- Generative Modeling with Bayesian Sample Inference ☆24 · May 17, 2025 · Updated 9 months ago
- Elucidated Dataset Condensation (NeurIPS 2024) ☆20 · Oct 5, 2024 · Updated last year