RUCAIBox / QuantizedEmpirical
☆14Updated last year
Alternatives and similar repositories for QuantizedEmpirical:
Users interested in QuantizedEmpirical are comparing it to the repositories listed below:
- Implementation of "Decoding-time Realignment of Language Models", ICML 2024.☆18Updated 9 months ago
- Codebase for Instruction Following without Instruction Tuning☆33Updated 6 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆40Updated 5 months ago
- The code and data for the paper JiuZhang3.0☆43Updated 10 months ago
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆34Updated 4 months ago
- ☆13Updated 5 months ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang☆14Updated last year
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆24Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆48Updated 2 years ago
- Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆36Updated last week
- Automatic prompt optimization framework for multi-step agent tasks.☆28Updated 4 months ago
- Official Code Repository for [AutoScale–Automatic Prediction of Compute-optimal Data Compositions for Training LLMs]☆12Updated 2 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆30Updated 10 months ago
- ☆13Updated 4 months ago
- LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification☆42Updated last month
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)☆36Updated 11 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆31Updated last month
- Towards Systematic Measurement for Long Text Quality☆34Updated 6 months ago
- ☆34Updated last year
- ☆14Updated 2 years ago
- ☆17Updated 4 months ago
- ☆17Updated last month
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆47Updated last month
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆76Updated last year
- ☆30Updated 6 months ago
- ☆16Updated 8 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Updated last year
- ☆36Updated 6 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆24Updated 3 months ago
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement)☆48Updated last year