SqueezeAILab / LLM2LLM
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
☆181Updated last year
Alternatives and similar repositories for LLM2LLM:
Users who are interested in LLM2LLM are comparing it to the repositories listed below
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆230Updated last month
- ☆264Updated 8 months ago
- Reformatted Alignment☆115Updated 6 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆119Updated 4 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆137Updated 5 months ago
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning☆218Updated 2 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆147Updated 6 months ago
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]☆545Updated 3 months ago
- Generative Judge for Evaluating Alignment☆232Updated last year
- Unofficial implementation of AlpaGasus☆90Updated last year
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆131Updated 9 months ago
- ☆142Updated 9 months ago
- ☆92Updated 3 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆248Updated last year
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models☆181Updated 5 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆65Updated 4 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆137Updated 9 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark☆133Updated 3 months ago
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs☆247Updated 3 months ago
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts☆298Updated 6 months ago
- Code implementation of synthetic continued pretraining☆97Updated 2 months ago
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23☆200Updated 9 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models☆162Updated 9 months ago
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context☆456Updated last year
- ☆312Updated 6 months ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning☆236Updated last year
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.☆167Updated 2 weeks ago
- Evaluating LLMs' multi-round chatting capability via assessing conversations generated by two LLM instances.☆148Updated last year
- Hammer: Robust Function-Calling for On-Device Language Models via Function Masking☆64Updated last month
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 9 months ago