tigerchen52 / awesome_role_of_small_models
a curated list of the role of small models in the LLM era
☆99 Updated 7 months ago
Alternatives and similar repositories for awesome_role_of_small_models:
Users who are interested in awesome_role_of_small_models are comparing it to the repositories listed below
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆120 Updated 8 months ago
- Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models ☆105 Updated this week
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆134 Updated 6 months ago
- Official repository for paper "ReasonIR: Training Retrievers for Reasoning Tasks". ☆112 Updated last week
- This is the official repository for Inheritune. ☆111 Updated 2 months ago
- Official Implementation of "Reasoning Language Models: A Blueprint" ☆58 Updated 2 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆68 Updated 6 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆35 Updated 2 months ago
- ☆36 Updated 3 months ago
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆244 Updated 2 weeks ago
- Code implementation of synthetic continued pretraining ☆107 Updated 4 months ago
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆133 Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆85 Updated last year
- ☆147 Updated last year
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆106 Updated 6 months ago
- ☆16 Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆90 Updated 3 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆139 Updated 6 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆150 Updated last year
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated 2 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆64 Updated 9 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆145 Updated 2 months ago
- FuseAI Project ☆85 Updated 3 months ago
- Official implementation for "Extending LLMs' Context Window with 100 Samples" ☆77 Updated last year
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆50 Updated 2 months ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆60 Updated 3 months ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning ☆79 Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆150 Updated 8 months ago
- ☆35 Updated 6 months ago
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆181 Updated last year