tigerchen52 / awesome_role_of_small_models
A curated list on the role of small models in the LLM era
☆111 · Sep 23, 2024 · Updated last year
Alternatives and similar repositories for awesome_role_of_small_models
Users who are interested in awesome_role_of_small_models compare it to the repositories listed below.
- ☆16 · Jul 23, 2024 · Updated last year
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 · Sep 22, 2025 · Updated 4 months ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 PyTorch Code) ☆17 · May 15, 2025 · Updated 9 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆17 · Oct 17, 2025 · Updated 3 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆19 · Mar 10, 2025 · Updated 11 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆35 · Oct 3, 2024 · Updated last year
- Open deep learning compiler stack for CPU, GPU and specialized accelerators ☆19 · Updated this week
- ☆25 · Dec 13, 2024 · Updated last year
- The source code of [WWW 2025] MoDiCF ☆12 · Jul 12, 2025 · Updated 7 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆20 · Nov 21, 2024 · Updated last year
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆40 · Jul 21, 2025 · Updated 6 months ago
- Vocabulary Parallelism ☆25 · Mar 10, 2025 · Updated 11 months ago
- ☆14 · Mar 20, 2025 · Updated 10 months ago
- ☆14 · Feb 2, 2025 · Updated last year
- ☆22 · Dec 17, 2024 · Updated last year
- Fantastic Data Engineering for Large Language Models ☆93 · Dec 29, 2024 · Updated last year
- NeurIPS 2024 tutorial on LLM Inference ☆47 · Dec 10, 2024 · Updated last year
- ☆16 · Sep 4, 2025 · Updated 5 months ago
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025) ☆11 · Apr 18, 2025 · Updated 9 months ago
- ☆23 · Sep 19, 2024 · Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆90 · Jan 9, 2026 · Updated last month
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs ☆36 · Updated this week
- Formalizing Multimedia Recommendation through Multimodal Deep Learning, accepted in ACM Transactions on Recommender Systems. ☆19 · Jul 2, 2024 · Updated last year
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" ☆11 · Jan 10, 2025 · Updated last year
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆14 · Feb 4, 2025 · Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆157 · Apr 7, 2025 · Updated 10 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆251 · Mar 15, 2024 · Updated last year
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆59 · Mar 17, 2025 · Updated 10 months ago
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis" ☆35 · Jun 13, 2025 · Updated 8 months ago
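- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning. ☆14 · Apr 18, 2024 · Updated last year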
- ☆19 · Jun 4, 2025 · Updated 8 months ago
- ☆31 · Nov 18, 2025 · Updated 2 months ago
- ☆21 · Jul 21, 2025 · Updated 6 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆46 · Jul 24, 2025 · Updated 6 months ago
- ☆34 · Jul 23, 2024 · Updated last year
- ☆302 · Jul 10, 2025 · Updated 7 months ago
- Official implementation of "OpenCity3D: What do Vision-Language Models know about Urban Environments?" @ WACV2025 ☆16 · Nov 24, 2024 · Updated last year
- ☆21 · Jul 18, 2024 · Updated last year