[ICLR 2026 🔥] Dr.LLM: Dynamic Layer Routing in LLMs
☆52Apr 24, 2026Updated last month
Alternatives and similar repositories for dr-llm
Users that are interested in dr-llm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Multi-Agent LLM Evaluation Docs: https://maseval.readthedocs.io/☆34May 31, 2026Updated last week
- ☆11May 9, 2023Updated 3 years ago
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- Official implementation of Latent-SFT: teaching LLMs to reason with vocabulary-space latent chains.☆51May 18, 2026Updated 3 weeks ago
- LiteGPT: A 124M Small Language Model (SLM) pre-trained on FineWeb and fine-tuned on Alpaca.☆34Dec 16, 2025Updated 5 months ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- A tool for extracting plain text from Wikipedia dumps☆15Sep 13, 2018Updated 7 years ago
- All my experiments with the various transformers and various transformer frameworks available☆14Apr 30, 2021Updated 5 years ago
- ☆18Jul 24, 2023Updated 2 years ago
- Official implementation of "Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought" (NeurIPS 2025)☆40Oct 8, 2025Updated 8 months ago
- Source code of "Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers" EMNLP 2025☆17Jan 12, 2026Updated 4 months ago
- Code used to create the Linked WikiText-2 dataset☆16May 22, 2023Updated 3 years ago
- Source code for the paper "Do Deep Neural Network Solutions form a Star Domain?"☆12May 26, 2024Updated 2 years ago
- 2022龙芯杯个人赛三等奖作品☆14Oct 11, 2023Updated 2 years ago
- ☆28May 27, 2024Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- ☆10Nov 6, 2024Updated last year
- Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.☆21Apr 3, 2025Updated last year
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆23Feb 16, 2025Updated last year
- Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits☆45Jan 8, 2026Updated 5 months ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)☆48Dec 27, 2021Updated 4 years ago
- Token-free Language Modeling with ByGPT5 & Friends!☆12Jul 18, 2025Updated 10 months ago
- NSCSCC 2020 - Yet Another MIPS Processor☆14Aug 7, 2021Updated 4 years ago
- [KDD 2026] Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe☆33Aug 10, 2025Updated 10 months ago
- Tunisian Arabish Corpus☆12Mar 12, 2024Updated 2 years ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- A Large-Scale Dataset for Paraphrased Reading Comprehension☆15Jul 16, 2023Updated 2 years ago
- Source code of NAACL 2025 Findings "Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models"☆16Dec 16, 2025Updated 5 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"☆72Nov 14, 2024Updated last year
- Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"☆11Sep 20, 2024Updated last year
- [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression☆47Aug 7, 2025Updated 10 months ago
- Curated list of Moroccans publishing in the most prestigious AI conferences☆11Oct 14, 2024Updated last year
- Launch programs on multiple hosts. (多机启动程序)☆14Jul 4, 2023Updated 2 years ago
- NSCSCC “龙芯杯” 2024 个人赛 LoongArch 赛道三等奖☆18Aug 17, 2024Updated last year
- ☆19Jul 24, 2023Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code for NeurIPS'23 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning"☆17Jan 12, 2024Updated 2 years ago
- Source code of "TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification", ACL2024 (findings)☆14Nov 20, 2024Updated last year
- Diacritization of Arabic texts☆11Apr 13, 2016Updated 10 years ago
- LAReQA is a challenging benchmark for evaluating language agnostic answer retrieval from a multilingual candidate pool. This repository c…☆14May 19, 2020Updated 6 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆15Jun 23, 2024Updated last year
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos☆12Sep 24, 2024Updated last year
- ☆46Sep 30, 2025Updated 8 months ago