☆79 · Nov 19, 2024 · Updated last year
Alternatives and similar repositories for inference_scaling
Users that are interested in inference_scaling are comparing it to the libraries listed below.
- Official Code Repository for [AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*… ☆13 · Aug 8, 2025 · Updated 8 months ago
- ☆21 · Jun 27, 2024 · Updated last year
- ☆109 · Jul 15, 2025 · Updated 9 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ☆286 · May 26, 2024 · Updated last year
- ☆15 · Mar 20, 2025 · Updated last year
- Automatic prompt optimization framework for multi-step agent tasks. ☆37 · Nov 12, 2024 · Updated last year
- ☆43 · Dec 14, 2024 · Updated last year
- Official Implementation for [ICLR26] DefensiveKV: Taming the Fragility of KV Cache Eviction in LLM Inference ☆43 · Mar 28, 2026 · Updated 3 weeks ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆27 · Dec 23, 2024 · Updated last year
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆154 · Jul 12, 2024 · Updated last year
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs". ☆16 · Sep 15, 2024 · Updated last year
- A series of technical report on Slow Thinking with LLM ☆764 · Aug 13, 2025 · Updated 8 months ago
- A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low a… ☆28 · Feb 14, 2025 · Updated last year
- ☆24 · Dec 9, 2024 · Updated last year
- ☆20 · Nov 3, 2024 · Updated last year
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆18 · Jul 10, 2025 · Updated 9 months ago
- [TMLR] Process Reward Models That Think ☆85 · Nov 29, 2025 · Updated 4 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- ☆23 · Mar 7, 2025 · Updated last year
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability ☆14 · Mar 11, 2025 · Updated last year
- Repo of paper "Free Process Rewards without Process Labels" ☆172 · Mar 14, 2025 · Updated last year
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆21 · May 2, 2024 · Updated last year
- ☆52 · Mar 17, 2025 · Updated last year
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆266 · May 5, 2025 · Updated 11 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆153 · Feb 3, 2025 · Updated last year
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆121 · Dec 10, 2024 · Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆329 · Jan 29, 2026 · Updated 2 months ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 · Apr 8, 2024 · Updated 2 years ago
- [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning ☆50 · Oct 11, 2024 · Updated last year
- ☆971 · Jan 23, 2025 · Updated last year
- ☆27 · May 30, 2025 · Updated 10 months ago
- ☆112 · Sep 25, 2024 · Updated last year
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025) ☆58 · Oct 10, 2025 · Updated 6 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight) ☆189 · Feb 17, 2025 · Updated last year
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆133 · Jan 31, 2026 · Updated 2 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 · May 21, 2025 · Updated 10 months ago
- [ICML 2024] One Prompt is Not Enough: Automated Construction of a Mixture-of-Expert Prompts - TurningPoint AI ☆31 · Sep 25, 2024 · Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆54 · Oct 29, 2024 · Updated last year
- Uncertainty quantification for in-context learning of large language models ☆15 · Apr 1, 2024 · Updated 2 years ago