[ICML 2026] Reasoning in Parallelism via Self-Distilled RL
☆110Feb 5, 2026Updated 3 months ago
Alternatives and similar repositories for Native-Parallel-Reasoner
Users that are interested in Native-Parallel-Reasoner are comparing it to the libraries listed below.
- The official repo for "LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling"☆108May 15, 2026Updated last week
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones☆67Jan 26, 2026Updated 3 months ago
- Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours☆329Updated this week
- [ICLR 2024 Spotlight] Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Communi…☆12Mar 29, 2024Updated 2 years ago
- Official implementation of our paper "Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration".☆14Nov 18, 2024Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆43Mar 31, 2025Updated last year
- ☆88Apr 28, 2026Updated 3 weeks ago
- Infrastructure as Code for MCP access management☆36May 6, 2026Updated 2 weeks ago
- PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning☆335Feb 5, 2026Updated 3 months ago
- [ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling☆38Feb 25, 2026Updated 2 months ago
- [IEEE TNSRE] Mixture of Experts for EEG-Based Seizure Subtype Classification☆12Aug 20, 2024Updated last year
- Large language models designed for formal theorem proving through tool-integrated reasoning.☆34Aug 13, 2025Updated 9 months ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 8 months ago
- P1: Mastering Physics Olympiads with Reinforcement Learning☆84Dec 29, 2025Updated 4 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best…☆62Mar 17, 2025Updated last year
- [ICML 2026] Transform Trained Transformer for Accelerating Native 4K Video Generation☆39Dec 16, 2025Updated 5 months ago
- Internal utility libraries for Pkl☆16May 14, 2026Updated last week
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.☆21Mar 31, 2025Updated last year
- ☆192Dec 18, 2025Updated 5 months ago
- Mastering GPU programming step by step☆40May 15, 2026Updated last week
- A Structured Output Benchmark whose 'ground-truth' is actually right☆19Dec 5, 2025Updated 5 months ago
- Multilingual and Multiculture Benchmark and LLM☆36Updated this week
- Financial Services Interest Group☆53Jan 14, 2026Updated 4 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards☆36Oct 3, 2025Updated 7 months ago
- Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping…☆93Jan 29, 2026Updated 3 months ago
- Rethinking the Trust Region in LLM Reinforcement Learning☆54Mar 2, 2026Updated 2 months ago
- ☆22Dec 3, 2025Updated 5 months ago
- ☆18Nov 25, 2023Updated 2 years ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆79Dec 8, 2025Updated 5 months ago
- Official Implementation of wd1☆29Sep 25, 2025Updated 7 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated last year
- ☆44Apr 28, 2026Updated 3 weeks ago
- Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language mo…☆19Mar 19, 2025Updated last year
- ☆36Oct 23, 2025Updated 6 months ago
- Python library to add support for embedding natural code in Python with shared program state.☆30Jan 20, 2026Updated 4 months ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆50Mar 31, 2026Updated last month
- ☆47Sep 8, 2025Updated 8 months ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"☆104Apr 21, 2026Updated last month
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆125May 19, 2025Updated last year