☆33 · Jan 7, 2025 · Updated last year
Alternatives and similar repositories for reasoning_generalization
Users that are interested in reasoning_generalization are comparing it to the libraries listed below.
- [NeurIPS 2025] Bag of Tricks for Inference-time Computation of LLM Reasoning ☆16 · Sep 20, 2025 · Updated 8 months ago
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs? ☆20 · Mar 9, 2025 · Updated last year
- Understanding deep networks and large models. ☆28 · Jan 23, 2026 · Updated 4 months ago
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones ☆67 · Jan 26, 2026 · Updated 4 months ago
- ☆19 · Mar 25, 2025 · Updated last year
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) ☆34 · Sep 28, 2025 · Updated 7 months ago
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆44 · Sep 18, 2025 · Updated 8 months ago
- [NeurIPS 2025 Datasets & Benchmarks Track] The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models ☆37 · Oct 26, 2025 · Updated 7 months ago
- ☆17 · May 14, 2026 · Updated last week
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆25 · Sep 13, 2024 · Updated last year
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis" ☆20 · Jun 12, 2025 · Updated 11 months ago
- ☆39 · Nov 18, 2025 · Updated 6 months ago
- This is code to accompany the paper "Accelerating Exploration with Unlabeled Prior Data". ☆25 · Dec 5, 2023 · Updated 2 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation ☆12 · May 24, 2021 · Updated 5 years ago
- ☆11 · Oct 25, 2024 · Updated last year
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL" ☆35 · Nov 1, 2025 · Updated 6 months ago
- [CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang ☆15 · Jan 5, 2024 · Updated 2 years ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion" ☆14 · Mar 28, 2024 · Updated 2 years ago
- Efficient Scaling laws and collaborative pretraining. ☆22 · Sep 18, 2025 · Updated 8 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆81 · Mar 1, 2025 · Updated last year
- ☆15 · Jun 25, 2025 · Updated 11 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆122 · Dec 10, 2024 · Updated last year
- Code for paper "Towards Efficient Pareto Set Approximation via Weight-Ensembling Mixture of Experts" ☆11 · Sep 13, 2024 · Updated last year
- KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality ☆45 · May 19, 2026 · Updated last week
- The code implementation of MuScleLoRA (Accepted in ACL 2024) ☆10 · Dec 1, 2024 · Updated last year
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework ☆11 · May 1, 2026 · Updated 3 weeks ago
- GenRM-CoT: Data release for verification rationales ☆68 · Oct 16, 2024 · Updated last year
- Procedural data generators suite for synthetic pretraining and formal reasoning ☆40 · Updated this week
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 4 months ago
- Few-Shot Relation Extraction with AllenNLP ☆12 · Jan 27, 2019 · Updated 7 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆33 · Jun 20, 2023 · Updated 2 years ago
- Repo of paper "Free Process Rewards without Process Labels" ☆171 · Mar 14, 2025 · Updated last year
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆238 · Jul 19, 2025 · Updated 10 months ago
- ☆43 · Jan 15, 2025 · Updated last year
- ☆26 · Jun 10, 2025 · Updated 11 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆126 · May 6, 2025 · Updated last year
- ☆32 · Nov 30, 2025 · Updated 5 months ago