FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
☆67Jan 26, 2026Updated 3 months ago
Alternatives and similar repositories for RL-Compositionality
Users who are interested in RL-Compositionality are comparing it to the libraries listed below.
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆17Feb 9, 2026Updated 3 months ago
- Code for "What really matters in matrix-whitening optimizers?"☆24Oct 31, 2025Updated 6 months ago
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆25Jul 1, 2025Updated 10 months ago
- [CVPR2024 highlight] Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching (G-VBSM)☆28Oct 9, 2024Updated last year
- ☆13Nov 21, 2025Updated 6 months ago
- Toolathlon-Gym for testing AI agents' real-world tool-use capabilities across diverse MCP servers.☆122Apr 2, 2026Updated last month
- ☆33Jan 7, 2025Updated last year
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆510Apr 14, 2026Updated last month
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation☆12May 24, 2021Updated 4 years ago
- ROS2 Bag file parsing☆10Mar 14, 2020Updated 6 years ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training☆48Jul 18, 2025Updated 10 months ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 9 months ago
- ☆64Mar 30, 2026Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆126Jan 10, 2026Updated 4 months ago
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated 3 months ago
- ☆33Oct 15, 2025Updated 7 months ago
- ☆30Apr 28, 2026Updated 3 weeks ago
- [ICML 2026] Reasoning in Parallelism via Self-Distilled RL☆110Feb 5, 2026Updated 3 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆225Nov 27, 2025Updated 5 months ago
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆22Jan 11, 2026Updated 4 months ago
- ☆10Nov 6, 2024Updated last year
- Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering☆13Aug 22, 2023Updated 2 years ago
- AHN: Artificial Hippocampus Networks for Efficient Long-Context Modeling☆177Oct 17, 2025Updated 7 months ago
- Prioritize Alignment in Dataset Distillation☆21Dec 3, 2024Updated last year
- Official Implementation of wd1☆29Sep 25, 2025Updated 7 months ago
- Tooling for exact and MinHash deduplication of large-scale text datasets☆83Mar 24, 2026Updated last month
- A meta-repo that watches karpathy/autoresearch and adjacent systems, distills portable patterns for bounded agent-verifier research lo…☆43May 8, 2026Updated last week
- Towards a Unified View of Large Language Model Post-Training☆211Sep 8, 2025Updated 8 months ago
- [ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs☆61Apr 12, 2026Updated last month
- This is the official repository for NeurIPS 2023 paper "Curriculum Learning for Graph Neural Networks: Which Edges Should We Learn First"☆17Oct 27, 2023Updated 2 years ago
- P1: Mastering Physics Olympiads with Reinforcement Learning☆84Dec 29, 2025Updated 4 months ago
- ☆12Jul 30, 2025Updated 9 months ago
- Code for the paper "Spectrum Guided Topology Augmentation for Graph Contrastive Learning"☆11Jul 18, 2023Updated 2 years ago
- Fork of Flame repo for training of some new stuff in development☆19Apr 24, 2026Updated 3 weeks ago
- ☆22Dec 18, 2025Updated 5 months ago
- My toy model for natural language inference task.☆11Aug 6, 2018Updated 7 years ago
- ☆14Dec 13, 2022Updated 3 years ago
- The official github repo for "Diffusion Language Models are Super Data Learners".☆228Nov 6, 2025Updated 6 months ago