Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
☆192 · May 28, 2026 · Updated last month
Alternatives and similar repositories for maxrl
Users that are interested in maxrl are comparing it to the libraries listed below.
- Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models" ☆23 · Aug 14, 2025 · Updated 10 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆15 · Jun 28, 2025 · Updated last year
- ☆48 · Sep 15, 2025 · Updated 9 months ago
- SimKO: Simple Pass@K Policy Optimization ☆31 · Oct 24, 2025 · Updated 8 months ago
- ☆17 · Jun 10, 2025 · Updated last year
- Official repo for "StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation" ☆27 · Updated this week
- DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting ☆15 · Apr 27, 2023 · Updated 3 years ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆62 · Mar 17, 2025 · Updated last year
- ☆35 · Feb 10, 2025 · Updated last year
- webgpu autograd library ☆35 · May 24, 2025 · Updated last year
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 · Feb 15, 2025 · Updated last year
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning ☆25 · Jun 25, 2025 · Updated last year
- [ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home ☆19 · May 17, 2025 · Updated last year
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆33 · Jul 25, 2025 · Updated 11 months ago
- Agentic Research and Evaluation Suite ☆104 · Jun 11, 2026 · Updated 3 weeks ago
- ☆12 · Mar 31, 2024 · Updated 2 years ago
- ☆15 · Apr 25, 2025 · Updated last year
- ☆73 · Jun 18, 2026 · Updated 2 weeks ago
- Implementation of GraphReader paper: https://arxiv.org/abs/2406.14550 ☆14 · Oct 21, 2024 · Updated last year
- ☆49 · May 9, 2026 · Updated last month
- The repo contains the code and dataset for the World Models Track of GigaBrain Challenge 2026 CVPR Workshop. ☆60 · Apr 8, 2026 · Updated 2 months ago
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024) ☆42 · Sep 22, 2024 · Updated last year
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆38 · Apr 4, 2024 · Updated 2 years ago
- Excalibur is a highly opinionated agent harness for the aspiring summoner. ☆174 · Apr 3, 2026 · Updated 3 months ago
- ☆20 · Nov 20, 2024 · Updated last year
- implementation of dualformer ☆25 · Mar 1, 2025 · Updated last year
- Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization ☆13 · Jul 24, 2022 · Updated 3 years ago
- a collaborative agent-based workflow designed for NL2Vis task ☆20 · Mar 6, 2025 · Updated last year
- Pytorch code for NeurIPS 2025 paper "Accurate and Efficient Low-Rank Model Merging in Core Space" ☆41 · Feb 2, 2026 · Updated 5 months ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP … ☆17 · Oct 17, 2023 · Updated 2 years ago
- ☆24 · May 23, 2025 · Updated last year
- Authors' PyTorch implementation of 'Recomposing the Reinforcement Learning Building-Blocks with Hypernetworks' (HypeRL) ☆26 · Jun 9, 2021 · Updated 5 years ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆219 · Jan 12, 2026 · Updated 5 months ago
- Accelerating RL for LLM Reasoning with Optimal Advantage Regression ☆41 · May 30, 2025 · Updated last year
- [ACL 2025] iAgent: LLM Agent as a Shield between User and Recommender Systems ☆32 · May 23, 2025 · Updated last year
- ☆15 · Jun 11, 2025 · Updated last year
- Robust and Approximate Markov Decision Processes ☆11 · Jul 21, 2017 · Updated 8 years ago
- this is for fun, ain't it grand! ☆21 · Sep 18, 2025 · Updated 9 months ago
- Official repo for our AAAI'21 paper, https://arxiv.org/abs/2007.12354 ☆30 · Jul 14, 2021 · Updated 4 years ago