Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
☆179 · May 14, 2026 · Updated last week
Alternatives and similar repositories for maxrl
Users that are interested in maxrl are comparing it to the libraries listed below.
- Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models" ☆22 · Aug 14, 2025 · Updated 9 months ago
- [ICLR 2026] PixNerd: Pixel Neural Field Diffusion ☆177 · Dec 10, 2025 · Updated 5 months ago
- NeurIPS 2026 paper: The Geometry of Consolidation — follow-up to HIDE and No-Escape. ☆106 · May 5, 2026 · Updated 2 weeks ago
- SimKO: Simple Pass@K Policy Optimization ☆30 · Oct 24, 2025 · Updated 7 months ago
- ☆17 · Jun 10, 2025 · Updated 11 months ago
- Official repo for "StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation" ☆25 · Apr 22, 2026 · Updated last month
- DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting ☆15 · Apr 27, 2023 · Updated 3 years ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆62 · Mar 17, 2025 · Updated last year
- ☆35 · Feb 10, 2025 · Updated last year
- [NeurIPS25 Spotlight] Official Implementation for CBSA (Contract-and-Broadcast Self-Attention) ☆36 · Apr 3, 2026 · Updated last month
- Brain Interpreter and Visualizer Online. ☆10 · Sep 1, 2016 · Updated 9 years ago
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 · Feb 15, 2025 · Updated last year
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning ☆24 · Jun 25, 2025 · Updated 11 months ago
- [ICML 25] "Preference Optimization for Combinatorial Optimization Problems" ☆26 · Jun 6, 2025 · Updated 11 months ago
- A framework bridging cognitive science and LLM reasoning research to diagnose and improve how large language models reason, based on anal… ☆39 · Nov 26, 2025 · Updated 5 months ago
- ☆14 · Apr 25, 2025 · Updated last year
- ☆16 · Feb 4, 2025 · Updated last year
- The repo contains the code and dataset for the World Models Track of GigaBrain Challenge 2026 CVPR Workshop. ☆59 · Apr 8, 2026 · Updated last month
- Excalibur is a highly opinionated agent harness for the aspiring summoner. ☆162 · Apr 3, 2026 · Updated last month
- Reinforcement Learning, Tutorials in Chinese ☆11 · Jun 9, 2018 · Updated 7 years ago
- ArXiV Notification Bot which sends you an email with the latest updates! ☆17 · Oct 20, 2023 · Updated 2 years ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆37 · Apr 4, 2024 · Updated 2 years ago
- Marketplace ML experiment - training without backprop ☆27 · Sep 9, 2025 · Updated 8 months ago
- ☆20 · Nov 20, 2024 · Updated last year
- ☆574 · Apr 7, 2026 · Updated last month
- Pessimistic Value Iteration for Multi-Task Data Sharing in Offline RL ☆18 · Nov 21, 2023 · Updated 2 years ago
- Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization ☆13 · Jul 24, 2022 · Updated 3 years ago
- ☆19 · Jul 18, 2021 · Updated 4 years ago
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- ☆30 · Oct 8, 2025 · Updated 7 months ago
- ☆12 · Feb 25, 2025 · Updated last year
- A collaborative agent-based workflow designed for the NL2Vis task ☆20 · Mar 6, 2025 · Updated last year
- ☆12 · Jan 31, 2022 · Updated 4 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP … ☆17 · Oct 17, 2023 · Updated 2 years ago
- ☆24 · May 23, 2025 · Updated last year
- ☆22 · Sep 29, 2024 · Updated last year
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆218 · Jan 12, 2026 · Updated 4 months ago
- The official implementation of the paper "BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics". ☆52 · Jun 26, 2025 · Updated 10 months ago
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics ☆76 · Mar 26, 2026 · Updated last month