Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
☆172 · Mar 15, 2026 · Updated last month
Alternatives and similar repositories for maxrl
Users that are interested in maxrl are comparing it to the libraries listed below.
- [ICLR 2026] PixNerd: Pixel Neural Field Diffusion ☆176 · Dec 10, 2025 · Updated 4 months ago
- ☆44 · Sep 15, 2025 · Updated 7 months ago
- Official repo for "StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation" ☆23 · Apr 22, 2026 · Updated last week
- SimKO: Simple Pass@K Policy Optimization ☆31 · Oct 24, 2025 · Updated 6 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆61 · Mar 17, 2025 · Updated last year
- ☆35 · Feb 10, 2025 · Updated last year
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training ☆13 · Feb 15, 2025 · Updated last year
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆33 · Jul 25, 2025 · Updated 9 months ago
- ☆15 · Apr 16, 2025 · Updated last year
- ☆70 · Dec 7, 2025 · Updated 4 months ago
- ☆14 · Apr 25, 2025 · Updated last year
- This is the notebooks for videos in my Bilibili Channel (https://space.bilibili.com/32773300?spm_id_from=333.1007.0.0) ☆33 · Nov 6, 2025 · Updated 5 months ago
- The repo contains the code and dataset for the World Models Track of GigaBrain Challenge 2026 CVPR Workshop. ☆58 · Apr 8, 2026 · Updated 3 weeks ago
- ☆16 · Feb 4, 2025 · Updated last year
- a collaborative agent-based workflow designed for NL2Vis task ☆19 · Mar 6, 2025 · Updated last year
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024) ☆41 · Sep 22, 2024 · Updated last year
- Reinforcement Learning, Tutorials in Chinese ☆11 · Jun 9, 2018 · Updated 7 years ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆37 · Apr 4, 2024 · Updated 2 years ago
- ☆517 · Apr 7, 2026 · Updated 3 weeks ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆19 · Apr 1, 2025 · Updated last year
- Marketplace ML experiment - training without backprop ☆27 · Sep 9, 2025 · Updated 7 months ago
- ☆20 · Nov 20, 2024 · Updated last year
- Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization ☆13 · Jul 24, 2022 · Updated 3 years ago
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24) ☆17 · Feb 10, 2024 · Updated 2 years ago
- ☆30 · Oct 8, 2025 · Updated 6 months ago
- ☆12 · Jan 31, 2022 · Updated 4 years ago
- ☆24 · May 23, 2025 · Updated 11 months ago
- Official data release for FaceMap, to present in Siggraph Asia 2024 ☆13 · Nov 1, 2024 · Updated last year
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆217 · Jan 12, 2026 · Updated 3 months ago
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics ☆73 · Mar 26, 2026 · Updated last month
- Accelerating RL for LLM Reasoning with Optimal Advantage Regression ☆41 · May 30, 2025 · Updated 11 months ago
- A videogame made with PyGame turned into an Open AI Gym Learning Environment for Reinforcement Learning agents. ☆15 · Jan 3, 2023 · Updated 3 years ago
- Robust and Approximate Markov Decision Processes ☆11 · Jul 21, 2017 · Updated 8 years ago
- this is for fun, ain't it grand! ☆22 · Sep 18, 2025 · Updated 7 months ago
- This repository is for the "LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation". ☆17 · Nov 18, 2025 · Updated 5 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆41 · Sep 8, 2025 · Updated 7 months ago
- The official codes of Learning to Decouple the Lights for 3D Face Texture Modeling (NeurIPS'24) ☆14 · Mar 17, 2025 · Updated last year
- Official repo for our AAAI'21 paper, https://arxiv.org/abs/2007.12354 ☆30 · Jul 14, 2021 · Updated 4 years ago
- Battery charge management environment, designed as a multi-agent scenario with continuous observation and action space, where the agents … ☆13 · Feb 9, 2021 · Updated 5 years ago