Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
☆188May 28, 2026Updated 2 weeks ago
Alternatives and similar repositories for maxrl
Users who are interested in maxrl are comparing it to the repositories listed below.
- Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models"☆23Aug 14, 2025Updated 10 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning☆15Jun 28, 2025Updated 11 months ago
- [ICLR 2026] PixNerd: Pixel Neural Field Diffusion☆178Dec 10, 2025Updated 6 months ago
- NeurIPS 2026 paper: The Geometry of Consolidation — follow-up to HIDE and No-Escape.☆110May 5, 2026Updated last month
- SimKO: Simple Pass@K Policy Optimization☆30Oct 24, 2025Updated 7 months ago
- ☆17Jun 10, 2025Updated last year
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best…☆62Mar 17, 2025Updated last year
- ☆35Feb 10, 2025Updated last year
- Brain Interpreter and Visualizer Online.☆10Sep 1, 2016Updated 9 years ago
- WebGPU autograd library☆35May 24, 2025Updated last year
- ADAG: Transluce's MLP neuron-level circuit tracing library☆28Apr 10, 2026Updated 2 months ago
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training☆13Feb 15, 2025Updated last year
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning☆25Jun 25, 2025Updated 11 months ago
- [ICML 25] "Preference Optimization for Combinatorial Optimization Problems"☆28Jun 6, 2025Updated last year
- ☆14Jul 21, 2022Updated 3 years ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆33Jul 25, 2025Updated 10 months ago
- ☆16Apr 16, 2025Updated last year
- Agentic Research and Evaluation Suite☆97Jun 5, 2026Updated last week
- A framework bridging cognitive science and LLM reasoning research to diagnose and improve how large language models reason, based on anal…☆40Nov 26, 2025Updated 6 months ago
- ☆15Apr 25, 2025Updated last year
- ☆72Dec 7, 2025Updated 6 months ago
- Implementation of GraphReader paper: https://arxiv.org/abs/2406.14550☆14Oct 21, 2024Updated last year
- These are the notebooks for the videos on my Bilibili channel (https://space.bilibili.com/32773300?spm_id_from=333.1007.0.0)☆34Nov 6, 2025Updated 7 months ago
- ☆50May 9, 2026Updated last month
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024)☆42Sep 22, 2024Updated last year
- arXiv notification bot that sends you an email with the latest updates!☆17Oct 20, 2023Updated 2 years ago
- Excalibur is a highly opinionated agent harness for the aspiring summoner.☆170Apr 3, 2026Updated 2 months ago
- ☆19Nov 20, 2024Updated last year
- Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization☆13Jul 24, 2022Updated 3 years ago
- ☆19Jul 18, 2021Updated 4 years ago
- ☆31Oct 8, 2025Updated 8 months ago
- ☆12Feb 25, 2025Updated last year
- A collaborative agent-based workflow designed for the NL2Vis task☆20Mar 6, 2025Updated last year
- Pytorch code for NeurIPS 2025 paper "Accurate and Efficient Low-Rank Model Merging in Core Space"☆40Feb 2, 2026Updated 4 months ago
- ☆614Apr 7, 2026Updated 2 months ago
- Official data release for FaceMap, to present in Siggraph Asia 2024☆13Nov 1, 2024Updated last year
- Authors' PyTorch implementation of 'Recomposing the Reinforcement Learning Building-Blocks with Hypernetworks' (HypeRL)☆26Jun 9, 2021Updated 5 years ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding☆218Jan 12, 2026Updated 5 months ago
- The official implementation of the paper "BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics".☆52Jun 26, 2025Updated 11 months ago