Rainlabuw / rl-enabled-distributed-assignment
Implementation of RL-Enabled Distributed Assignment (REDA)
☆17 Updated 9 months ago
Alternatives and similar repositories for rl-enabled-distributed-assignment:
Users who are interested in rl-enabled-distributed-assignment are comparing it to the libraries listed below.
- ☆21 Updated 2 months ago
- ☆38 Updated 9 months ago
- ☆18 Updated 7 months ago
- Clean RL implementation using MLX ☆30 Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆16 Updated last year
- An intelligent code optimization system leveraging AI analysis, automated refactoring, and test generation. Built with DSPy and Gradio, i… ☆18 Updated 2 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆22 Updated 2 weeks ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 Updated 10 months ago
- Repo to reproduce the First-Explore paper results ☆37 Updated 4 months ago
- Generative cellular automaton-like learning environments for RL. ☆19 Updated 2 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 Updated 5 months ago
- Latent Large Language Models ☆18 Updated 8 months ago
- The original Shared Recurrent Memory Transformer implementation ☆23 Updated 3 months ago
- Collection of LLM completions for reasoning-gym task datasets ☆19 Updated this week
- Simple GRPO scripts and configurations. ☆58 Updated 2 months ago
- Official Repository for Task-Circuit Quantization ☆15 Updated 2 weeks ago
- LLM reads a paper and produces a working prototype ☆52 Updated 2 weeks ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆19 Updated 3 weeks ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 Updated last month
- Small, simple agent task environments for training and evaluation ☆18 Updated 5 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆61 Updated 10 months ago
- ☆33 Updated this week
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆42 Updated last year
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆67 Updated 4 months ago
- ☆12 Updated last month
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆32 Updated 5 months ago
- Learn online intrinsic rewards from LLM feedback ☆35 Updated 4 months ago
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated 10 months ago
- ☆20 Updated last year