Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
☆18Jul 19, 2025Updated 7 months ago
Alternatives and similar repositories for Reflect-RL
Users who are interested in Reflect-RL are comparing it to the libraries listed below.
- ☆15Feb 25, 2026Updated last week
- Testing paligemma2 finetuning on reasoning dataset☆18Dec 28, 2024Updated last year
- PowerBiMIP is an open-source, efficient bilevel mixed-integer programming (BiMIP) solver, with a special focus on applications in power a…☆34Updated this week
- ☆19Sep 22, 2025Updated 5 months ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Jun 21, 2022Updated 3 years ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆32May 19, 2025Updated 9 months ago
- Learn online intrinsic rewards from LLM feedback☆45Dec 17, 2024Updated last year
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment☆28Feb 24, 2026Updated last week
- QuESt Planning is a long-term power system capacity expansion planning model that identifies cost-optimal energy storage, generation, and…☆14Feb 4, 2026Updated last month
- ☆10Jul 6, 2023Updated 2 years ago
- ☆19Nov 20, 2025Updated 3 months ago
- Paraphrase sentence☆11Aug 22, 2025Updated 6 months ago
- Example Systems using PowerDynamics.jl☆12Oct 10, 2022Updated 3 years ago
- An implementation using RL to control magnetic soft robots.☆10Jul 29, 2024Updated last year
- ☆11Jan 27, 2026Updated last month
- ☆14Nov 19, 2024Updated last year
- Gretchen - An Open-Source Humanoid Robot Development Platform☆11Jul 8, 2019Updated 6 years ago
- Source code for the paper titled: "Unlocking the full potential of smart charging: Addressing paused and delayed charging problems in ele…☆11May 22, 2024Updated last year
- ☆11Aug 20, 2025Updated 6 months ago
- ☆12Mar 15, 2023Updated 2 years ago
- Visualize linear programming at https://lpviz.net☆33Jan 20, 2026Updated last month
- Prototype implementation of an architecture suggested in Robot Dream paper (http://arxiv.org/abs/1603.03007)☆12Jul 3, 2019Updated 6 years ago
- JAX notebook showing how to LoRA + GPTQ arbitrary models☆10Aug 8, 2023Updated 2 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated 11 months ago
- ☆10Aug 13, 2022Updated 3 years ago
- Official code for AAAI'20 paper "Merging Weak and Active Supervision for Semantic Parsing"☆11Dec 8, 2022Updated 3 years ago
- MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching☆22Nov 13, 2025Updated 3 months ago
- Hybrid European MV-LV Models for Smart Distribution Network Modelling☆10Dec 11, 2021Updated 4 years ago
- ☆14Jan 8, 2026Updated last month
- Python and Scala APIs for enhanced Spark analytics☆12Mar 15, 2017Updated 8 years ago
- ☆13May 21, 2023Updated 2 years ago
- NeurIPS 2024☆13Oct 29, 2025Updated 4 months ago
- ☆12Mar 1, 2024Updated 2 years ago
- Replaces occurrences of the word 'literally' with 'figuratively'. That's literally all it does.☆45Nov 7, 2014Updated 11 years ago
- An alternative front end for Amazon Mechanical Turk☆12May 13, 2024Updated last year
- Modelling and optimization for microgrids, energy hubs, distribution systems and transmission systems☆11Dec 14, 2022Updated 3 years ago
- ☆13Nov 5, 2025Updated 3 months ago
- ☆10Mar 25, 2024Updated last year
- ☆10Nov 10, 2021Updated 4 years ago