OpenPipe / Summary-RL
Train an agent to generate high-quality summaries
☆29 Updated last month
Alternatives and similar repositories for Summary-RL
Users who are interested in Summary-RL are comparing it to the libraries listed below.
- LLM reads a paper and produces a working prototype ☆57 Updated 4 months ago
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset ☆26 Updated 4 months ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆95 Updated 3 weeks ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 Updated 3 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines. ☆31 Updated last year
- Simple examples using Argilla tools to build AI ☆53 Updated 8 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆150 Updated 6 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆36 Updated 4 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 6 months ago
- The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆89 Updated 3 weeks ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆72 Updated 4 months ago
- ☆50 Updated last week
- Train your own SOTA deductive reasoning model ☆104 Updated 5 months ago
- ☆66 Updated 2 months ago
- Very minimal (and stateless) agent framework ☆45 Updated 7 months ago
- Challenges for general-purpose web-browsing AI agents ☆63 Updated 2 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 Updated 6 months ago
- This is an open-source version of OpenAI's O1 Model Series by Siraj Raval & O1-Preview ☆96 Updated 9 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆146 Updated 5 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆82 Updated this week
- ☆125 Updated 3 weeks ago
- Score LLM pretraining data with classifiers ☆55 Updated last year
- ☆167 Updated 5 months ago
- ☆19 Updated 5 months ago
- ☆41 Updated 6 months ago
- ☆102 Updated 2 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆108 Updated 3 months ago
- rl from zero pretrain, can it be done? yes. ☆193 Updated this week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 Updated 4 months ago
- ☆73 Updated 5 months ago