Pleias / RL-Reasoning
Collection of resources for RL and Reasoning
☆27 Updated last year
Alternatives and similar repositories for RL-Reasoning
Users who are interested in RL-Reasoning are comparing it to the repositories listed below.
- ☆106 Updated 10 months ago
- ☆147 Updated last year
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- Train your own SOTA deductive reasoning model ☆107 Updated 11 months ago
- ☆142 Updated 5 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆51 Updated last year
- Python library to use Pleias-RAG models ☆68 Updated 9 months ago
- ☆210 Updated 7 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆112 Updated 9 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆76 Updated last year
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆277 Updated last year
- ☆53 Updated last year
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆61 Updated last year
- ☆67 Updated 8 months ago
- An introduction to LLM Sampling ☆79 Updated last year
- Simple UI for debugging correlations of text embeddings ☆305 Updated 8 months ago
- code for training & evaluating Contextual Document Embedding models ☆202 Updated 8 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated last year
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆68 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- ☆80 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 Updated 10 months ago
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs. ☆54 Updated 2 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆84 Updated last year
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆26 Updated last month
- ☆56 Updated last year
- ☆270 Updated 7 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆150 Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆250 Updated last year
- awesome synthetic (text) datasets ☆321 Updated last month