Pleias / RL-Reasoning
Collection of resources for RL and Reasoning
☆26Updated 9 months ago
Alternatives and similar repositories for RL-Reasoning
Users who are interested in RL-Reasoning compare it to the libraries listed below
- ☆138Updated 3 months ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs …☆61Updated 9 months ago
- ☆51Updated 9 months ago
- Python library to use Pleias-RAG models☆66Updated 6 months ago
- Source code for the collaborative reasoner research project at Meta FAIR.☆105Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆51Updated last year
- ☆146Updated last year
- ☆98Updated 7 months ago
- An introduction to LLM Sampling☆79Updated 11 months ago
- Train LLM on Hugging Face infra☆67Updated last week
- Simple UI for debugging correlations of text embeddings☆300Updated 5 months ago
- code for training & evaluating Contextual Document Embedding models☆200Updated 6 months ago
- ☆80Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆113Updated 7 months ago
- Set of scripts to finetune LLMs☆38Updated last year
- Train your own SOTA deductive reasoning model☆107Updated 8 months ago
- Simple examples using Argilla tools to build AI☆56Updated last year
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖☆76Updated 11 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆69Updated last year
- Banishing LLM Hallucinations Requires Rethinking Generalization☆275Updated last year
- awesome synthetic (text) datasets☆305Updated last week
- ☆55Updated last year
- Super basic implementation (gist-like) of RLMs with REPL environments.☆255Updated last month
- The first dense retrieval model that can be prompted like an LM☆89Updated 6 months ago
- Let's build better datasets, together!☆264Updated 11 months ago
- Pre-train Static Word Embeddings☆90Updated 2 months ago
- ☆120Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆66Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆242Updated last year
- Luth is a state-of-the-art series of fine-tuned LLMs for French☆39Updated last month