Pavankunchala / Reinforcement-learning-with-verifable-rewards-Learnings
RLVR Testing and Training
☆23Updated 3 months ago
Alternatives and similar repositories for Reinforcement-learning-with-verifable-rewards-Learnings
Users who are interested in Reinforcement-learning-with-verifable-rewards-Learnings are comparing it to the repositories listed below.
- A truly open version of gpt-oss which shows the entire pre-training from scratch☆79Updated 3 months ago
- ☆31Updated 9 months ago
- ☆109Updated 6 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion☆167Updated 3 months ago
- ☆62Updated 5 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆103Updated 11 months ago
- Code for paper https://arxiv.org/abs/2501.00522☆13Updated 7 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆37Updated 7 months ago
- ☆122Updated 6 months ago
- Clue inspired puzzles for testing LLM deduction abilities☆45Updated 8 months ago
- OpenPipe Reinforcement Learning Experiments☆32Updated 9 months ago
- ☆159Updated 8 months ago
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization☆36Updated last month
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an…☆23Updated 7 months ago
- From-scratch implementation of OpenAI's GPT-OSS model in Python. No Torch, No GPUs.☆107Updated last month
- unsloth-5090-multiple☆60Updated 6 months ago
- Streaming Retrieval-Augmented Generation (RAG) agent in Go. It consumes real-time data from Kafka topics, processes it in configurable wi…☆25Updated 6 months ago
- A high quality and fast TTS repository☆111Updated this week
- ☆92Updated last month
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.☆22Updated last year
- Exploring retrieval systems for language models☆14Updated 8 months ago
- Train transformer language models with reinforcement learning.☆19Updated 9 months ago
- ☆29Updated 7 months ago
- AI Agent that researches the lives of historical figures and extracts events into structured JSON timelines using LangGraph multi-agent o…☆215Updated 2 months ago
- A real-time shared memory layer for multi-agent LLM systems.☆50Updated 5 months ago
- entropix style sampling + GUI☆27Updated last year
- Enhancing LLMs with LoRA☆193Updated 2 months ago
- ☆57Updated 10 months ago
- Use smol agents to do research and then update csv columns with its findings.☆41Updated 10 months ago
- Lego for GRPO☆30Updated 6 months ago