Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluations.
☆54May 7, 2025Updated last year
Alternatives and similar repositories for grpo-llm-evaluator
Users that are interested in grpo-llm-evaluator are comparing it to the libraries listed below.
- NeuroBLAST v3 architecture code☆37Jan 6, 2026Updated 4 months ago
- ☆15Apr 26, 2025Updated last year
- this is based on the paper Chain-of-Retrieval Augmented Generation☆15Mar 29, 2025Updated last year
- ☆19Mar 10, 2025Updated last year
- ☆17Feb 1, 2024Updated 2 years ago
- ☆13Jan 17, 2024Updated 2 years ago
- Multilingual RAG benchmark.☆10Nov 22, 2024Updated last year
- Train and fine-tune text-to-speech models for Bengali and many other languages!☆18Apr 2, 2025Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆60Oct 18, 2025Updated 7 months ago
- A benchmark for evaluating the ability of language models to solve math and physics problems in Russian☆22Nov 14, 2025Updated 6 months ago
- ☆15Jan 26, 2025Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆151Jan 7, 2026Updated 4 months ago
- Mirror for Java and PHP libraries and text resources to facilitate the use of Inuktitut in its written form on computers and the web☆10Aug 2, 2015Updated 10 years ago
- Local LLM Agent with Guidance☆13May 26, 2023Updated 3 years ago
- Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban☆20Jun 29, 2025Updated 11 months ago
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models☆126Mar 6, 2026Updated 2 months ago
- Implementation based on lightrag and nano-graphrag to connect with psql☆15Oct 28, 2024Updated last year
- Seamless Voice Interactions with LLMs☆12Oct 28, 2023Updated 2 years ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 10 months ago
- ☆24Jan 22, 2025Updated last year
- Generating Easy-to-Understand Referring Expressions for Target Identifications☆18Aug 30, 2019Updated 6 years ago
- [ACL 2025] Knowledge Unlearning for Large Language Models☆49Sep 18, 2025Updated 8 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆175Jan 16, 2025Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆23Jun 25, 2024Updated last year
- This repository measures the quality of Yandexgpt, Gigachat, T-Pro, Saiga, Vikhr, Ruadapt on popular English-language benchmarks: MGSM, MATH, HumanE…☆24Apr 16, 2025Updated last year
- Tiny Agent: Production-Ready LLM Agent SDK for Every Developer☆41Sep 29, 2025Updated 8 months ago
- Hercules: Attributable and Scalable Opinion Summarization (ACL 2023)☆20Nov 8, 2023Updated 2 years ago
- ☆20Aug 1, 2024Updated last year
- ☆17Oct 11, 2023Updated 2 years ago
- An introduction to LLM Sampling☆80Dec 15, 2024Updated last year
- Modify Entropy Based Sampling to work with Mac Silicon via MLX☆49Nov 6, 2024Updated last year
- A minimal WebRTC SFU Implementation☆19Jun 15, 2025Updated 11 months ago
- All the content of my YouTube channel: https://youtube.com/@florenzerstling?si=7t10PBr6MDha74PO☆14May 28, 2025Updated last year
- This AI Agent retrieves the latest news articles based on a multi keyword using the Serp API. It processes the results and returns struct…☆11Jan 31, 2025Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors)☆105Oct 28, 2025Updated 7 months ago
- Exploring Applications of GRPO☆252Aug 25, 2025Updated 9 months ago
- Explore training for quantized models☆26Jul 12, 2025Updated 10 months ago
- [AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs☆56Dec 7, 2025Updated 5 months ago
- Easy local FLUX.1 Inference☆10Aug 29, 2024Updated last year