Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluations.
☆52 · May 7, 2025 · Updated 10 months ago
Alternatives and similar repositories for grpo-llm-evaluator
Users that are interested in grpo-llm-evaluator are comparing it to the libraries listed below.
- NanoGPT (124M) in 5 minutes · ☆15 · Feb 14, 2025 · Updated last year
- ☆15 · Apr 26, 2025 · Updated 11 months ago
- ☆19 · Mar 10, 2025 · Updated last year
- ☆17 · Feb 1, 2024 · Updated 2 years ago
- ☆13 · Nov 4, 2025 · Updated 4 months ago
- Train and fine-tune text-to-speech models for Bengali and many other languages! · ☆18 · Apr 2, 2025 · Updated 11 months ago
- A simple one-file Python script that executes AI processes defined in YML. · ☆14 · Mar 26, 2023 · Updated 3 years ago
- Easily deploy your LLM (large language model) server on a GPU machine with no public address. · ☆15 · Apr 30, 2024 · Updated last year
- Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban · ☆18 · Jun 29, 2025 · Updated 9 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna · ☆59 · Oct 18, 2025 · Updated 5 months ago
- ☆16 · Jan 26, 2025 · Updated last year
- Local LLM Agent with Guidance · ☆13 · May 26, 2023 · Updated 2 years ago
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models · ☆125 · Mar 6, 2026 · Updated 3 weeks ago
- Seamless Voice Interactions with LLMs · ☆12 · Oct 28, 2023 · Updated 2 years ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD · ☆33 · Jul 1, 2025 · Updated 8 months ago
- ☆24 · Jan 22, 2025 · Updated last year
- [ACL 2025] Knowledge Unlearning for Large Language Models · ☆49 · Sep 18, 2025 · Updated 6 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? · ☆23 · Jun 25, 2024 · Updated last year
- ChatGPT 'Autonomous Agent' in Node.js, written/runs in Termux. Sandboxed REPL access, Termux:API interface, chain-of-thought Question-Obs… · ☆16 · May 12, 2023 · Updated 2 years ago
- ☆20 · Aug 1, 2024 · Updated last year
- An introduction to LLM Sampling · ☆80 · Dec 15, 2024 · Updated last year
- ☆16 · Apr 29, 2025 · Updated 11 months ago
- Modify Entropy Based Sampling to work with Mac Silicon via MLX · ☆49 · Nov 6, 2024 · Updated last year
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)