Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluations.
☆54May 7, 2025Updated last year
Alternatives and similar repositories for grpo-llm-evaluator
Users who are interested in grpo-llm-evaluator are comparing it to the libraries listed below.
- NanoGPT (124M) in 5 minutes☆15Feb 14, 2025Updated last year
- ☆15Apr 26, 2025Updated last year
- Based on the paper Chain-of-Retrieval Augmented Generation☆14Mar 29, 2025Updated last year
- ☆17Feb 1, 2024Updated 2 years ago
- ☆13Jan 17, 2024Updated 2 years ago
- Train and fine-tune text-to-speech models for Bengali and many other languages!☆18Apr 2, 2025Updated last year
- A curated list of awesome sentiment analysis studies, in which attitude corresponds to the text position conveyed by Subject towards othe…☆19Mar 23, 2026Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆60Oct 18, 2025Updated 6 months ago
- ☆15Jan 26, 2025Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆150Jan 7, 2026Updated 4 months ago
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models☆126Mar 6, 2026Updated 2 months ago
- Seamless Voice Interactions with LLMs☆12Oct 28, 2023Updated 2 years ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 10 months ago
- Generating Easy-to-Understand Referring Expressions for Target Identifications☆18Aug 30, 2019Updated 6 years ago
- Tensorflow tf.metrics tutorial☆12Aug 30, 2018Updated 7 years ago
- [ACL 2025] Knowledge Unlearning for Large Language Models☆50Sep 18, 2025Updated 7 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆176Jan 16, 2025Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆23Jun 25, 2024Updated last year
- ☆19May 17, 2025Updated 11 months ago
- This repository measures the quality of Yandexgpt, Gigachat, T-Pro, Saiga, Vikhr, Ruadapt on popular English-language benchmarks: MGSM, MATH, HumanE…☆23Apr 16, 2025Updated last year
- Learning adapter weights from task descriptions☆19Nov 12, 2023Updated 2 years ago
- Tiny Agent: Production-Ready LLM Agent SDK for Every Developer☆39Sep 29, 2025Updated 7 months ago
- ☆20Aug 1, 2024Updated last year
- A transformer seq2seq model to generate couplets. 一个写对联的 Transformer 序列到序列模型。☆17Feb 1, 2019Updated 7 years ago
- An introduction to LLM Sampling☆80Dec 15, 2024Updated last year
- Modify Entropy Based Sampling to work with Mac Silicon via MLX☆49Nov 6, 2024Updated last year
- All the content of my YouTube channel: https://youtube.com/@florenzerstling?si=7t10PBr6MDha74PO☆14May 28, 2025Updated 11 months ago
- minimal pytorch implementation of bm25 (with sparse tensors)☆104Oct 28, 2025Updated 6 months ago
- This AI Agent retrieves the latest news articles based on a multi keyword using the Serp API. It processes the results and returns struct…☆11Jan 31, 2025Updated last year
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)☆48Jan 6, 2026Updated 4 months ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.☆56May 2, 2025Updated last year
- Exploring Applications of GRPO☆252Aug 25, 2025Updated 8 months ago
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- Explore training for quantized models☆26Jul 12, 2025Updated 9 months ago
- ☆28Dec 16, 2025Updated 4 months ago
- eSnyne FPGA is a powerful development board made from a used Antminer S9 control board☆22Oct 12, 2023Updated 2 years ago
- Entropy Based Sampling and Parallel CoT Decoding☆17Oct 9, 2024Updated last year
- ☆21Jul 25, 2025Updated 9 months ago