cgftinc / benchmax
Framework-Agnostic RL Environments for LLM Fine-Tuning
☆37 Updated this week
Alternatives and similar repositories for benchmax
Users who are interested in benchmax are comparing it to the libraries listed below.
 - Pivotal Token Search ☆130 Updated 3 months ago
 - ☆62 Updated 3 months ago
 - An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 Updated last week
 - Lego for GRPO ☆30 Updated 5 months ago
 - Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆95 Updated 5 months ago
 - Nexusflow function call, tool use, and agent benchmarks. ☆29 Updated 10 months ago
 - Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆58 Updated 2 weeks ago
 - ☆55 Updated 11 months ago
 - The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 11 months ago
 - entropix style sampling + GUI ☆27 Updated last year
 - Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆99 Updated last week
 - A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 Updated 6 months ago
 - ☆49 Updated 8 months ago
 - GPT-4 Level Conversational QA Trained In a Few Hours ☆65 Updated last year
 - II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset ☆28 Updated 6 months ago
 - Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆59 Updated 5 months ago
 - ☆54 Updated last year
 - A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆33 Updated 6 months ago
 - Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated last year
 - Train your own SOTA deductive reasoning model ☆109 Updated 7 months ago
 - Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git… ☆14 Updated 6 months ago
 - Storing long contexts in tiny caches with self-study ☆205 Updated 2 weeks ago
 - Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆53 Updated 2 months ago
 - Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆76 Updated 10 months ago
 - Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
 - ☆40 Updated 10 months ago
 - Small, simple agent task environments for training and evaluation ☆18 Updated last year
 - Modified Beam Search with periodical restart ☆12 Updated last year
 - Efficient non-uniform quantization with GPTQ for GGUF ☆52 Updated last month
 - [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆45 Updated 3 months ago