uiuc-kang-lab / ELT-Bench
☆17Updated last week
Alternatives and similar repositories for ELT-Bench
Users who are interested in ELT-Bench are comparing it to the libraries listed below.
- Verifiers for LLM Reinforcement Learning☆56Updated last month
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆54Updated 3 months ago
- ☆40Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆60Updated last week
- ☆34Updated last week
- ☆65Updated 2 months ago
- A method for steering LLMs to better follow instructions☆45Updated last week
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models☆21Updated 2 months ago
- ☆13Updated 5 months ago
- Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation☆31Updated 3 months ago
- ☆21Updated 3 months ago
- Codebase accompanying the Summary of a Haystack paper.☆78Updated 8 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆15Updated 3 weeks ago
- Aioli: A unified optimization framework for language model data mixing☆27Updated 4 months ago
- Large language models for document ranking.☆54Updated 3 weeks ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆67Updated 2 months ago
- ☆24Updated 8 months ago
- ☆41Updated 5 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- ☆41Updated last week
- ☆20Updated 2 weeks ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- ☆19Updated this week
- ☆45Updated 2 weeks ago
- Simple repository for training small reasoning models☆31Updated 4 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆91Updated 3 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 4 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Updated 4 months ago
- ☆49Updated 7 months ago
- This repository contains popular code generation frameworks such as MapCoder and CodeSIM.☆51Updated last month