nec-research / agentquest
☆25 Updated 4 months ago
Alternatives and similar repositories for agentquest:
Users who are interested in agentquest are comparing it to the repositories listed below:
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆24 Updated 2 months ago
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 Updated last year
- ☆48 Updated 3 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 7 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆14 Updated 11 months ago
- ☆41 Updated 2 months ago
- ☆23 Updated 5 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆54 Updated 5 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation ☆28 Updated 2 weeks ago
- ☆50 Updated 3 months ago
- ☆24 Updated last year
- Small, simple agent task environments for training and evaluation ☆18 Updated 3 months ago
- ☆45 Updated 4 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆31 Updated last year
- ☆20 Updated last week
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆101 Updated 2 months ago
- ☆11 Updated 7 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆18 Updated 2 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 9 months ago
- Track the progress of LLM context utilisation ☆53 Updated 7 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated 11 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69 Updated 2 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆48 Updated 2 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆21 Updated 2 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆57 Updated 11 months ago