psunlpgroup / GreaterPrompt
GreaterPrompt: A Python Toolkit for Prompt Optimization
☆45Updated 4 months ago
Alternatives and similar repositories for GreaterPrompt
Users who are interested in GreaterPrompt are comparing it to the repositories listed below.
- ☆48Updated last year
- Test-time compute in information retrieval☆42Updated last month
- ☆41Updated 7 months ago
- A dataset for training and evaluating LLMs on decision making about "when (not) to call" functions☆33Updated 4 months ago
- ☆52Updated last year
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".☆197Updated 2 months ago
- ☆20Updated 4 months ago
- This repository contains ScholarQABench data and evaluation pipeline.☆83Updated 3 weeks ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆86Updated last year
- ☆74Updated last year
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆96Updated 4 months ago
- Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120)☆157Updated 2 months ago
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.☆58Updated 11 months ago
- ☆89Updated 3 months ago
- Designing Multi-Agent Systems with Zero Supervision☆97Updated last month
- ☆35Updated last year
- Code and dataset for the EMNLP paper "Instruct and Extract: Instruction Tuning for On-Demand Information Extraction"☆53Updated last year
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents".☆107Updated 10 months ago
- ☆38Updated 2 months ago
- ☆17Updated last year
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks☆57Updated last year
- Contrastive Chain-of-Thought Prompting☆68Updated last year
- ☆19Updated last year
- ☆35Updated 10 months ago
- Complex Function Calling Benchmark.☆129Updated 7 months ago
- The official implementation of "LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented…☆41Updated 4 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆87Updated 3 weeks ago
- ☆62Updated last year
- ☆51Updated 7 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers☆133Updated last year