aounon / llm-rank-optimizer
☆116 · Updated 4 months ago
Alternatives and similar repositories for llm-rank-optimizer
Users interested in llm-rank-optimizer are comparing it to the libraries listed below.
- ☆49 · Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 · Updated 2 years ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆110 · Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆126 · Updated last month
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆308 · Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker ☆125 · Updated last month
- A small library of LLM judges ☆308 · Updated 4 months ago
- LangCode - Improving alignment and reasoning of large language models (LLMs) with natural language embedded program (NLEP). ☆48 · Updated 2 years ago
- ☆79 · Updated last year
- LLM Attributor: Attribute LLM's Generated Text to Training Data ☆69 · Updated 3 months ago
- Use the OpenAI Batch tool to make async batch requests to the OpenAI API (see the sketch after this list). ☆101 · Updated last year
- CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts. ☆48 · Updated last month
- Finding semantically meaningful and accurate prompts. ☆48 · Updated 2 years ago
- Doing simple retrieval from LLMs at various context lengths to measure accuracy ☆107 · Updated 3 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆317 · Updated last month
- Code accompanying "How I learned to start worrying about prompt formatting". ☆113 · Updated 6 months ago
- Official repo of Rephrase-and-Respond: data, code, and evaluation ☆104 · Updated last year
- Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their… ☆157 · Updated 2 months ago
- EcoAssistant: using LLM assistant more affordably and accurately ☆133 · Updated last year
- ☆148 · Updated last year
- ☆43 · Updated last year
- Official Implementation of InstructZero; the first framework to optimize bad prompts of ChatGPT (API LLMs) and finally obtain good prompts… ☆197 · Updated last year
- Functional Benchmarks and the Reasoning Gap ☆90 · Updated last year
- Resources related to EACL 2023 paper "SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domain… ☆52 · Updated 2 years ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆45 · Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆112 · Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆71 · Updated 2 years ago
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting. ☆54 · Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆116 · Updated 4 months ago
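
For the OpenAI Batch entry above, a minimal sketch of submitting an async batch with the official `openai` Python client (the `requests.jsonl` filename, `custom_id` values, and model name are placeholders; the listed repo may wrap this differently):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# requests.jsonl holds one request per line, e.g.:
# {"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions",
#  "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}}

# Upload the request file, then create a batch targeting the chat completions endpoint.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Poll batch status later and download the output file, which maps results back by custom_id.
print(batch.id, batch.status)
```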