D-Star-AI / KITE
KITE (Knowledge-Intensive Task Evaluation) is an end-to-end benchmark for RAG pipelines
☆15 Updated 9 months ago
Alternatives and similar repositories for KITE
Users who are interested in KITE are comparing it to the repositories listed below
- Reasoning by Communicating with Agents ☆28 Updated 2 weeks ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆28 Updated last year
- ☆15 Updated 4 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 Updated last year
- LLMs as Collaboratively Edited Knowledge Bases ☆45 Updated last year
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆34 Updated last year
- Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval an… ☆29 Updated 8 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- ☆45 Updated 7 months ago
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models ☆24 Updated last year
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths ☆35 Updated last year
- Query Expansion for Better Query Embedding using LLMs ☆48 Updated 2 months ago
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 Updated 6 months ago
- ☆15 Updated last year
- ☆25 Updated 3 months ago
- Tools for content datamining and NLP at scale ☆43 Updated 10 months ago
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification ☆40 Updated 2 years ago
- Testing paligemma2 finetuning on reasoning dataset ☆18 Updated 4 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 7 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 Updated 6 months ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆30 Updated 3 weeks ago
- ☆29 Updated 6 months ago
- ☆16 Updated 2 months ago
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs) ☆26 Updated last year
- ☆41 Updated 5 months ago
- 👷♂️Minion is Agent's Brain. Minion is designed to execute any type of queries, offering a variety of features that demonstrate its flex… ☆14 Updated 2 weeks ago
- ☆37 Updated 2 years ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 5 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆54 Updated 5 months ago