D-Star-AI / KITE
KITE (Knowledge-Intensive Task Evaluation) is an end-to-end benchmark for RAG pipelines
☆14Updated 7 months ago
Alternatives and similar repositories for KITE:
Users who are interested in KITE are comparing it to the repositories listed below
- Reasoning by Communicating with Agents☆25Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Updated 5 months ago
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec…☆29Updated 7 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆41Updated last year
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models"☆27Updated last month
- Lightweight Non-Parametric Embedding Fine-Tuning☆24Updated 6 months ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots☆28Updated last year
- ☆41Updated 3 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆47Updated last year
- Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval an…☆27Updated 6 months ago
- ☆45Updated 6 months ago
- ☆18Updated last year
- Measuring RAG solutions throughput and latency☆15Updated 8 months ago
- ☆18Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆75Updated 6 months ago
- ☆11Updated 5 months ago
- ☆62Updated 8 months ago
- ☆15Updated last year
- LLM reads a paper and produces a working prototype☆51Updated 2 weeks ago
- Universal text classifier for generative models☆22Updated 8 months ago
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem.☆16Updated 5 months ago
- ☆43Updated 9 months ago
- ☆55Updated 3 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 8 months ago
- LLMs as Collaboratively Edited Knowledge Bases☆45Updated last year
- ☆20Updated 2 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎☆39Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆82Updated 2 months ago
- A repository for Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata☆33Updated 7 months ago