vectara / hallucination-leaderboard
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
☆1,606 Updated last week
Alternatives and similar repositories for hallucination-leaderboard:
Users who are interested in hallucination-leaderboard are comparing it to the repositories listed below:
- A unified evaluation framework for large language models ☆2,532 Updated last week
- ☆2,343 Updated 2 weeks ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆1,712 Updated 6 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,448 Updated this week
- Open-source tool to visualise your RAG 🔮 ☆1,106 Updated last month
- Tools for merging pretrained large language models. ☆5,273 Updated last week
- [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆4,879 Updated 3 weeks ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,266 Updated last week
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ☆740 Updated last week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆2,959 Updated this week
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆969 Updated 2 weeks ago
- Training LLMs with QLoRA + FSDP ☆1,451 Updated 3 months ago
- Automatically evaluate your LLMs in Google Colab ☆592 Updated 9 months ago
- Automated Evaluation of RAG Systems ☆546 Updated 3 months ago
- Enforce the output format (JSON Schema, Regex etc) of a language model ☆1,708 Updated this week
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆705 Updated last year
- ☆2,852 Updated 5 months ago
- Optimizing inference proxy for LLMs ☆2,047 Updated this week
- A library for advanced large language model reasoning ☆1,955 Updated this week
- TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients. ☆2,088 Updated 3 weeks ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,362 Updated last week
- ☆810 Updated 5 months ago
- MTEB: Massive Text Embedding Benchmark ☆2,203 Updated this week
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,361 Updated 10 months ago
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ☆1,976 Updated 8 months ago
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆1,032 Updated 11 months ago
- This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models? ☆927 Updated 3 weeks ago
- ☆1,484 Updated this week
- A language for constraint-guided and efficient LLM programming. ☆3,824 Updated 8 months ago
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆866 Updated 2 weeks ago