patronus-ai / Lynx-hallucination-detection
☆43 · Updated last year
Alternatives and similar repositories for Lynx-hallucination-detection
Users that are interested in Lynx-hallucination-detection are comparing it to the libraries listed below
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated 11 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 · Updated 10 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆110 · Updated 9 months ago
- Evaluating LLMs with fewer examples ☆161 · Updated last year
- ☆80 · Updated last week
- ☆127 · Updated 11 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆116 · Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆71 · Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆271 · Updated last year
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆137 · Updated last year
- ☆154 · Updated last year
- Complex Function Calling Benchmark. ☆132 · Updated 7 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆245 · Updated 10 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆215 · Updated last month
- Code accompanying "How I learned to start worrying about prompt formatting". ☆110 · Updated 3 months ago
- ☆76 · Updated 8 months ago
- Compare how Agent systems perform on several benchmarks. ☆101 · Updated last month
- ☆57 · Updated 11 months ago
- Functional Benchmarks and the Reasoning Gap ☆88 · Updated 11 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆151 · Updated last year
- Evaluating LLMs with CommonGen-Lite ☆91 · Updated last year
- ☆118 · Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆115 · Updated this week
- Let's build better datasets, together! ☆263 · Updated 8 months ago
- Open Implementations of LLM Analyses ☆106 · Updated 11 months ago
- Verifiers for LLM Reinforcement Learning ☆72 · Updated 4 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆102 · Updated 4 months ago
- awesome synthetic (text) datasets ☆296 · Updated 2 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated last year
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆183 · Updated 2 weeks ago