patronus-ai / Lynx-hallucination-detection
★36, updated 9 months ago
Alternatives and similar repositories for Lynx-hallucination-detection:
Users who are interested in Lynx-hallucination-detection are comparing it to the libraries listed below.
- Codebase accompanying the Summary of a Haystack paper. ★77, updated 7 months ago
- ★120, updated 7 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ★76, updated 6 months ago
- Small and Efficient Mathematical Reasoning LLMs ★71, updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ★104, updated 4 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ★108, updated 2 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ★23, updated 3 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ★55, updated 8 months ago
- Evaluating LLMs with fewer examples ★151, updated last year
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ★62, updated 11 months ago
- Data preparation code for Amber 7B LLM ★89, updated 11 months ago
- ★48, updated 5 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ★27, updated 4 months ago
- Evaluating LLMs with CommonGen-Lite ★90, updated last year
- ★54, updated this week
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ★196, updated last week
- ★114, updated 2 months ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ★72, updated 8 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ★49, updated 9 months ago
- Pre-training code for CrystalCoder 7B LLM ★54, updated 11 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ★120, updated 8 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ★83, updated 5 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ★108, updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ★29, updated 3 months ago
- experiments with inference on llama ★104, updated 10 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ★107, updated 7 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ★80, updated last year
- ★28, updated 5 months ago
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting. ★40, updated 9 months ago
- Complex Function Calling Benchmark. ★99, updated 3 months ago