anyscale / factuality-eval
Library for IPython notebooks for evaluating factuality.
☆51Updated 2 years ago
Alternatives and similar repositories for factuality-eval
Users who are interested in factuality-eval are comparing it to the libraries listed below.
- Notebooks for training universal 0-shot classifiers on many different tasks☆140Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆108Updated 4 months ago
- ☆79Updated last year
- 📚 Datasets and models for instruction-tuning☆238Updated 2 years ago
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access…☆114Updated 2 years ago
- Supervised instruction fine-tuning for LLMs with HF Trainer and DeepSpeed☆36Updated 2 years ago
- Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.☆222Updated 3 years ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection☆103Updated 2 years ago
- Domain Adapted Language Modeling Toolkit - E2E RAG☆333Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆126Updated 3 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆34Updated last year
- Mistral + Haystack: build RAG pipelines that rock 🤘☆106Updated 2 years ago
- ☆84Updated 2 years ago
- ☆89Updated 2 years ago
- ☆218Updated last year
- ☆472Updated 2 years ago
- Sample notebooks and prompts for LLM evaluation☆159Updated 3 months ago
- Reimplementation of the task generation part from the Alpaca paper☆119Updated 2 years ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆119Updated 10 months ago
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries.☆69Updated 2 years ago
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co…☆117Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.☆116Updated 6 months ago
- ☆171Updated 2 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆111Updated last year
- ☆171Updated last year
- Document Q&A on Wikipedia articles using LLMs☆80Updated 2 years ago
- ☆13Updated 2 years ago
- Fast & more realistic evaluation of chat language models. Includes leaderboard.☆190Updated 2 years ago
- ☆207Updated 2 years ago
- Reward Model framework for LLM RLHF☆62Updated 2 years ago