anyscale / factuality-eval
Library for IPython notebooks for evaluating factuality.
☆50Updated last year
Alternatives and similar repositories for factuality-eval
Users who are interested in factuality-eval are comparing it to the libraries listed below.
- ☆78Updated 11 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files.☆37Updated last year
- Supervised instruction finetuning for LLMs with HF Trainer and DeepSpeed☆35Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access…☆114Updated last year
- ☆77Updated 11 months ago
- Leverage your LangChain trace data for fine tuning☆41Updated 9 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆99Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆66Updated 2 years ago
- Sample notebooks and prompts for LLM evaluation☆126Updated last week
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection☆100Updated last year
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co…☆109Updated 9 months ago
- Writing Blog Posts with Generative Feedback Loops!☆47Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 10 months ago
- A Chainlit App Used to Showcase: Async, Caching, Additional Chainlit Methods, and more!☆11Updated 7 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆31Updated 8 months ago
- experiments with inference on llama☆104Updated 11 months ago
- Iterate fast on your RAG pipelines☆23Updated 2 months ago
- ☆78Updated 2 years ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.☆109Updated 8 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated 2 years ago
- Framework for building and maintaining self-updating prompts for LLMs☆63Updated 11 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆109Updated last month
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.☆48Updated last year
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models…☆36Updated 3 years ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models☆18Updated 2 years ago
- ☆88Updated last year
- ☆93Updated last year
- ☆52Updated last year
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models.☆30Updated 3 years ago
- Using LlamaIndex with Ray for productionizing LLM applications☆71Updated last year