anyscale / factuality-eval
Library of IPython notebooks for evaluating factuality.
☆50 Updated last year
Related projects:
- ☆72 Updated 3 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆93 Updated 5 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆27 Updated 3 weeks ago
- Leverage your LangChain trace data for fine tuning ☆36 Updated last month
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated 8 months ago
- A python package that provides a custom streamlit connection to query data from weaviate, the AI native vector database ☆49 Updated last month
- ☆71 Updated 3 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆41 Updated 6 months ago
- Supervised instruction finetuning for LLMs with HF Trainer and DeepSpeed ☆32 Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆33 Updated 5 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆100 Updated 5 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 2 months ago
- ☆75 Updated last year
- Web App for generating synthetic data ☆45 Updated 2 weeks ago
- Retrieval Augmented Generation applications ☆27 Updated 11 months ago
- Chunk your text more accurately using gpt4o-mini ☆37 Updated last month
- Classy-fire is a multiclass text classification approach leveraging OpenAI LLM model APIs optimally using clever parameter tuning and promp… ☆72 Updated 9 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆91 Updated last week
- ☆44 Updated 6 months ago
- Repository of the code base for the KT Generation process that we worked on at the Google Cloud and Searce GenAI Hackathon. ☆74 Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆40 Updated 9 months ago
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆99 Updated 7 months ago
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker. ☆45 Updated 8 months ago
- This is the repo for the container that holds the models for the text2vec-transformers module ☆38 Updated last month
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries. ☆68 Updated 8 months ago
- ☆15 Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆89 Updated last week
- Nearest Neighbors vs Approximate Nearest Neighbors ☆24 Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆60 Updated last year
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆57 Updated 7 months ago