anyscale / factuality-eval
Library of IPython notebooks for evaluating factuality.
★51 · Updated last year
Alternatives and similar repositories for factuality-eval
Users interested in factuality-eval are comparing it to the libraries listed below.
- Datasets and models for instruction-tuning · ★238 · Updated last year
- ★78 · Updated last year
- Sample notebooks and prompts for LLM evaluation · ★138 · Updated 2 months ago
- ★80 · Updated 2 years ago
- ★207 · Updated last year
- Notebooks for training universal 0-shot classifiers on many different tasks · ★135 · Updated 7 months ago
- Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models. · ★223 · Updated 2 years ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. · ★37 · Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… · ★114 · Updated last year
- ★167 · Updated this week
- Reference-free automatic summarization evaluation with potential hallucination detection · ★102 · Updated last year
- Find and fix bugs in natural language machine learning models using adaptive testing. · ★184 · Updated last year
- Fast & more realistic evaluation of chat language models. Includes leaderboard. · ★188 · Updated last year
- Supervised instruction finetuning for LLMs with the HF trainer and DeepSpeed · ★35 · Updated 2 years ago
- ★90 · Updated last year
- Domain Adapted Language Modeling Toolkit - E2E RAG · ★327 · Updated 9 months ago
- Classy-fire is a multiclass text classification approach leveraging OpenAI LLM APIs optimally using clever parameter tuning and promp… · ★79 · Updated last year
- ★87 · Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. · ★113 · Updated 2 weeks ago
- ★56 · Updated last month
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act · ★94 · Updated last year
- ★206 · Updated last year
- Python package for estimating confidence intervals (CIs) for metrics evaluated by LLM judges · ★36 · Updated 2 months ago
- ★47 · Updated 2 years ago
- AI Data Management & Evaluation Platform · ★215 · Updated last year
- Simple retrieval from LLMs at various context lengths to measure accuracy · ★102 · Updated last year
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and library created by… · ★32 · Updated 11 months ago
- Course for Interpreting ML Models · ★52 · Updated 2 years ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker · ★114 · Updated last month
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. · ★434 · Updated last year