mostly-ai / mostlyai-qa
Synthetic Data Quality Assurance
⭐60 · Updated this week
Alternatives and similar repositories for mostlyai-qa
Users who are interested in mostlyai-qa are comparing it to the libraries listed below.
- Synthetic Data Engine ⭐64 · Updated this week
- Wonderful Matrices to Build Small Language Models ⭐44 · Updated 5 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ⭐122 · Updated this week
- ⭐19 · Updated 2 months ago
- This repository contains the resource introduced in the paper: "Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis"… ⭐22 · Updated 7 months ago
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ⭐219 · Updated this week
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio… ⭐41 · Updated 2 months ago
- Generate Python Package with Simple Prompts ⭐72 · Updated 7 months ago
- An open-source compliance-centered evaluation framework for Generative AI models ⭐158 · Updated last week
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings ⭐33 · Updated 8 months ago
- ⭐16 · Updated 3 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ⭐25 · Updated last year
- This is the repository for NAACL'25 paper "TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning" ⭐54 · Updated 2 months ago
- syftr is an agent optimizer that helps you find the best agentic workflows for your budget. ⭐284 · Updated this week
- Synthetic Data SDK ✨ ⭐608 · Updated this week
- A curated list of materials on AI guardrails ⭐39 · Updated last month
- A framework for pitting LLMs against each other in an evolving library of games ⭐32 · Updated 2 months ago
- ⭐2 · Updated last month
- A method for steering LLMs to better follow instructions ⭐47 · Updated last week
- ⭐126 · Updated 2 months ago
- A small library of LLM judges ⭐232 · Updated 3 weeks ago
- ⭐94 · Updated 3 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ⭐49 · Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ⭐108 · Updated 3 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ⭐115 · Updated 5 months ago
- ⭐112 · Updated 2 weeks ago
- The official implementation of Preference Data Reward-Augmentation. ⭐17 · Updated 2 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ⭐95 · Updated 3 months ago
- ⭐63 · Updated last year
- ⭐78 · Updated 8 months ago