stephenleo / llm-structured-output-benchmarks
Benchmark various LLM Structured Output frameworks (Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc.) on tasks like multi-label classification, named entity recognition, and synthetic data generation.
☆179Updated last year
Alternatives and similar repositories for llm-structured-output-benchmarks
Users who are interested in llm-structured-output-benchmarks compare it to the libraries listed below.
- ☆148Updated last year
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…☆114Updated last year
- ☆241Updated 6 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆124Updated last month
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳☆347Updated 6 months ago
- Domain Adapted Language Modeling Toolkit - E2E RAG☆334Updated last year
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph☆147Updated last year
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.☆444Updated last year
- Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query.☆167Updated last year
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j.☆197Updated last year
- A Lightweight Library for AI Observability☆252Updated 9 months ago
- Function Calling Benchmark & Testing☆92Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆51Updated last year
- awesome synthetic (text) datasets☆314Updated 3 weeks ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 8 months ago
- Generalist and Lightweight Model for Text Classification☆166Updated last week
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆118Updated 8 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆78Updated last year
- Let's build better datasets, together!☆264Updated 11 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform☆91Updated 3 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"☆137Updated 2 years ago
- Synthetic Data for LLM Fine-Tuning☆119Updated 2 years ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.☆116Updated 4 months ago
- A Python library to chunk/group your texts based on semantic similarity.☆101Updated last year
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.☆319Updated 5 months ago
- ☆125Updated 9 months ago
- A small library of LLM judges☆306Updated 4 months ago
- Chunk your text more accurately using gpt4o-mini☆44Updated last year
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- This repository implements the chain of verification paper by Meta AI☆182Updated 2 years ago