instructor-ai / evals
Uses various instructor clients to evaluate the quality and capabilities of extractions and reasoning.
☆47Updated last month
Related projects
Alternatives and complementary repositories for evals
- Verbosity control for AI agents☆59Updated 5 months ago
- Chat Markup Language conversation library☆54Updated 10 months ago
- ☆44Updated last week
- Writing Blog Posts with Generative Feedback Loops!☆43Updated 8 months ago
- Convert a web page to markdown☆54Updated 2 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection☆98Updated 10 months ago
- A framework for evaluating function calls made by LLMs☆35Updated 3 months ago
- Announcing Instructor Cloud☆34Updated 3 months ago
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php.☆86Updated last week
- ☆75Updated 5 months ago
- Using modal.com to process FineWeb-edu data☆19Updated 2 months ago
- Build reliable, secure, and production-ready AI apps easily.☆46Updated 2 weeks ago
- Simple Graph Memory for AI applications☆79Updated 3 months ago
- A Chrome extension that saves conversations with Claude to GitHub Gists or your clipboard.☆64Updated 2 weeks ago
- Tools to make language models a bit easier to use☆30Updated this week
- ☆48Updated last year
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r…☆57Updated 4 months ago
- Routing on Random Forest (RoRF)☆84Updated last month
- ☆3Updated 3 months ago
- Automatic fine-tuning of models with synthetic data☆72Updated 9 months ago
- utilities for loading and running text embeddings with onnx☆39Updated 3 months ago
- Leverage your LangChain trace data for fine tuning☆38Updated 3 months ago
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew…☆58Updated 7 months ago
- converts url content into JSON with a simple prefix☆61Updated 6 months ago
- A strongly typed Python DSL for developing message passing multi agent systems☆50Updated 7 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆97Updated 7 months ago
- Simple examples using Argilla tools to build AI☆40Updated this week
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)☆74Updated 2 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments☆63Updated last month