flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafted for accuracy, speed, and customization.
☆70 · Updated 7 months ago
Alternatives and similar repositories for flow-judge
Users who are interested in flow-judge are comparing it to the libraries listed below.
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 · Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated 11 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆80 · Updated 3 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆61 · Updated 11 months ago
- Simple examples using Argilla tools to build AI ☆53 · Updated 7 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 · Updated 8 months ago
- A framework for evaluating function calls made by LLMs ☆37 · Updated 10 months ago
- ☆30 · Updated 11 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆62 · Updated 10 months ago
- ☆66 · Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Updated 9 months ago
- ☆45 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆48 · Updated last year
- tiny_fnc_engine is a minimal python library that provides a flexible engine for calling functions extracted from a LLM. ☆38 · Updated 9 months ago
- ☆17 · Updated 6 months ago
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ☆59 · Updated last year
- Embed anything. ☆28 · Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆40 · Updated last year
- Simple Graph Memory for AI applications ☆86 · Updated last month
- ☆36 · Updated 4 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆78 · Updated 4 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆35 · Updated last year
- LlamaWorksDB is a Retrieval Augmented Generation (RAG) product designed to interact with the documentation of various products such as Ll… ☆16 · Updated last year
- A project that enables identification and classification of an intent of a message with dynamic labels ☆41 · Updated 6 months ago
- Dynamic Metadata based RAG Framework ☆75 · Updated 10 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆106 · Updated 2 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 4 months ago
- ☆96 · Updated last week
- LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores a… ☆58 · Updated this week
- ☆47 · Updated 4 months ago