GoogleCloudPlatform / evalbench
EvalBench is a flexible framework designed to measure the quality of generative AI (GenAI) workflows around database-specific tasks.
☆27Updated last week
Alternatives and similar repositories for evalbench
Users who are interested in evalbench are comparing it to the libraries listed below.
- DSPy in action with open-source LLMs.☆103Updated last year
- ☆44Updated last month
- The Open Data QnA python library enables you to chat with your databases by leveraging LLM Agents on Google Cloud. Open Data QnA enables…☆220Updated last week
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM.☆140Updated 5 months ago
- Additional packages (components, document stores and the like) to extend the capabilities of Haystack☆181Updated last week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 9 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆84Updated last year
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs☆297Updated 2 weeks ago
- Demo of knowledge graph creation and Graph RAG with BAML and Kuzu☆73Updated 4 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆90Updated 3 weeks ago
- Official Repo for CRMArena and CRMArena-Pro☆132Updated this week
- ☆45Updated last year
- ☆78Updated 2 months ago
- ☆28Updated 5 months ago
- Make DSPy agentic using a protocol-first approach that supports agent protocols like MCP and A2A☆67Updated 8 months ago
- 💙 Unstructured Data Connectors for Haystack 2.0☆17Updated 2 years ago
- ☆75Updated last year
- A jump start solution using GKE or Cloud Run with Cloud SQL and VertexAI☆60Updated last month
- Framework for building data agent workflows☆84Updated last year
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I…☆121Updated last week
- A Lightweight Library for AI Observability☆255Updated 11 months ago
- Simple examples using Argilla tools to build AI☆57Updated last year
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.☆38Updated 3 weeks ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆34Updated last year
- ☆147Updated last year
- ☆38Updated 2 weeks ago
- Data management with LLMs☆182Updated last year
- ☆76Updated 7 months ago
- A curated list of materials on AI guardrails☆45Updated 8 months ago
- Leverage your LangChain trace data for fine tuning☆46Updated last year