zeno-ml / zeno-hub
AI Evaluation Platform
☆47 Updated 6 months ago
Alternatives and similar repositories for zeno-hub
Users who are interested in zeno-hub are comparing it to the libraries listed below.
- Chat Markup Language conversation library ☆55 Updated last year
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- An attribution library for LLMs ☆46 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆110 Updated last year
- ReLM is a Regular Expression engine for Language Models ☆107 Updated 2 years ago
- Writing Blog Posts with Generative Feedback Loops! ☆50 Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆66 Updated last year
- Tools to make language models a bit easier to use ☆60 Updated 3 weeks ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆112 Updated 6 months ago
- Sphynx Hallucination Induction ☆53 Updated 10 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆49 Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆32 Updated last year
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54 Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated last year
- ☆87 Updated last week
- A framework for evaluating function calls made by LLMs ☆39 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆61 Updated 7 months ago
- ☆45 Updated 2 years ago
- Verbosity control for AI agents ☆64 Updated last year
- ☆53 Updated 10 months ago
- AI Data Management & Evaluation Platform ☆216 Updated 2 years ago
- Python client library for improving your LLM app accuracy ☆97 Updated 10 months ago
- Evaluating LLMs with fewer examples ☆169 Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆49 Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 Updated last year
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆43 Updated last year
- ☆80 Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆107 Updated 2 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆89 Updated 2 weeks ago
- Official Repo for CRMArena and CRMArena-Pro ☆126 Updated last month