zeno-ml / zeno-hub
AI Evaluation Platform
☆45 Updated this week
Alternatives and similar repositories for zeno-hub:
Users who are interested in zeno-hub are comparing it to the libraries listed below.
- Small, simple agent task environments for training and evaluation ☆18 Updated 2 months ago
- Agent computer interface for AI software engineer. ☆22 Updated this week
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆49 Updated 10 months ago
- High level library for batched embeddings generation, blazingly fast web-based RAG and quantized indexes processing ⚡ ☆62 Updated 2 months ago
- Comprehensive analysis of the differences in performance of QLoRA, LoRA, and full finetunes. ☆82 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆20 Updated 2 weeks ago
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆29 Updated 3 months ago
- ☆24 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆116 Updated 5 months ago
- ☆27 Updated 2 months ago
- A framework for evaluating function calls made by LLMs ☆36 Updated 5 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆121 Updated this week
- ☆74 Updated last year
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆100 Updated last month
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing ☆50 Updated 2 months ago
- Evaluating LLMs with CommonGen-Lite ☆87 Updated 9 months ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 Updated last year
- ☆48 Updated last year
- ☆20 Updated last year
- A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ ☆64 Updated last year
- Reasoning by Communicating with Agents ☆23 Updated 3 months ago
- ☆65 Updated 7 months ago
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting. ☆35 Updated 5 months ago
- Tools to make language models a bit easier to use ☆32 Updated last month
- ReLM is a Regular Expression engine for Language Models ☆103 Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 Updated 10 months ago