kixlab / EvalLM
Interactive environment for evaluating LLM prompts on natural language criteria.
☆25 Updated last year
Alternatives and similar repositories for EvalLM
Users who are interested in EvalLM are comparing it to the libraries listed below.
- Routing on Random Forest (RoRF) ☆239 Updated last year
- A toolkit for building computer use AI agents ☆182 Updated 7 months ago
- Python SDK for running evaluations on LLM generated responses ☆295 Updated 8 months ago
- The Official Exa Python Package ☆191 Updated last week
- 📚 Benchmark your browser agent on ~2.5k READ and ACTION based tasks ☆85 Updated 6 months ago
- Together Open Deep Research ☆358 Updated 9 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆90 Updated last month
- Synthetic Data for LLM Fine-Tuning ☆120 Updated 2 years ago
- low-code multi-agent automation framework ☆264 Updated 3 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆140 Updated 5 months ago
- The easiest, and fastest way to run AI-generated Python code safely ☆360 Updated last year
- ☆76 Updated last year
- A simple Python sandbox for helpful LLM data agents ☆305 Updated last year
- Testing and evaluation framework for voice agents ☆162 Updated 8 months ago
- ☆50 Updated last year
- Tutorial for building LLM router ☆244 Updated last year
- Deep Research for your internal data ☆351 Updated 8 months ago
- Natural Language Interfaces Powered by LLMs ☆95 Updated last year
- A lightweight express.js server implementing OpenAI’s Responses API, built on top of Chat Completions, powered by Hugging Face Inference … ☆223 Updated 6 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆116 Updated 6 months ago
- Scrapybara Python SDK ☆73 Updated 5 months ago
- A python implementation of priompt - a neat way of managing context from diverse sources for LLM applications. ☆115 Updated 6 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆99 Updated 4 months ago
- A user interface for DSPy ☆210 Updated 4 months ago
- Task-based Agentic Framework using StrictJSON as the core ☆460 Updated 2 months ago
- An Awesome list of curated DSPy resources. ☆511 Updated last month
- ☆177 Updated 11 months ago
- Claude Deep Research config for Claude Code. ☆226 Updated 10 months ago
- Open source AI analyst powered by E2B. Analyze your CSV files with Llama 3.1 and create interactive charts. ☆350 Updated this week
- Solving data for LLMs - Create quality synthetic datasets! ☆151 Updated last year