browser-use / eval
☆42 Updated 11 months ago
Alternatives and similar repositories for eval
Users who are interested in eval are comparing it to the libraries listed below.
- Voyage AI Official Python Library ☆89 Updated 3 weeks ago
- ☆33 Updated 2 years ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 Updated 9 months ago
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated 2 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆89 Updated 3 weeks ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆96 Updated 3 months ago
- Routing on Random Forest (RoRF) ☆237 Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 11 months ago
- ☆86 Updated last year
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆43 Updated last year
- A collection of Compound Retrieval Systems implemented with DSPy and Weaviate. ☆92 Updated 2 months ago
- Website with current metrics on the fastest AI models. ☆42 Updated last year
- Natural Language Interfaces Powered by LLMs ☆95 Updated last year
- Query language for blending SQL and LLMs across structured + unstructured data, with type constraints. ☆125 Updated this week
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆72 Updated 2 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆59 Updated 10 months ago
- ☆18 Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆127 Updated 2 months ago
- proof-of-concept of Cursor's Instant Apply feature ☆88 Updated last year
- Embedding models from Jina AI ☆65 Updated last year
- DSPY on action with OpenSource LLMs. ☆102 Updated last year
- Anthropic Computer Use with Modal Sandboxes ☆42 Updated last year
- Training setup for Langchain's Open Deep Research ☆74 Updated 4 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 Updated last year
- ☆47 Updated last year
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆82 Updated 10 months ago
- The official repository for the Anything But Wrappers: Llama Edition Hackameetup ☆22 Updated 2 years ago
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- ScreenSuite - The most comprehensive benchmarking suite for GUI Agents! ☆133 Updated 3 months ago
- Verbosity control for AI agents ☆66 Updated last year