browser-use / eval
☆42 Updated 10 months ago
Alternatives and similar repositories for eval
Users who are interested in eval are comparing it to the libraries listed below.
- ☆33 Updated 2 years ago
- Voyage AI Official Python Library ☆83 Updated last week
- Code interpreter support for o1 ☆31 Updated last year
- A collection of Compound Retrieval Systems implemented with DSPy and Weaviate. ☆91 Updated last month
- proof-of-concept of Cursor's Instant Apply feature ☆87 Updated last year
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆49 Updated last year
- A library to extract the main content from html. Developed for information on LLM and for feeding data into LangChain and LlamaIndex. ☆51 Updated last year
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆58 Updated 9 months ago
- A function to do all ☆35 Updated last year
- Agent that routes to different tools - LLM classifier SDK ☆45 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 Updated 8 months ago
- Routing on Random Forest (RoRF) ☆229 Updated last year
- DSPY on action with OpenSource LLMs. ☆102 Updated last year
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆72 Updated last month
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆89 Updated last week
- Simple examples using Argilla tools to build AI ☆56 Updated last year
- Aider's refactoring benchmark exercises based on popular python repos ☆78 Updated last year
- Natural Language Interfaces Powered by LLMs ☆95 Updated last year
- ☆18 Updated 11 months ago
- Official Repo for CRMArena and CRMArena-Pro ☆126 Updated last month
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated last month
- Harness used to benchmark aider against SWE Bench benchmarks ☆78 Updated last year
- A daemon that makes a desktop OS accessible to AI agents ☆36 Updated 6 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 10 months ago
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆41 Updated 2 years ago
- Query language for blending SQL and LLMs across structured + unstructured data, with type constraints. ☆121 Updated last week
- Simple Graph Memory for AI applications ☆89 Updated 6 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated 8 months ago
- ☆83 Updated 3 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆43 Updated last year