definitive-io / human-eval-sampling-benchmark
OpenAI's human-eval sampling benchmark
☆13Updated 2 years ago
Alternatives and similar repositories for human-eval-sampling-benchmark
Users that are interested in human-eval-sampling-benchmark are comparing it to the libraries listed below
- Turn a GitHub repo's contents into a big prompt for long-context models like Claude 3 Opus.☆219Updated 11 months ago
- ☆172Updated last year
- Your automated SWE fleet to get your tickets from the Backlog to Prod!☆98Updated last year
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application.☆124Updated 11 months ago
- auto fine tune of models with synthetic data☆78Updated last year
- OpenAI's Realtime API minus the enterprise bloat☆48Updated last year
- ⛓️ build cognitive systems, pythonic☆339Updated last year
- Code Indexer Loop is a Python library for indexing and retrieving source code files through an integrated vector database that's continuo…☆176Updated last year
- ☆291Updated 8 months ago
- Scrapybara Python SDK☆73Updated 5 months ago
- Chat with your git repo☆160Updated 2 years ago
- 🔁 Code iteration tool running on Groq☆76Updated last year
- Announcing Instructor Cloud☆38Updated last year
- Simple AI coder that can do most of my work for me, including working on himself.☆254Updated 10 months ago
- Prompt engineering, automated.☆352Updated 9 months ago
- Automatically reformat any JSON into any schema with AI☆338Updated 10 months ago
- Letting Claude Code develop his own MCP tools :)☆123Updated 11 months ago
- ☆47Updated last year
- Edge full-stack LLM platform. Written in Rust☆384Updated last year
- A toolkit for building computer use AI agents☆182Updated 7 months ago
- the simplest self-building general autonomous agent☆332Updated last year
- Build robust, production grade function calling assistants that work. Declarative and extensible. Built on top of LangChain ⚡️☆76Updated last year
- Action library for AI Agent☆229Updated 10 months ago
- Like Claude Artifacts but lives in a single static HTML page which you can use with any language model of your choosing☆213Updated 11 months ago
- ☆191Updated last year
- Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework.☆93Updated last year
- Improve your questions! The AI for Inquiry - QuestionImprover Agent is an LLM-driven “tool for thought” designed to enhance the depth and…☆154Updated 11 months ago
- An open-source Discord bot, created using LlamaIndex, that - Listens to your server conversations, continuously learns from them & answe…☆76Updated 2 years ago
- ☆38Updated 2 years ago
- Demo of AI chatbot that predicts user message to generate response quickly.☆105Updated last year