openai / evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
☆17,128Updated 9 months ago
Alternatives and similar repositories for evals
Users who are interested in evals are comparing it to the libraries listed below.
- The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.☆21,231Updated last year
- ☆21,873Updated 11 months ago
- tiktoken is a fast BPE tokeniser for use with OpenAI's models.☆16,161Updated last week
- 🦜🔗 Build context-aware reasoning applications☆117,269Updated this week
- JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf☆24,400Updated 2 months ago
- LlamaIndex is the leading framework for building LLM-powered agents over your data.☆44,665Updated last week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.☆39,141Updated 4 months ago
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)☆12,479Updated this week
- Instruct-tune LLaMA on consumer hardware☆18,965Updated last year
- AI PDF chatbot agent built with LangChain & LangGraph☆16,043Updated 7 months ago
- A guidance language for controlling large language models.☆20,847Updated this week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index.☆7,794Updated 3 months ago
- Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning …☆4,546Updated 2 months ago
- Examples and guides for using the OpenAI API☆68,429Updated last week
- StableLM: Stability AI Language Models☆15,792Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data.☆30,170Updated last year
- NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.☆5,138Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.☆40,386Updated this week
- An LLM playground you can run on your laptop☆6,362Updated 3 weeks ago
- Home of StarCoder: fine-tuning & inference!☆7,463Updated last year
- ☆3,374Updated 2 years ago
- Get a ChatGPT plugin up and running in under 5 minutes!☆4,241Updated last year
- ☆6,113Updated last week
- 🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.☆35,070Updated 5 months ago
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.☆11,840Updated last week
- Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)☆25,739Updated last year
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading☆9,808Updated last year
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset☆7,524Updated 2 years ago
- Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM☆7,865Updated 3 weeks ago
- [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models☆5,612Updated 9 months ago