fixie-ai / ai-benchmarks
Benchmarking suite for popular AI APIs
☆87Updated 5 months ago
Alternatives and similar repositories for ai-benchmarks
Users who are interested in ai-benchmarks are comparing it to the repositories listed below
- Benchmark suite for LLMs from Fireworks.ai☆76Updated last week
- Website with current metrics on the fastest AI models.☆41Updated 8 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform☆87Updated 3 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆87Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference☆61Updated 2 months ago
- ☆465Updated last year
- A collection of all available inference solutions for the LLMs☆91Updated 4 months ago
- Just a bunch of benchmark logs for different LLMs☆119Updated 11 months ago
- A framework for evaluating function calls made by LLMs☆37Updated 11 months ago
- ☆120Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research☆211Updated last week
- Self-host LLMs with LMDeploy and BentoML☆21Updated last week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆108Updated 3 months ago
- ☆157Updated last year
- An implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- ☆199Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last year
- ☆62Updated 3 months ago
- Data preparation code for Amber 7B LLM☆91Updated last year
- Experiments on speculative sampling with Llama models☆128Updated 2 years ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆63Updated 10 months ago
- An experimental and alternative approach to Finetuning and RAG.☆35Updated last year
- Tutorial for building LLM router☆217Updated last year
- Deployment of a light and full OpenAI API for production with vLLM, supporting /v1/embeddings with all embedding models.☆42Updated last year
- ☆52Updated last year
- Simple examples using Argilla tools to build AI☆53Updated 8 months ago
- [ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation☆171Updated last year
- Self-host LLMs with vLLM and BentoML☆134Updated 2 weeks ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆80Updated 2 months ago
- An OpenAI Completions API compatible server for NLP transformers models☆65Updated last year