anarchy-ai / llm-speed-benchmark
Benchmarking tool for assessing LLM performance across different hardware
☆13 Updated 9 months ago
Related projects:
- Machine learning library, Distributed training, Deep learning, Reinforcement learning, Models, TensorFlow, PyTorch ☆53 Updated 2 weeks ago
- LLM code editor for backend services ☆10 Updated 2 months ago
- Runner in charge of collecting metrics from LLM inference endpoints for the Unify Hub ☆16 Updated 7 months ago
- Query databases and tables with AI assistance ☆14 Updated 5 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆25 Updated 11 months ago
- Self-host LLMs with vLLM and BentoML ☆62 Updated this week
- Python examples using the bigcode/tiny_starcoder_py 159M model to generate code ☆43 Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 Updated 8 months ago
- ☆10 Updated last month
- Vector Database with support for late interaction and token level embeddings. ☆51 Updated last week
- PyGPTPrompt: A CLI tool that manages context windows for AI models, facilitating user interaction and data ingestion for optimized long-t… ☆28 Updated 4 months ago
- ☆29 Updated 4 months ago
- Embedding models from Jina AI ☆55 Updated 8 months ago
- Demos of ChatGPT's function calling/structured data support. ☆22 Updated 9 months ago
- ☆71 Updated last year
- Embed anything. ☆30 Updated 3 months ago
- The backend behind the LLM-Perf Leaderboard ☆11 Updated 4 months ago
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆30 Updated 3 months ago
- ☆64 Updated 3 months ago
- Tools for LLM agents. ☆55 Updated last month
- AirLLM 70B inference with single 4GB GPU ☆11 Updated last month
- ☆43 Updated 3 weeks ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆125 Updated 3 months ago
- Estimate Your LLM's Token Toll Across Various Platforms and Configurations ☆28 Updated last month
- Machine learning tool-set for Paperspace VMs ☆54 Updated 7 months ago
- Quickly and securely turn any Linux box into a build and deployment assistant ☆26 Updated this week
- UI for Ollama ☆14 Updated last week
- Web UI for working with large language models ☆23 Updated 3 months ago
- ☆10 Updated 2 weeks ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆58 Updated 2 weeks ago