friendliai / friendli-client
Friendli: the fastest serving engine for generative AI
☆42 · Updated 2 months ago
Alternatives and similar repositories for friendli-client:
Users interested in friendli-client are comparing it to the libraries listed below.
- FMO (Friendli Model Optimizer) ☆12 · Updated last week
- ☆42 · Updated 4 months ago
- FriendliAI Model Hub ☆89 · Updated 2 years ago
- Welcome to PeriFlow CLI ☁︎ ☆12 · Updated last year
- ☆11 · Updated last week
- Nexusflow function call, tool use, and agent benchmarks. ☆18 · Updated last month
- A collection of all available inference solutions for the LLMs ☆74 · Updated 4 months ago
- ☆21 · Updated this week
- ⚡️ Asynchronous framework for ChatGPT API 🤖 ☆21 · Updated last year
- Tiny configuration for Triton Inference Server ☆44 · Updated last week
- 1-Click is all you need. ☆59 · Updated 8 months ago
- Official repository for EXAONE 3.5 built by LG AI Research ☆101 · Updated last month
- High-performance vector search engine with no loss of accuracy through GPU and dynamic placement ☆28 · Updated last year
- How much energy do GenAI models consume? ☆41 · Updated 3 months ago
- The Universe of Evaluation. All about the evaluation for LLMs. ☆221 · Updated 6 months ago
- ☆31 · Updated last year
- ☆25 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆47 · Updated this week
- ☆20 · Updated last year
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆46 · Updated this week
- Manage histories of LLM-applied applications ☆88 · Updated last year
- ☆25 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆15 · Updated this week
- Dotfile management with bare git ☆19 · Updated last month
- ☆102 · Updated last year
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference ☆114 · Updated 10 months ago
- ☆40 · Updated this week
- MIST: High-performance IoT Stream Processing ☆17 · Updated 5 years ago
- Self-host LLMs with vLLM and BentoML ☆79 · Updated this week