friendliai / friendli-client
Friendli: the fastest serving engine for generative AI
☆46 Updated 4 months ago
Alternatives and similar repositories for friendli-client
Users who are interested in friendli-client are comparing it to the libraries listed below.
- FMO (Friendli Model Optimizer) ☆12 Updated 5 months ago
- ☆46 Updated 9 months ago
- FriendliAI Model Hub ☆91 Updated 3 years ago
- Toy O ☆16 Updated 8 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 5 months ago
- Welcome to PeriFlow CLI ☁︎ ☆12 Updated last year
- Evaluate your LLM apps, RAG pipeline, any generated text, and more! ☆1 Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆30 Updated this week
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆133 Updated this week
- A collection of all available inference solutions for the LLMs ☆89 Updated 3 months ago
- Tiny configuration for Triton Inference Server ☆45 Updated 4 months ago
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines. ☆41 Updated this week
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation. ☆72 Updated last month
- Efficient fine-tuning for ko-llm models ☆182 Updated last year
- OSLO: Open Source for Large-scale Optimization ☆174 Updated last year
- Self-host LLMs with LMDeploy and BentoML ☆20 Updated 2 months ago
- Efficient and Scalable Estimation of Tool Representations in Vector Space ☆23 Updated 9 months ago
- Build complex LLM Applications with Python Dictionary ☆40 Updated 7 months ago
- MIST: High-performance IoT Stream Processing ☆17 Updated 6 years ago
- Tutorial to get started with SkyPilot! ☆57 Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆60 Updated last month
- Dotfile management with bare git ☆19 Updated 3 weeks ago
- ☆12 Updated last year
- Newsletter bot for 🤗 Daily Papers ☆121 Updated this week
- An extended project of the LLM Compiler paper, focusing on developing LLM-based Autonomous Agents. ☆23 Updated 7 months ago
- The Universe of Evaluation. All about the evaluation for LLMs. ☆224 Updated 10 months ago
- ☆39 Updated last month
- Performant kernels for symmetric tensors ☆15 Updated 9 months ago
- Modular and structured prompt caching for low-latency LLM inference ☆94 Updated 6 months ago
- AI Assistant running within your browser. ☆67 Updated 6 months ago