autonomi-ai / nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
☆135 Updated 8 months ago
Alternatives and similar repositories for nos:
Users who are interested in nos are comparing it to the libraries listed below.
- Vector Database with support for late interaction and token level embeddings. ☆52 Updated 4 months ago
- ☆199 Updated last year
- ☆86 Updated 4 months ago
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php. ☆116 Updated this week
- Fast parallel LLM inference for MLX ☆163 Updated 7 months ago
- Replace expensive LLM calls with finetunes automatically ☆62 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Start a server from the MLX library. ☆173 Updated 6 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆121 Updated 2 months ago
- GRDN.AI app for garden optimization ☆70 Updated last year
- run paligemma in real time ☆130 Updated 9 months ago
- Run GGML models with Kubernetes. ☆174 Updated last year
- Action library for AI Agent ☆209 Updated this week
- A library for building software agents using behavior trees and language models. ☆80 Updated 2 weeks ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 Updated 9 months ago
- ☆112 Updated 6 months ago
- Full finetuning of large language models without large memory requirements ☆93 Updated last year
- 🤖 Headless IDE for AI agents ☆162 Updated last week
- Foyle is a copilot to help developers deploy and operate their applications. ☆121 Updated 2 weeks ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆82 Updated last week
- Routing on Random Forest (RoRF) ☆114 Updated 4 months ago
- A fast batching API to serve LLM models ☆180 Updated 9 months ago
- Use context-free grammars with an LLM ☆168 Updated 10 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ ☆160 Updated last year
- Solving data for LLMs - Create quality synthetic datasets! ☆145 Updated last month
- An implementation of bucketMul LLM inference ☆215 Updated 7 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 Updated 4 months ago
- Python client library for improving your LLM app accuracy ☆96 Updated last week