autonomi-ai / nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
☆143 · Updated 11 months ago
Alternatives and similar repositories for nos
Users that are interested in nos are comparing it to the libraries listed below
- ☆198 · Updated last year
- Maybe the new state-of-the-art vision model? We'll see 🤷‍♂️ ☆163 · Updated last year
- Run PaliGemma in real time ☆131 · Updated last year
- Vector database with support for late interaction and token-level embeddings. ☆54 · Updated 8 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 · Updated 10 months ago
- Action library for AI agents ☆214 · Updated 2 months ago
- An implementation of Self-Extend, to expand the context window via grouped attention ☆119 · Updated last year
- ☆89 · Updated 8 months ago
- Run GGML models with Kubernetes. ☆172 · Updated last year
- An implementation of bucketMul LLM inference ☆217 · Updated 11 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 · Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 · Updated last year
- ☆39 · Updated last year
- Fast parallel LLM inference for MLX ☆189 · Updated 10 months ago
- LLM family chart ☆51 · Updated last year
- Chat Markup Language conversation library ☆55 · Updated last year
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php. ☆143 · Updated last week
- GRDN.AI app for garden optimization ☆70 · Updated last year
- A fast batching API to serve LLM models ☆181 · Updated last year
- Fine-tuning and serving LLMs on any cloud ☆90 · Updated last year
- AI-to-AI Testing | Simulation framework for LLM-based applications ☆137 · Updated last year
- ☆137 · Updated last year
- Embed anything. ☆28 · Updated last year
- Pixeltable: AI data infrastructure providing a declarative, incremental approach for multimodal workloads. ☆241 · Updated this week
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆58 · Updated last month
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆136 · Updated 2 weeks ago
- Python client library for improving your LLM app accuracy ☆98 · Updated 3 months ago
- ☆114 · Updated 5 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 · Updated last year
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆130 · Updated last month