autonomi-ai / nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
☆144 Updated last year
Alternatives and similar repositories for nos
Users who are interested in nos compare it to the libraries listed below.
- Vector Database with support for late interaction and token level embeddings. ☆55 Updated last month
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆51 Updated last week
- ☆199 Updated last year
- run paligemma in real time ☆131 Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 Updated last year
- GRDN.AI app for garden optimization ☆70 Updated last year
- Python client library for improving your LLM app accuracy ☆98 Updated 5 months ago
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 Updated last year
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php. ☆150 Updated last month
- Foyle is a copilot to help developers deploy and operate their applications. ☆131 Updated 4 months ago
- LLaVA server (llama.cpp). ☆181 Updated last year
- ☆38 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Replace expensive LLM calls with finetunes automatically ☆65 Updated last year
- ☆123 Updated last year
- The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration" ☆56 Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 Updated last month
- Routing on Random Forest (RoRF) ☆187 Updated 10 months ago
- ☆39 Updated last year
- Efficient vector database for hundreds of millions of embeddings. ☆207 Updated last year
- Full finetuning of large language models without large memory requirements ☆94 Updated last year
- Fine-tuning and serving LLMs on any cloud ☆90 Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- ☆111 Updated last year
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆38 Updated last year
- ☆137 Updated last year
- Solving data for LLMs - Create quality synthetic datasets! ☆150 Updated 6 months ago
- A fast batching API to serve LLM models ☆185 Updated last year
- Synthetic Data for LLM Fine-Tuning ☆120 Updated last year
- GPU prices aggregator for cloud providers ☆40 Updated this week