skypilot-org / skypilot
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
★9,171 · Updated this week
Alternatives and similar repositories for skypilot
Users who are interested in skypilot are comparing it to the libraries listed below.
- Large Language Model Text Generation Inference ★10,720 · Updated 3 weeks ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ★9,863 · Updated last year
- Structured Outputs ★13,215 · Updated this week
- A language for constraint-guided and efficient LLM programming. ★4,117 · Updated 7 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ★3,661 · Updated 7 months ago
- Go ahead and axolotl questions ★11,050 · Updated this week
- Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less. ★8,455 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,807 · Updated last year
- PyTorch native post-training library ★5,646 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★67,159 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ★13,087 · Updated this week
- NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. ★5,459 · Updated last week
- A blazing fast inference solution for text embeddings models ★4,368 · Updated this week
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud. ★12,027 · Updated 2 weeks ago
- A guidance language for controlling large language models. ★21,130 · Updated this week
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ★7,166 · Updated last year
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ★12,669 · Updated last week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ★7,896 · Updated 5 months ago
- SGLang is a high-performance serving framework for large language models and multimodal models. ★22,190 · Updated this week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ★28,094 · Updated this week
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ★12,588 · Updated this week
- Development repository for the Triton language and compiler ★18,041 · Updated this week
- Running large language models on a single GPU for throughput-oriented scenarios. ★9,383 · Updated last year
- Simple, safe way to store and distribute tensors ★3,581 · Updated 2 weeks ago
- Accessible large language models via k-bit quantization for PyTorch. ★7,867 · Updated 3 weeks ago
- Training and serving large-scale neural networks with auto parallelization. ★3,174 · Updated 2 years ago
- Open-source search and retrieval database for AI applications. ★25,369 · Updated this week
- A Bulletproof Way to Generate Structured JSON from Language Models ★4,865 · Updated last year
- A fast inference library for running LLMs locally on modern consumer-class GPUs ★4,403 · Updated last month
- Tensor library for machine learning ★13,784 · Updated last week