vast-ai / vast-cli
Vast.ai Python and CLI API client
☆142Updated this week
Alternatives and similar repositories for vast-cli
Users who are interested in vast-cli are comparing it to the libraries listed below.
- 🐍 | Python library for RunPod API and serverless worker SDK.☆228Updated last month
- 🧰 | RunPod CLI for pod management☆304Updated 4 months ago
- Starting point to build your own custom serverless endpoint☆104Updated last week
- inference code for mixtral-8x7b-32kseqlen☆100Updated last year
- ☆141Updated last year
- ☆120Updated 11 months ago
- ☆84Updated last year
- My Swiss Army knife for the vast.ai service☆134Updated 3 months ago
- Helpers and such for working with Lambda Cloud☆51Updated last year
- Generative Agents: Interactive Simulacra of Human Behavior - with Local LLMs☆15Updated last year
- Command-line script for inferencing from models such as MPT-7B-Chat☆101Updated last year
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts☆111Updated last year
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.☆64Updated last year
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆28Updated last year
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces☆136Updated 10 months ago
- GPU accelerated client-side embeddings for vector search, RAG etc.☆66Updated last year
- A community list of common phrases generated by GPT and Claude models☆78Updated last year
- RunPod Serverless Worker for Oobabooga Text Generation API for LLMs☆2Updated 11 months ago
- 🐳 | Dockerfiles for the RunPod container images used for our official templates.☆181Updated 2 weeks ago
- ☆50Updated last month
- DiffusionWithAutoscaler☆29Updated last year
- 2D Positional Embeddings for Webpage Structural Understanding 🦙👀☆95Updated 8 months ago
- ☆156Updated 10 months ago
- ☆61Updated last year
- [WIP] A 🔥 interface for running code in the cloud☆85Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆52Updated last year
- The first AI artist☆32Updated 2 years ago
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations☆33Updated last year
- An endpoint server for efficiently serving quantized open-source LLMs for code.☆55Updated last year