This repository is designed for deploying and managing server processes that serve either embeddings, using the Infinity Embedding model, or Large Language Models, using an OpenAI-compatible vLLM server.
☆26 · Mar 6, 2025 · Updated last year
Alternatives and similar repositories for llm-hosting
Users who are interested in llm-hosting are comparing it to the libraries listed below.
- Tracking the history of the FARA data from https://www.justice.gov/nsd-fara ☆16 · Aug 3, 2023 · Updated 2 years ago
- A simple, modern, full-stack toolkit for Python ☆39 · Oct 18, 2024 · Updated last year
- Agent based market simulation ☆15 · Aug 10, 2024 · Updated last year
- A Python engine for playing D&D 5e ☆23 · Updated this week
- Git scrapers for scraping the fediverse ☆22 · Updated this week
- ☆13 · Feb 22, 2024 · Updated 2 years ago
- Interface for interacting with Gradient AI in Python ☆15 · Jun 28, 2024 · Updated 2 years ago
- WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included ☆17 · Oct 2, 2024 · Updated last year
- A Framework For Intelligence Farming ☆16 · Apr 3, 2025 · Updated last year
- Pin files for contextual, codebase-level AI assistance. ☆16 · Jul 11, 2024 · Updated last year
- Apache Arrow-compatible space-efficient "tape" class in pure Rust to be used with StringZilla for GPU, NUMA, and disk transfers of variab… ☆31 · Nov 21, 2025 · Updated 7 months ago
- marimo + pixi starter template ☆18 · Jan 31, 2025 · Updated last year
- Tui Utility to test REST APIs ☆13 · Nov 20, 2023 · Updated 2 years ago
- Wrap-up around RinteRface templates ☆11 · Apr 10, 2019 · Updated 7 years ago
- ☆15 · Oct 19, 2024 · Updated last year
- GyShell is a strong AI-agent-powered terminal with SSH connection support. ☆38 · Jun 27, 2026 · Updated last week
- Obsidian plugin that displays the contents of the Arc sidebar right beside your notes ☆14 · Jan 26, 2024 · Updated 2 years ago
- TUI command launcher ☆31 · Jun 14, 2026 · Updated 2 weeks ago
- This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file ☆16 · Jul 26, 2024 · Updated last year
- Enhanced note taking for AI Agents with supervision. ☆46 · Nov 24, 2025 · Updated 7 months ago
- Hugging Face RoBERTa with Flash Attention 2 ☆24 · Sep 14, 2025 · Updated 9 months ago
- Modal LLM LLama.cpp based model deployment as part of series of Model as a Service (MaaS) ☆17 · Mar 23, 2026 · Updated 3 months ago
- Under the paving stones, the BEACH ☆17 · Updated this week
- ☆16 · Dec 16, 2024 · Updated last year
- The RunBugRun dataset of executable bugs ☆25 · Sep 24, 2025 · Updated 9 months ago
- MoodCat classifies the mood of English sentences. ☆14 · Jun 19, 2022 · Updated 4 years ago
- A best-of list for all awesome projects written in textual ☆19 · Jun 6, 2024 · Updated 2 years ago
- DataKit is a browser-based data analysis platform that processes multi-gigabyte files locally. All processing happens in your browser - n… ☆321 · Jun 18, 2026 · Updated 2 weeks ago
- Intentional is an open-source framework to build reliable LLM chatbots that actually talk and behave as you expect. ☆12 · Dec 31, 2024 · Updated last year
- ClaudeWatch - A tool to track active Claude sessions with Warp ☆79 · Mar 23, 2026 · Updated 3 months ago
- Declarative AI Pipelines ☆22 · Oct 2, 2024 · Updated last year
- ☆15 · Feb 20, 2025 · Updated last year
- REST API for Large Language Models using FastAPI, Redis and LiteLLM ☆14 · Nov 30, 2023 · Updated 2 years ago
- MCP Server for QA Sphere TMS ☆22 · Updated this week
- Minimalistic Go statusline for Claude Code ☆52 · Jun 1, 2026 · Updated last month
- Local FAISS vector store as an MCP server - Agent Memory, drop-in local semantic search for Claude / Copilot / Agents. ☆32 · Apr 24, 2026 · Updated 2 months ago
- Machinery data, made easy. Easily download and prepare common industrial datasets. ☆23 · Feb 13, 2024 · Updated 2 years ago
- Agent teams for OpenCode. Run multiple agents in parallel with messaging, shared tasks, and coordinated execution. ☆156 · Jun 18, 2026 · Updated 2 weeks ago
- ☆11 · Jun 9, 2022 · Updated 4 years ago