vtuber-plan / olah
Self-hosted Hugging Face mirror service.
☆76Updated this week
Related projects
Alternatives and complementary repositories for olah
- One-click machine learning deployment (LLM, text-to-image and so on) at scale on any cluster (GCP, AWS, Lambda labs, your home lab, or ev…☆239Updated last year
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆53Updated 7 months ago
- ✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and langua☆36Updated last week
- ⚡️ 80x faster language detection with Fasttext | Split text by language for TTS☆120Updated last month
- A Next.js version of Claude Artifacts, inspired by llamacoder☆16Updated last month
- A lightweight script for converting HTML pages to Markdown, with support for code blocks☆71Updated 6 months ago
- Evaluation for AI apps and agents☆35Updated 9 months ago
- Inference code for RVC-Boss/GPT-SoVITS, made developer-friendly.☆11Updated last month
- Self-host LLMs with vLLM and BentoML☆72Updated this week
- A local implementation of the OpenAI Assistants API; myla stands for MY Local Assistant☆49Updated 2 months ago
- Open Source Text Embedding Models with OpenAI Compatible API☆131Updated 3 months ago
- Open-source observability for your LLM application.☆43Updated 2 weeks ago
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware☆78Updated 8 months ago
- A converter and basic tester for RWKV ONNX☆41Updated 9 months ago
- Sentence Transformers API: An OpenAI compatible embedding API server☆35Updated 2 months ago
- Deploy langgenius/dify, an LLM-based app, on Kubernetes with a Helm chart☆201Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆129Updated 4 months ago
- Deploy a lightweight, full OpenAI API for production with vLLM, supporting /v1/embeddings with all embedding models.☆37Updated 3 months ago
- A streamlined, user-friendly JSON streaming preprocessor, crafted in Python.☆73Updated last month
- Evaluating and unaligning Chinese LLM censorship☆26Updated last month
- LLM inference server implementation based on llama.cpp.☆25Updated this week
- Deploy ChatGLM on Modelz☆15Updated last year
- Evaluation of the BM42 sparse indexing algorithm☆60Updated 4 months ago
- CursorCore: Assist Programming through Aligning Anything☆65Updated 3 weeks ago
- A handy tool for prompt engineers: compare the results of multiple prompts across multiple LLMs at the same time☆97Updated last year