An end-to-end pipeline to optimize and host an LLM for 100K parallel queries
☆37Jul 6, 2025Updated 11 months ago
Alternatives and similar repositories for llm-scale-deploy-guide
Users who are interested in llm-scale-deploy-guide are comparing it to the libraries listed below.
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 11 months ago
- A straightforward explanation of how DeepSeek R1 works☆18Feb 7, 2025Updated last year
- A step-by-step implementation of RAPTOR-based RAG☆42Sep 1, 2025Updated 10 months ago
- Encountering 14 different naive RAG failures and using KG to solve them☆25Dec 4, 2025Updated 6 months ago
- A list of free and powerful GenAI APIs, with an exploration of their benefits and usage.☆16Feb 3, 2024Updated 2 years ago
- A benchmark dataset designed to support the development and evaluation of large language models (LLMs) for conversational mental health a…☆22Feb 24, 2025Updated last year
- Understanding Large Language Transformer Architecture like a child☆34Apr 3, 2024Updated 2 years ago
- Evaluation of BEIR Datasets using ColBERT retrieval model☆18Mar 4, 2022Updated 4 years ago
- Car Damage Detection: A computer vision project using YOLOv8 and Faster R-CNN to identify and localize car body defects like scratches, d…☆20Jul 23, 2025Updated 11 months ago
- Constrained Decoding of Diffusion LLMs with Context-Free Grammars.☆52Dec 17, 2025Updated 6 months ago
- Self-training LLaVA for medical☆16Nov 3, 2024Updated last year
- Optimizing Dynamic Knowledge Base Using AI Agent☆91Aug 13, 2025Updated 10 months ago
- A Step-by-Step Implementation of Google Veo 3 Architecture from Scratch☆83Jun 16, 2025Updated last year
- ☆20Apr 26, 2024Updated 2 years ago
- Reasoning-based Evaluation and Ranking of Translations.☆19Jun 2, 2026Updated last month
- AI-powered fashion recommendation system leveraging LLMs, embeddings, and retrieval techniques to deliver personalized shopping experienc…☆37Jul 23, 2025Updated 11 months ago
- decontamination☆36Mar 4, 2026Updated 3 months ago
- ☆23Jul 23, 2025Updated 11 months ago
- A lightweight, type-safe workflow engine for TypeScript that helps you create flexible, graph-based execution flows☆28Jun 24, 2025Updated last year
- ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents, NeurIPS 2025☆38Nov 15, 2025Updated 7 months ago
- ☆15Apr 17, 2025Updated last year
- slowly building a set of infinite riddle generators for data-hungry methods☆14Nov 15, 2022Updated 3 years ago
- Run all the tests at the same time with modal.com☆11Mar 2, 2024Updated 2 years ago
- A naive anomaly detection and data visualization tool for F1 onboard telemetry data.☆15Jun 17, 2022Updated 4 years ago
- A Deno-based CLI tool to recursively find and display TODOs in your project☆18Jun 19, 2025Updated last year
- ☆15May 11, 2025Updated last year
- ☆11Apr 22, 2020Updated 6 years ago
- Fork of Flame repo for training of some new stuff in development☆19Jun 23, 2026Updated last week
- ☆32Jun 5, 2025Updated last year
- Embedding language models in probability space via log-likelihood vectors☆20Jun 10, 2026Updated 3 weeks ago
- Document Driven Development☆18Nov 9, 2025Updated 7 months ago
- ☆39Sep 7, 2025Updated 9 months ago
- ☆49Oct 14, 2024Updated last year
- Copy My Writing is a command-line tool for generating content based on your personal writing style.☆11Oct 12, 2025Updated 8 months ago
- CWTS OpenAlex ETL data pipeline.☆22Oct 29, 2025Updated 8 months ago
- A set of distinct value estimators that give probabilistic bounds on a set's cardinality☆22Dec 9, 2019Updated 6 years ago
- dsxkline supports basic features (scrolling, zooming, swipe pagination, real-time refresh), indicators such as MA, BOLL, VOL, KDJ, MACD, RSI, WR, CCI, BIAS, and PSY, and platforms including web, H5, iOS, Android, Flutter, and C#.☆14Apr 1, 2023Updated 3 years ago
- Public code release for: PosterChild: Blend-Aware Artistic Posterization (EGSR 2021) [Cheng-Kang Ted Chao, Karan Singh, Yotam Gingold]☆12Apr 29, 2024Updated 2 years ago
- LLM that can be trained on 1 or more GPUs for research.☆56May 28, 2026Updated last month