FareedKhan-dev/llm-scale-deploy-guide

An end-to-end pipeline to optimize and host an LLM for 100K parallel queries

☆37

Alternatives and similar repositories for llm-scale-deploy-guide

Users that are interested in llm-scale-deploy-guide are comparing it to the libraries listed below.

FareedKhan-dev / AI-outlier-detection
Outlier Detection with AI + ML
☆15 • Sep 12, 2025 • Updated 10 months ago
FareedKhan-dev / DeepSeek-R1-from-scratch
A straightforward explanation of how DeepSeek R1 works
☆18 • Feb 7, 2025 • Updated last year
FareedKhan-dev / big-data-with-KG
Handling Big Data with Knowledge Graph: A Detailed Guide
☆29 • May 11, 2025 • Updated last year
FareedKhan-dev / advance-contextual-engineering
Contextual Engineering Pipeline
☆22 • Mar 5, 2026 • Updated 4 months ago
FareedKhan-dev / rag-with-raptor
A step-by-step implementation of RAPTOR-based RAG
☆42 • Sep 1, 2025 • Updated 10 months ago
DeepHiveMind / Generative-AI
This repo contains industry use cases and application code for GenAI
☆14 • Oct 25, 2024 • Updated last year
FareedKhan-dev / ai-agents-eval-techniques
Implementation of 12 AI agents evaluation techniques
☆46 • Jul 31, 2025 • Updated 11 months ago
FareedKhan-dev / agentic-whatsapp-ai
Building an advanced Agentic eCommerce WhatsApp bot
☆22 • Sep 12, 2025 • Updated 10 months ago
HaohanZou / CoNSAL
Official implementation of CoNSAL for analytical Lyapunov function discovery
☆12 • Jun 26, 2024 • Updated 2 years ago
junekihong / beam-span-parser
A DP beam-search extension of Mitchell Stern's span-based neural constituency parser
☆11 • Aug 24, 2022 • Updated 3 years ago
FareedKhan-dev / langgraph-long-memory
A detailed implementation of handling long-term memory in Agentic AI
☆55 • Oct 9, 2025 • Updated 9 months ago
AMD-AI-HACKATHON / AI-Scheduling-Assistant
☆11 • Sep 8, 2025 • Updated 10 months ago
gabeguo / any-order-speculative-decoding
Reviving Any-Order Autoregressive Models via Principled Parallel Sampling and Speculative Decoding
☆16 • Nov 16, 2025 • Updated 8 months ago
abaisero / gym-pomdps
☆10 • Apr 13, 2023 • Updated 3 years ago
FareedKhan-dev / saas-payment-guide
A Beginner's Guide to Monetizing Your Python AI Chatbot
☆17 • Apr 22, 2025 • Updated last year
FareedKhan-dev / contextual-engineering-guide
Implementation of contextual engineering pipeline with LangChain and LangGraph Agents
☆94 • Jul 29, 2025 • Updated 11 months ago
FareedKhan-dev / temporal-ai-agent-pipeline
Optimizing Dynamic Knowledge Base Using AI Agent
☆92 • Aug 13, 2025 • Updated 11 months ago
Relaxed-System-Lab / UltraLLaDA
We introduce UltraLLaDA, a scaled variant of LLaDA-8B-Base that extends the context length up to 128K tokens with light-weight post-trai…
☆15 • Oct 23, 2025 • Updated 9 months ago
thakur-nandan / beir-ColBERT
Evaluation of BEIR Datasets using ColBERT retrieval model
☆18 • Mar 4, 2022 • Updated 4 years ago
FareedKhan-dev / complex-RAG-guide
A step by step implementation of a complex RAG pipeline to solve real world situations
☆492 • Jul 1, 2025 • Updated last year
hkproj / vae-from-scratch-notes
Notes about the video on the Variational Autoencoder
☆14 • Jun 7, 2023 • Updated 3 years ago
jh-chung1 / GNN_ElasticModulus_Prediction
Application of Graph Neural Networks to predict material properties from their microstructures.
☆21 • Nov 18, 2024 • Updated last year
FareedKhan-dev / optimize-ai-agent-memory
9 Different Ways to Optimize AI Agent Memories
☆336 • Jul 12, 2025 • Updated last year
FareedKhan-dev / deep-research-agent
Deep research agentic system using Test-Time Diffusion
☆47 • Dec 11, 2025 • Updated 7 months ago
FareedKhan-dev / Multi-Agent-AI-System
Building a Multi-Agent AI System with LangGraph and LangSmith
☆364 • May 31, 2025 • Updated last year
rmartinshort / research_assist
Tool that uses Tavily and LangGraph to conduct research, then G Suite APIs to organize it
☆20 • Nov 17, 2024 • Updated last year
FareedKhan-dev / gpt4o-from-scratch
Implementation of a GPT-4o-like Multimodal Model from Scratch using Python
☆78 • Apr 4, 2025 • Updated last year
interpolants / forecasting
Code for the paper: Probabilistic Forecasting with Stochastic Interpolants and Follmer Processes (generative AI for forecasting)
☆20 • May 17, 2026 • Updated 2 months ago
aio-libs / aiohttp-apischema
Generate a schema and validate user input from types
☆12 • Jul 13, 2026 • Updated last week
pierrel55 / llama_st
Load and run Llama from safetensors files in C
☆15 • Oct 24, 2024 • Updated last year
alexander-moore / vlm
Composition of Multimodal Language Models From Scratch
☆15 • Aug 16, 2024 • Updated last year
spthm / asyncpio
An asynchronous Python client for pigpio.
☆12 • Jul 15, 2023 • Updated 3 years ago
Kamichanw / CoS
[ICML'25] Official code of paper "Fast Large Language Model Collaborative Decoding via Speculation"
☆30 • Jun 23, 2025 • Updated last year
BraiNEdarwin / SkyNEt
Python measurement platform for the NanoElectronics group
☆10 • Mar 4, 2021 • Updated 5 years ago
jgbos / phased-array-radar
Phased Array Radar
☆13 • Aug 15, 2012 • Updated 13 years ago
dtunnicliffe / fetal-health-classification
Phase 3 project for Data Science program at Flatiron School. Predicting fetal health outcomes using CTG data. Testing various classificat…
☆20 • Feb 9, 2021 • Updated 5 years ago
SoftSimu / SymPhas
Repository for the SymPhas software for phase-field simulations
☆29 • Jul 3, 2026 • Updated 2 weeks ago
MitchRatquest / tinyNetboot
smallest netboot server
☆12 • Jan 5, 2018 • Updated 8 years ago
vaibhavkarve / leanteach2020
Formalizing geometry in Lean: IGL/UniHigh Summer 2020 research project
☆32 • Jan 17, 2022 • Updated 4 years ago