modelize-ai / LLM-Inference-Deployment-Tutorial
A tutorial for LLM developers covering engine design, service deployment, and evaluation/benchmarking. Provides an optimized client/server (C/S) style LLM inference engine.
☆19Updated last year
Alternatives and similar repositories for LLM-Inference-Deployment-Tutorial
Users who are interested in LLM-Inference-Deployment-Tutorial are comparing it to the libraries listed below
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆39Updated last year
- ☆36Updated 9 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆57Updated last year
- OpenLLMDE: An open source data engineering framework for LLMs☆17Updated last year
- Manages vllm-nccl dependency☆17Updated last year
- FuseAI Project☆87Updated 5 months ago
- ☆33Updated this week
- code for Scaling Laws of RoPE-based Extrapolation☆73Updated last year
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆20Updated last month
- An Experiment on Dynamic NTK Scaling RoPE☆64Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆43Updated last year
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…☆18Updated 10 months ago
- ☆19Updated last year
- ☆36Updated 2 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models☆58Updated last year
- Fast instruction tuning with Llama2☆11Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆40Updated last year
- Self-host LLMs with LMDeploy and BentoML☆20Updated 2 weeks ago
- [ACL 2025] An official pytorch implement of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆30Updated last month
- LMTuner: Make the LLM Better for Everyone☆35Updated last year
- ☆20Updated 7 months ago
- Minimal zero-shot intent classifier for arbitrary intent slot filling, via LLM prompting w LangChain.☆33Updated 2 years ago
- A small framework mimics PyTorch using CuPy or NumPy☆37Updated 3 years ago
- ☆105Updated last year
- ☆16Updated 11 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32Updated last year
- ☆94Updated 6 months ago
- Data preparation code for CrystalCoder 7B LLM☆45Updated last year
- A Python implementation of Toolformer using Huggingface Transformers☆14Updated 2 years ago
- A Chinese-native industrial evaluation benchmark☆13Updated last year