modelize-ai / LLM-Inference-Deployment-Tutorial
A tutorial for LLM developers covering engine design, service deployment, evaluation and benchmarking, and more. Provides an optimized client/server (C/S) style LLM inference engine.
☆19 Updated last year
Alternatives and similar repositories for LLM-Inference-Deployment-Tutorial:
Users who are interested in LLM-Inference-Deployment-Tutorial are comparing it to the repositories listed below
- Manages vllm-nccl dependency ☆17 Updated 10 months ago
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 Updated last year
- ☆36 Updated 7 months ago
- ☆18 Updated last year
- An Experiment on Dynamic NTK Scaling RoPE ☆63 Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆42 Updated last year
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 Updated 7 months ago
- FuseAI Project ☆85 Updated 3 months ago
- OpenLLMDE: An open source data engineering framework for LLMs ☆17 Updated last year
- The newest version of Llama 3, with source code explained line by line in Chinese ☆22 Updated last year
- Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement" ☆31 Updated last year
- A light proxy solution for HuggingFace hub. ☆46 Updated last year
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆38 Updated 3 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆42 Updated 5 months ago
- AIGC evals ☆10 Updated last year
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2B model. The project includes both model and train… ☆56 Updated last year
- Automatic prompt optimization framework for multi-step agent tasks. ☆29 Updated 5 months ago
- Code for "Scaling Laws of RoPE-based Extrapolation" ☆73 Updated last year
- A paper list of multilingual pre-trained models (continually updated). ☆20 Updated 10 months ago
- Code for Robust Fine-tuning (RbFT) ☆12 Updated 2 months ago
- Open efforts to implement ChatGPT-like models and beyond. ☆107 Updated 9 months ago
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 Updated last year
- Understanding the correlation between different LLM benchmarks ☆29 Updated last year
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Implementations of the online merging optimizers proposed in "Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment" ☆75 Updated 10 months ago
- Simple repository for training small reasoning models ☆12 Updated 2 months ago
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆40 Updated 4 months ago
- ☆28 Updated 2 years ago
- Fast instruction tuning with Llama2 ☆11 Updated last year