ironartisan / awesome-compression1
A beginner's introductory tutorial on model compression
☆22 Updated last year
Alternatives and similar repositories for awesome-compression1
Users who are interested in awesome-compression1 are comparing it to the libraries listed below.
- As the name suggests: a hand-built RAG ☆131 Updated last year
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆40 Updated 2 years ago
- ☆135 Updated 11 months ago
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM ☆51 Updated 2 years ago
- LLM101n: Let's build a Storyteller, Chinese edition ☆138 Updated last year
- The newest version of llama3, with source code explained line by line in Chinese ☆22 Updated last year
- Transformer related optimization, including BERT, GPT ☆17 Updated 2 years ago
- Tutorial for Ray ☆36 Updated last year
- Music large model based on InternLM2-chat. ☆23 Updated last year
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆61 Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆68 Updated 2 years ago
- The simplest online-softmax notebook for explaining Flash Attention ☆13 Updated last year
- ☆120 Updated 2 years ago
- ☆30 Updated 6 months ago
- GLM Series Edge Models ☆156 Updated 7 months ago
- A MoE impl for PyTorch, [ATC'23] SmartMoE ☆71 Updated 2 years ago
- Large model deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM ☆26 Updated last year
- ☆106 Updated 2 years ago
- Run ChatGLM2-6B on BM1684X ☆49 Updated last year