Franc-Z / QWen1.5_TensorRT-LLM
Optimize QWen1.5 models with TensorRT-LLM
☆16Updated 4 months ago
Related projects:
- ☆90Updated last year
- ☆23Updated 3 months ago
- llm-export can export LLM models to ONNX.☆193Updated this week
- ☆49Updated last year
- text embedding☆133Updated last year
- Transformer related optimization, including BERT, GPT☆39Updated last year
- Simple Dynamic Batching Inference☆145Updated 2 years ago
- Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.☆107Updated last year
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking☆167Updated this week
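- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the technologies, principles, and applications related to large models.☆246Updated 2 months ago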
- Transformer related optimization, including BERT, GPT☆58Updated last year
- Compare multiple optimization methods on Triton to improve model service performance☆46Updated 8 months ago
- Instruction-tuning tool for large language models (supports FlashAttention)☆162Updated 8 months ago
- ☆251Updated last week
- Inference code for LLaMA models☆101Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆120Updated 9 months ago
- Train a Chinese LLM from scratch as an individual☆145Updated last week
- A curated collection of Chinese books☆169Updated 8 months ago
- ☆58Updated last year
- ☆82Updated last year
- ☆572Updated last month
- ☆131Updated last week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆512Updated last week
- ☆123Updated 3 months ago
- code for piccolo embedding model from SenseTime☆93Updated 4 months ago
- ☆70Updated 9 months ago
- Chinese instruction-tuning datasets☆112Updated 5 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆129Updated last week
- Export LLaMA to ONNX☆91Updated 3 months ago
- Firefly Chinese LLaMA-2 large model; supports continued pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models☆396Updated 11 months ago