Tlntin / Qwen-TensorRT-LLM
☆602 Updated 7 months ago
Alternatives and similar repositories for Qwen-TensorRT-LLM:
Users interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below.
- Accelerate inference without tears ☆309 Updated 2 weeks ago
- Community maintained hardware plugin for vLLM on Ascend ☆370 Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆238 Updated 2 weeks ago
- vLLM inference and deployment demo for Qwen (通义千问) ☆550 Updated last year
- C++ implementation of Qwen-LM ☆582 Updated 3 months ago
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications. ☆670 Updated 2 months ago
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA) ☆403 Updated last year
- Multi-GPU chatglm using deepspeed and … ☆408 Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆132 Updated 3 months ago
- Firefly Chinese LLaMA-2 large model; supports incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models ☆408 Updated last year
- Uses the peft library for efficient 4-bit QLoRA fine-tuning of chatGLM-6B/chatGLM2-6B, including merging the LoRA model with the base model and 4-bit quantization. ☆358 Updated last year
- LLM Inference benchmark ☆405 Updated 8 months ago
- export llama to onnx ☆117 Updated 3 months ago
- A line-by-line annotated version of the Baichuan2 code, suitable for beginners ☆212 Updated last year
- A repository for experimenting with and reproducing the LLM pre-training process. ☆412 Updated 11 months ago
- Best practice for training LLaMA models in Megatron-LM ☆645 Updated last year
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking ☆676 Updated this week
- Welcome to the "LLM-travel" repository! Exploring the mysteries of large language models (LLMs) 🚀, dedicated to deep understanding, discussion, and implementation of the techniques, principles, and applications of large models ☆303 Updated 8 months ago
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud. ☆956 Updated this week
- Deploy your own OpenAI API 🤩, based on flask and transformers (uses the Baichuan2-13B-Chat-4bits model and runs on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming resp… ☆92 Updated last year
- Inference code for LLaMA models ☆118 Updated last year
- llm-export can export LLM models to ONNX. ☆272 Updated 2 months ago
- A purer tokenizer with a higher compression ratio ☆470 Updated 4 months ago
- A HuggingFace-based tool for training and testing large language models. Supports web UI and terminal inference for each model, low-parameter and full-parameter training (pre-training, SFT, RM, PPO, DPO), as well as model merging and quantization ☆214 Updated last year
- A curated collection of open-source SFT datasets, continuously updated ☆501 Updated last year
- Alpaca Chinese Dataset -- a Chinese instruction fine-tuning dataset ☆193 Updated 5 months ago