Tlntin / Qwen-TensorRT-LLM
☆622Updated last year
Alternatives and similar repositories for Qwen-TensorRT-LLM
Users that are interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below
- Accelerate inference without tears☆322Updated 5 months ago
- ☆27Updated 9 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆263Updated 3 weeks ago
- ☆172Updated this week
- ☆90Updated 2 years ago
- llm-export can export LLM models to ONNX.☆304Updated 7 months ago
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)☆431Updated last year
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆840Updated last month
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆596Updated last year
- ☆485Updated 3 weeks ago
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.☆1,316Updated this week
- Best practice for training LLaMA models in Megatron-LM☆660Updated last year
- A line-by-line annotated walkthrough of the Baichuan2 code, suitable for beginners☆214Updated last year
- export llama to onnx☆132Updated 8 months ago
- chatglm on multiple GPUs using deepspeed and☆410Updated last year
- LLM deployment project based on MNN. This project has been merged into MNN.☆1,596Updated 7 months ago
- OpenLLMWiki: Docs of OpenLLMAI. Survey, reproduction and domain/task adaptation of open source chatgpt alternatives/implementations. PiXi…☆261Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆137Updated 8 months ago
- CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP library based on HuggingFace Transformer. Please hugging for NL…☆389Updated last year
- Inference code for LLaMA models☆122Updated 2 years ago
- C++ implementation of Qwen-LM☆609Updated 8 months ago
- Optimize QWen1.5 models with TensorRT-LLM☆17Updated last year
- Yuan 2.0 Large Language Model☆689Updated last year
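- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the technologies, principles, and applications behind large models.☆338Updated last year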
- Deploy your own OpenAI API 🤩, based on flask and transformers (uses the Baichuan2-13B-Chat-4bits model and can run on a single Tesla T4 GPU). Implements the OpenAI Chat, Models, and Completions interfaces, including streaming resp…☆95Updated last year
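- A large language model training and testing tool built on HuggingFace. Supports web UI and terminal inference for various models, low-parameter and full-parameter training (pre-training, SFT, RM, PPO, DPO), as well as merging and quantization.☆218Updated last year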
- a lightweight LLM model inference framework☆738Updated last year
- LLM Inference benchmark☆426Updated last year
- Firefly Chinese LLaMA-2 large model; supports incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models☆413Updated last year
- FlagScale is a large model toolkit based on open-sourced projects.☆347Updated last week