Tlntin / Qwen-TensorRT-LLM
☆623Updated last year
Alternatives and similar repositories for Qwen-TensorRT-LLM
Users who are interested in Qwen-TensorRT-LLM are comparing it to the libraries listed below.
- Accelerate inference without tears☆333Updated 2 weeks ago
- ☆27Updated 11 months ago
- ☆174Updated this week
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆610Updated last year
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆874Updated last week
- llm-export can export llm model to onnx.☆313Updated last month
- Inference code for LLaMA models☆125Updated 2 years ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆265Updated 2 months ago
- export llama to onnx☆136Updated 9 months ago
- ☆503Updated last month
- ☆90Updated 2 years ago
- Best practice for training LLaMA models in Megatron-LM☆659Updated last year
- Optimize QWen1.5 models with TensorRT-LLM☆17Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆137Updated 10 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆358Updated last week
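- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.☆343Updated last year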
- LLM Inference benchmark☆426Updated last year
- ☆63Updated last month
- Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.☆119Updated 2 years ago
- LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)☆436Updated 2 years ago
- OpenLLMWiki: Docs of OpenLLMAI. Survey, reproduction and domain/task adaptation of open source chatgpt alternatives/implementations. PiXi…☆262Updated 10 months ago
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.☆1,367Updated this week
- ☆354Updated last year
- Firefly Chinese LLaMA-2 large model; supports continued pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models☆413Updated last year
- C++ implementation of Qwen-LM☆606Updated 10 months ago
- Transformer related optimization, including BERT, GPT☆39Updated 2 years ago
- Yuan 2.0 Large Language Model☆688Updated last year
- Efficient Training (including pre-training and fine-tuning) for Big Models☆610Updated last month
- Deploy your own OpenAI API 🤩, based on flask and transformers (uses the Baichuan2-13B-Chat-4bits model and can run on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming…☆96Updated last year
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆437Updated 3 weeks ago