hipudding / llama.cpp
LLM inference in C/C++
☆11Updated this week
Alternatives and similar repositories for llama.cpp
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆274Updated 6 months ago
- ☆523Updated 2 weeks ago
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆1,037Updated last week
- FlagScale is a large model toolkit based on open-sourced projects.☆471Updated last week
- ☆437Updated 4 months ago
- LLM Inference benchmark☆433Updated last year
- ☆183Updated last week
- Inference code for LLaMA models☆128Updated 2 years ago
- Transformer related optimization, including BERT, GPT☆59Updated 2 years ago
- A line-by-line annotated version of the Baichuan2 code, suitable for beginners☆213Updated 2 years ago
- Accelerate inference without tears☆372Updated 2 weeks ago
- ☆55Updated last year
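- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.☆368Updated last year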
- Transformer related optimization, including BERT, GPT☆39Updated 2 years ago
- ☆84Updated 2 years ago
- LLM inference service performance testing☆44Updated 2 years ago
- A flexible and efficient training framework for large-scale alignment tasks☆447Updated 3 months ago
- Optimize QWen1.5 models with TensorRT-LLM☆17Updated last year
- export llama to onnx☆137Updated last year
- Chinese version of llm-numbers☆130Updated 2 years ago
- ☆130Updated last year
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆638Updated last year
- How to train an LLM tokenizer☆153Updated 2 years ago
- Community maintained hardware plugin for vLLM on Ascend☆1,618Updated last week
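- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute☆733Updated 2 years ago
- High-performance text tokenizer library☆32Updated 2 years ago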
- ☆79Updated 2 years ago
- ☆22Updated 2 years ago
- Best practice for training LLaMA models in Megatron-LM☆664Updated 2 years ago
- ☆624Updated last year