viitrix / vt-transformer
Transformer framework for edge computing based on C++.
☆124 Updated 4 months ago
Alternatives and similar repositories for vt-transformer:
Users who are interested in vt-transformer are comparing it to the repositories listed below:
- OpenLLaMA-Chinese, permissively licensed open-source instruction-following models based on OpenLLaMA ☆66 Updated last year
- Run ChatGLM2-6B on BM1684X ☆49 Updated last year
- ☆107 Updated 11 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆238 Updated 2 weeks ago
- C++ implementation of Qwen2 and Llama 3 ☆43 Updated 9 months ago
- ☆39 Updated 4 months ago
- llama2.c-zh, a small language model that supports Chinese-language scenarios ☆145 Updated last year
- A Toolkit for Running On-device Large Language Models (LLMs) in APP ☆67 Updated 8 months ago
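- A pure C++ cross-platform LLM acceleration library, callable from Python; supports Baichuan, GLM, LLaMA, and MOSS base models; runs smoothly on mobile phones; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single card ☆45 Updated last year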
- C++ implementation of Qwen-LM ☆582 Updated 3 months ago
- ☆218 Updated last month
- ☆602 Updated 7 months ago
- ☆217 Updated last year
- GLM Series Edge Models ☆132 Updated last month
- ☆310 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆132 Updated 3 months ago
- Explore LLM model deployment based on AXera's AI chips ☆86 Updated last week
- ☆314 Updated 9 months ago
- LLM101n: Let's build a Storyteller (Chinese edition) ☆130 Updated 7 months ago
- ☆78 Updated 10 months ago
- Serving Inside Pytorch ☆157 Updated this week
- TianMu: A modern AI tool with multi-platform support, markdown support, multimodal, continuous conversation, and customizable commands. … ☆83 Updated last year
- Phi3 Chinese repository ☆320 Updated 4 months ago
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone ☆291 Updated last week
- Efficient AI Inference & Serving ☆468 Updated last year
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities. ☆37 Updated 3 months ago
- An integrated user interface for use with HAI Platform ☆47 Updated last year
- ☆90 Updated last year
- ☆27 Updated 4 months ago
- An easy-to-use framework for modular RAG ☆341 Updated this week