wangkuiyi / huggingface-tokenizer-in-cxx
☆70Updated 2 years ago
Alternatives and similar repositories for huggingface-tokenizer-in-cxx
Users who are interested in huggingface-tokenizer-in-cxx are comparing it to the libraries listed below
- ☆125Updated last year
- Universal cross-platform tokenizers binding to HF and sentencepiece☆426Updated 3 months ago
- qwen2 and llama3 cpp implementation☆48Updated last year
- LLaMa/RWKV onnx models, quantization and testcase☆368Updated 2 years ago
- LLM deploy project based on ONNX.☆47Updated last year
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily.☆183Updated 8 months ago
- transformer tokenizers (e.g. BERT tokenizer) in C++ (WIP)☆18Updated 3 years ago
- Whisper in TensorRT-LLM☆16Updated 2 years ago
- Transformer related optimization, including BERT, GPT☆17Updated 2 years ago
- The Triton backend for the ONNX Runtime.☆168Updated this week
- The Triton backend for TensorRT.☆79Updated 3 weeks ago
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK☆88Updated last month
- Large Language Model Onnx Inference Framework☆36Updated last week
- A Toolkit to Help Optimize Onnx Model☆256Updated last week
- A Toolkit to Help Optimize Large Onnx Model☆162Updated last month
- Easy and Efficient Quantization for Transformers☆203Updated 5 months ago
- ☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM☆50Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆16Updated last year
- Model compression for ONNX☆99Updated last year
- ☆205Updated 7 months ago
- ☆25Updated 2 years ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆300Updated last year
- Common source, scripts and utilities shared across all Triton repositories.☆77Updated this week
- simplify >2GB large onnx model☆69Updated last year
- ☆120Updated last year
- OpenAI compatible API for TensorRT LLM triton backend☆218Updated last year
- ggml implementation of embedding models including SentenceTransformer and BGE☆63Updated last year
- ☆78Updated last year
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc).☆42Updated this week
- For learning GOT/Qwen/OnnxLLm☆53Updated last year