gf712 / gpt2-cpp
GPT2 implementation in C++ using Ort
☆26 Updated 4 years ago
Alternatives and similar repositories for gpt2-cpp:
Users who are interested in gpt2-cpp are comparing it to the libraries listed below.
- ☆124 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆54 Updated last year
- LLM training in simple, raw C/CUDA ☆92 Updated 10 months ago
- Python bindings for ggml ☆140 Updated 6 months ago
- Experiments with BitNet inference on CPU ☆53 Updated 11 months ago
- minimal C implementation of speculative decoding based on llama2.c ☆19 Updated 7 months ago
- ☆59 Updated 2 years ago
- Tiny C++11 GPT-2 inference implementation from scratch ☆55 Updated 2 months ago
- Inference of Mamba models in pure C ☆186 Updated last year
- Inference RWKV with multiple supported backends. ☆33 Updated this week
- A converter and basic tester for rwkv onnx ☆42 Updated last year
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆44 Updated 2 weeks ago
- asynchronous/distributed speculative evaluation for llama3 ☆38 Updated 7 months ago
- qwen2 and llama3 cpp implementation ☆43 Updated 9 months ago
- RWKV in nanoGPT style ☆187 Updated 9 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆53 Updated 10 months ago
- Fork of llama.cpp, extended for GPT-NeoX, RWKV-v4, and Falcon models ☆29 Updated last year
- fast bpe tokenizer, simple to understand, easy to use ☆25 Updated last year
- Inference Llama 2 in one file of pure C++ ☆83 Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆88 Updated this week
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK ☆55 Updated this week
- Header-only safetensors loader and saver in C++ ☆53 Updated this week
- Universal cross-platform tokenizers binding to HF and sentencepiece ☆310 Updated 2 weeks ago
- Train your own small bitnet model ☆65 Updated 4 months ago
- High-Performance SGEMM on CUDA devices ☆85 Updated last month
- Use safetensors with ONNX 🤗 ☆48 Updated last week
- C++ pipeline with OpenVINO native API for Stable Diffusion v1.5 ☆13 Updated last year
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆260 Updated 11 months ago
- tinygrad port of the RWKV large language model. ☆44 Updated this week