High-efficiency LLM inference engine in C++/CUDA. Run Llama 70B on RTX 3090.
☆462Feb 22, 2026Updated 3 months ago
Alternatives and similar repositories for ntransformer
Users that are interested in ntransformer are comparing it to the libraries listed below.
- ☆19Jul 12, 2025Updated 11 months ago
- Rust implementation of the Zstandard Seekable Format☆270Jun 3, 2026Updated 2 weeks ago
- tiny torch, but close to metal☆130Dec 29, 2025Updated 5 months ago
- ⚡ Production-grade RAG chunking engine powered by Rust. Process GBs of CSV, PDF, JSON, JSONL, DOCX, XLSX, URLs, ETC., in seconds with O(1…☆48Feb 28, 2026Updated 3 months ago
- Automatically exported from code.google.com/p/smhasher☆16Mar 10, 2021Updated 5 years ago
- Minimal implementation of a Byte Pair Encoding (BPE) tokenizer in Zig☆14Apr 7, 2025Updated last year
- How to build apps for ChatGPT?☆24Oct 28, 2025Updated 7 months ago
- Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and in…☆18Nov 11, 2024Updated last year
- ☆127May 17, 2026Updated last month
- Server for Matching Long/Lat to Timezone☆47Feb 21, 2026Updated 3 months ago
- Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs☆45Jun 10, 2026Updated last week
- Outrageous Voice Assistant - Fully local end-to-end ASR + LLM + TTS pipeline using open weight models and a simple web based UI☆145Mar 12, 2026Updated 3 months ago
- DiscoGrad - automatically differentiate across conditional branches in C++ programs☆210Sep 12, 2024Updated last year
- new optimizer☆20Aug 4, 2024Updated last year
- An AI coding skill that transforms project planning into visual, interactive HTML decision documents☆53Apr 9, 2026Updated 2 months ago
- A crate built on top of `axum-sessions`, implementing the CSRF Synchronizer Token Pattern☆15Updated this week
- An interactive visual explainer for Andrej Karpathy's microGPT☆41Mar 12, 2026Updated 3 months ago
- Docker-based inference engine for AMD GPUs☆233Oct 7, 2024Updated last year
- Python helper to send proper HDMI-CEC frame to wake up AVR☆28Nov 17, 2025Updated 7 months ago
- The typed graph between your code and whichever warehouse, table format, or query engine you've chosen — typed compiler, branches, replay…☆265Updated this week
- Converts between country names, ISO 3166-1 codes, and Unicode flag emojis.☆17May 24, 2026Updated 3 weeks ago
- A hassle-free utility to encrypt error handling strings in public binaries to protect business logic☆26Apr 29, 2022Updated 4 years ago
- ☆16Apr 20, 2024Updated 2 years ago
- dotfiles: karabiner, starship, tmux, skhd, yabai, alacritty☆17May 29, 2026Updated 3 weeks ago
- 💭 Chat with AI via API☆33Oct 20, 2024Updated last year
- Use Civet in any project.☆15Jun 20, 2024Updated last year
- Open-source CUDA, Triton and HIP compiler targeting multiple GPU and CPU architectures.☆1,697Updated this week
- https://ansi.tools☆38May 17, 2026Updated last month
- A simple app for downloading YouTube Shorts transcripts. Built to self-host with Python and Streamlit. Free and open source.☆32Dec 4, 2024Updated last year
- ☆82Mar 21, 2026Updated 2 months ago
- ☆252Mar 20, 2024Updated 2 years ago
- Inspired by Midnight Commander, tailored to my taste.☆55Apr 22, 2026Updated last month
- Read-through cache for object storage☆591Jun 10, 2026Updated last week
- Ideas, concepts, tools and examples of sketch programming☆23Dec 20, 2025Updated 5 months ago
- Filesystem 'at' implementations for Unix and Windows☆13May 28, 2026Updated 3 weeks ago
- High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized NumPy for deterministic, scalable generati…☆59Updated this week
- Ruqe brings the convenient types and methods found in Rust into Dart, such as the Result, Option, pattern-matching, etc.☆13Sep 13, 2023Updated 2 years ago
- linux container runtime built with zig☆23Feb 15, 2025Updated last year
- A library for building dynamic terminal apps, using bonsai☆198May 18, 2026Updated last month