Rust+OpenCL+AVX2 implementation of LLaMA inference code
☆554 · Feb 12, 2024 · Updated 2 years ago
Alternatives and similar repositories for rllama
Users who are interested in rllama are comparing it to the libraries listed below.
- [Unmaintained, see README] An ecosystem of Rust libraries for working with large language models ☆6,154 · Jun 24, 2024 · Updated 2 years ago
- A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32, 8 and 4 bit eva… ☆94 · Sep 2, 2023 · Updated 2 years ago
- Deep learning in Rust, with shape checked tensors and neural networks ☆1,911 · Jul 23, 2024 · Updated last year
- A fast llama2 decoder in pure Rust. ☆1,064 · Nov 30, 2023 · Updated 2 years ago
- Bleeding edge low level Rust binding for GGML ☆17 · Jun 26, 2024 · Updated 2 years ago
- `llm-chain` is a powerful rust crate for building chains in large language models allowing you to summarise text and complete complex tas… ☆1,601 · Oct 31, 2024 · Updated last year
- An implementation of the diffusers api in Rust ☆591 · Apr 4, 2024 · Updated 2 years ago
- Inference Llama 2 in one file of pure Rust 🦀 ☆235 · Sep 11, 2023 · Updated 2 years ago
- LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed! ☆113 · Jul 27, 2023 · Updated 2 years ago
- Rust bindings for the C++ api of PyTorch. ☆5,433 · May 17, 2026 · Updated last month
- ☆58 · Apr 6, 2023 · Updated 3 years ago
- Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability. ☆15,487 · Jun 26, 2026 · Updated last week
- LLama.cpp rust bindings ☆422 · Jun 27, 2024 · Updated 2 years ago
- Llama2 LLM ported to Rust burn ☆280 · Apr 16, 2024 · Updated 2 years ago
- Minimalist ML framework for Rust ☆20,562 · Jun 26, 2026 · Updated last week
- Run LLaMA inference on CPU, with Rust 🦀🚀🦙 ☆35 · Jan 5, 2025 · Updated last year
- Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...) ☆3,068 · Jan 13, 2026 · Updated 5 months ago
- A WebGPU-accelerated ONNX inference run-time written 100% in Rust, ready for native and the web ☆1,753 · Jul 21, 2024 · Updated last year
- Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference ☆2,974 · Updated this week
- Fast, flexible LLM inference ☆7,410 · Updated this week
- Rust bindings to https://github.com/ggerganov/whisper.cpp ☆945 · Jul 30, 2025 · Updated 11 months ago
- Linear algebra foundation for the Rust programming language ☆2,551 · Jun 24, 2026 · Updated last week
- Pure Rust implementation of a minimal Generative Pretrained Transformer ☆934 · Oct 21, 2025 · Updated 8 months ago
- High-level, optionally asynchronous Rust bindings to llama.cpp ☆246 · Jun 5, 2024 · Updated 2 years ago
- ☆28 · Aug 10, 2023 · Updated 2 years ago
- Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust. ☆5,254 · Apr 29, 2026 · Updated 2 months ago
- Safe rust wrapper around CUDA toolkit ☆1,170 · Jun 19, 2026 · Updated 2 weeks ago
- Tensors and dynamic neural networks in pure Rust. ☆1,086 · Oct 10, 2022 · Updated 3 years ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆87 · Feb 21, 2024 · Updated 2 years ago
- A Discord bot, written in Rust, that generates responses using the LLaMA language model. ☆94 · Aug 13, 2023 · Updated 2 years ago
- ndarray: an N-dimensional array with array views, multidimensional slicing, and efficient operations ☆4,293 · Jun 19, 2026 · Updated 2 weeks ago
- A Rust machine learning framework. ☆4,693 · May 30, 2026 · Updated last month
- Tensor library for machine learning ☆14,935 · Jun 26, 2026 · Updated last week
- ☆13 · Aug 29, 2022 · Updated 3 years ago
- Instant, controllable, local pre-trained AI models in Rust ☆2,207 · Jun 24, 2026 · Updated last week
- Fast ML inference & training for ONNX models in Rust ☆2,376 · Updated this week
- ggml implementation of BERT ☆501 · Feb 23, 2024 · Updated 2 years ago
- A Rust implementation of the Khronos OpenCL 3.0 API. ☆134 · Mar 8, 2026 · Updated 3 months ago
- SoTA Transformers with C-backend for fast inference on your CPU. ☆312 · Dec 9, 2023 · Updated 2 years ago