Runs LLaMA with Extremely HIGH speed
☆95 Nov 21, 2023 Updated 2 years ago
Alternatives and similar repositories for fast-llama
Users that are interested in fast-llama are comparing it to the libraries listed below.
- A simple library for working with Hugging Face models. ☆14 Dec 30, 2024 Updated last year
- User-space thread implementation ☆20 Mar 16, 2019 Updated 7 years ago
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA. ☆30 Feb 22, 2026 Updated last month
- Computer Papers ☆11 Jun 15, 2025 Updated 9 months ago
- Algorithm implementations in Rust ☆11 May 28, 2019 Updated 6 years ago
- ☆12 Oct 30, 2021 Updated 4 years ago
- Sentence Embedding as a Service ☆15 Jun 30, 2025 Updated 8 months ago
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 Jan 24, 2025 Updated last year
- Faster PyTorch bitsandbytes 4bit fp4 nn.Linear ops ☆30 Mar 16, 2024 Updated 2 years ago
- ☆17 Mar 8, 2020 Updated 6 years ago
- In-memory relational database. ☆11 Nov 28, 2025 Updated 3 months ago
- Learning about CUDA by writing PTX code. ☆157 Feb 27, 2024 Updated 2 years ago
- PDF of the book "In-Depth Understanding of File System Principles and Practice" (《深入理解文件系统原理和实践》), ISBN: 978-7-89381-214-9 ☆27 May 6, 2024 Updated last year
- A novel media player that allows you to navigate by speaker ☆90 Dec 22, 2025 Updated 3 months ago
- Singing voice conversion based on FreeVC ☆21 Dec 16, 2022 Updated 3 years ago
- Context-aware LLM Translator (CALT) ☆51 Jan 8, 2025 Updated last year
- GPT-2 inference engine written in Zig ☆39 Jul 24, 2023 Updated 2 years ago
- Powdered Metal — High performance LLM fine-tuning framework for Apple Silicon ☆133 Updated this week
- Automatically annotates YOLO dataset using Moondream visual model ☆20 Aug 24, 2025 Updated 7 months ago
- Translate word embeddings across models ☆10 Aug 17, 2020 Updated 5 years ago
- Notes on LangChain ☆18 Jan 23, 2024 Updated 2 years ago
- A common protocol for AI agent tools ☆10 Oct 21, 2024 Updated last year
- Yet another frontend for LLM, written using .NET and WinUI 3 ☆10 Sep 14, 2025 Updated 6 months ago
- Multi-platform BASIC compiler supporting Windows, Linux, and macOS; easily extended with C++ ☆21 May 5, 2021 Updated 4 years ago
- Yet another `llama.cpp` Rust wrapper ☆12 Jun 19, 2024 Updated last year
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 Jun 22, 2025 Updated 9 months ago
- A free and open-source GUI tool that simplifies combining multiple code files into one, with automatic labeling and support for various p… ☆14 Jan 3, 2025 Updated last year
- Android auto clicker with Linux and scrcpy ☆14 Feb 22, 2021 Updated 5 years ago
- Code execution runtime for the STAC Overflow: Map Floodwater from Radar Imagery competition ☆12 Sep 29, 2021 Updated 4 years ago
- Simple cron schedule expression editor for Linux Desktop ☆17 Jul 13, 2020 Updated 5 years ago
- ☆15 Feb 28, 2022 Updated 4 years ago
- A free AI text generation interface based on KoboldAI ☆33 Feb 27, 2024 Updated 2 years ago
- ☆14 Feb 4, 2022 Updated 4 years ago
- GFPGAN face reconstruction with ncnn on a bare Raspberry Pi ☆14 Jan 4, 2023 Updated 3 years ago
- Simple agent framework using Ollama tool calling ☆10 Aug 27, 2024 Updated last year
- ☆17 May 15, 2024 Updated last year
- llama.cpp bindings and utilities for Zig ☆70 Apr 14, 2025 Updated 11 months ago
- From-scratch implementation of OpenAI's GPT-OSS model in Python. No Torch, No GPUs. ☆108 Nov 5, 2025 Updated 4 months ago
- A fork of llama3.c used to do some R&D on inferencing ☆22 Dec 20, 2024 Updated last year