LLM Inference Engine: High-performance CUDA-accelerated framework for large language model inference. A cutting-edge, open-source implementation of a large language model (LLM) inference engine, optimized for consumer-grade hardware. This project showcases advanced techniques in GPU acceleration, memory management, and algorithmic optimizations.
☆11Sep 29, 2024Updated last year
Alternatives and similar repositories for LlamaInfer
Users that are interested in LlamaInfer are comparing it to the libraries listed below
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Nov 11, 2024Updated last year
- A Voxel Editor in C++☆10Oct 27, 2020Updated 5 years ago
- Using gemini AI and VLMs to control a robot in simulation☆13Jan 3, 2025Updated last year
- Repository for go shared libraries (for now).☆11Dec 1, 2025Updated 3 months ago
- This is the combined collection of the course notes for some of the computer science classes at CMU released online.☆65Jan 20, 2025Updated last year
- QUIC pluggable crypto to use the protocol as plaintext (for use when cryptography is already handled at another layer, e.g. Wireguard)☆10Aug 27, 2025Updated 6 months ago
- A microservice to encapsulate our inventory management functionality☆13Oct 25, 2023Updated 2 years ago
- A collection of foundational projects related to large models: Pretrain, Posttrain, RAG, Agent, and more☆27Dec 7, 2025Updated 2 months ago
- Unity 3D C# script that generates a mesh to correct lens distortion in real-time on the GPU. Fast and configurable. It is for show video …☆15Feb 12, 2016Updated 10 years ago
- ☆11Oct 2, 2025Updated 5 months ago
- Implementation of an intersection type system in Elixir. This was a prototype that was never completed nor released.☆10Nov 10, 2020Updated 5 years ago
- ☆13Feb 2, 2025Updated last year
- ☆10Aug 9, 2021Updated 4 years ago
- Companion code to my ElixirConf 2021 talk☆16Oct 15, 2021Updated 4 years ago
- ☆12Nov 17, 2024Updated last year
- A Rust implementation of the OCI artifact specification for WebAssembly☆11Feb 5, 2026Updated 3 weeks ago
- SQL migration tool☆12Oct 11, 2025Updated 4 months ago
- A Kubernetes operator for managing Prefect servers and work pools☆17Feb 24, 2026Updated last week
- A VPN written in Rust☆13Apr 17, 2025Updated 10 months ago
- The last-write-wins register CRDT☆17Nov 10, 2024Updated last year
- Provides Lock and RwLock synchronization primitives.☆15Dec 9, 2024Updated last year
- Examples showing mechanisms of `Maybe` monad and `do-notation`.☆14Jun 29, 2023Updated 2 years ago
- An experiment, a playground, a sandbox, a toy — LLMs judging code.☆10Jan 28, 2025Updated last year
- Typed functional programming with Fauna DB☆12Mar 31, 2021Updated 4 years ago
- ☆11Jul 24, 2024Updated last year
- Aggregated tips from the "Oban Tips" Twitter series☆12Jun 27, 2023Updated 2 years ago
- A nix filter with straightforward include semantics☆12Aug 31, 2023Updated 2 years ago
- Bash interpreter written in pure Elixir. Execute shell scripts from Elixir with compile-time validation, persistent sessions, and the ab…☆32Jan 30, 2026Updated last month
- A bunch of shell scripts and small rust programs for my personal use☆12Feb 20, 2026Updated last week
- AWS Route53 provider implementation for libdns☆16Nov 21, 2025Updated 3 months ago
- An implementation of a gRPC server and client in Erlang.☆11Apr 25, 2025Updated 10 months ago
- Go tool to connect to websocket for sending and receiving messages and load testing.☆10Dec 20, 2025Updated 2 months ago
- WIP: generate protobuf package from .proto files, web service and CLI client☆13Mar 3, 2019Updated 6 years ago
- Vector Search Benchmarking suite☆12Updated this week
- An LLM inference engine, written in C++☆18Feb 5, 2026Updated 3 weeks ago
- ☆12Aug 19, 2024Updated last year
- Radial lens undistortion filtering in WebGL☆12Mar 31, 2016Updated 9 years ago
- A Golang application that demonstrates how to monitor a Golang service using Prometheus and Grafana. This is for Docker's official Deno L…☆15Mar 22, 2025Updated 11 months ago
- MagickCache is a secure, high-performance caching tool for images, videos, audio, and metadata. It uses memory mapping for fast access, s…☆18Feb 23, 2026Updated last week