AllenJWZhu / LlamaInfer
LLM Inference Engine: a high-performance, CUDA-accelerated framework for large language model (LLM) inference. A cutting-edge, open-source implementation of an LLM inference engine optimized for consumer-grade hardware, this project showcases advanced techniques in GPU acceleration, memory management, and algorithmic optimization.
☆11 · Updated last year
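As a rough illustration of the kind of CUDA kernel such an engine typically relies on for batch-1 token generation (this is not code from the LlamaInfer repository; the kernel name, signature, and launch shape are illustrative assumptions), a warp-per-row FP32 matrix-vector multiply might look like the sketch below:

```cuda
// Illustrative sketch only: a warp-per-row FP32 matrix-vector kernel of the
// kind an LLM decoding engine might use for batch-1 token generation.
// All names are hypothetical and not taken from the LlamaInfer sources.
#include <cuda_runtime.h>

__global__ void gemv_warp_per_row(const float* __restrict__ W,  // [rows, cols], row-major
                                  const float* __restrict__ x,  // [cols]
                                  float* __restrict__ y,        // [rows]
                                  int rows, int cols) {
    // One warp computes one output row.
    int row  = blockIdx.x * (blockDim.x / 32) + threadIdx.x / 32;
    int lane = threadIdx.x % 32;
    if (row >= rows) return;

    // Each lane accumulates a strided partial dot product over the row.
    float acc = 0.0f;
    for (int c = lane; c < cols; c += 32)
        acc += W[(size_t)row * cols + c] * x[c];

    // Warp-level tree reduction of the 32 partial sums.
    for (int offset = 16; offset > 0; offset >>= 1)
        acc += __shfl_down_sync(0xffffffffu, acc, offset);

    if (lane == 0) y[row] = acc;
}

// Example launch with 4 warps (128 threads) per block:
//   gemv_warp_per_row<<<(rows + 3) / 4, 128>>>(W, x, y, rows, cols);
```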
Alternatives and similar repositories for LlamaInfer
Users that are interested in LlamaInfer are comparing it to the libraries listed below
- Efficient and Scalable Estimation of Tool Representations in Vector Space ☆29 · Updated last year
- Repository for Go shared libraries (for now). ☆11 · Updated 2 months ago
- CUDA extensions for PyTorch ☆12 · Updated 2 months ago
- Pure C inference for the GTE Small embedding model ☆101 · Updated 3 weeks ago
- Example using echo conversational agent server ☆14 · Updated last year
- Proof of concept minimal Linux distribution written in Lua and built with Bazel. Bazel enables one command to download Lua, compile Lua b… ☆17 · Updated last month
- Memory map objects in S3 ☆23 · Updated 6 years ago
- Asynchronous/distributed speculative evaluation for llama3 ☆40 · Updated last year
- The last-write-wins register CRDT ☆16 · Updated last year
- Notes, config, tools, etc. for kicking the tires on CockroachDB ☆11 · Updated 10 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 · Updated last year
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆417 · Updated last month
- Toy distributed PostgreSQL by implementing SQL over KV ☆11 · Updated 3 weeks ago
- First token cutoff sampling inference example ☆30 · Updated 2 years ago
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆53 · Updated 10 months ago
- Thin wrapper around GGML to make life easier ☆42 · Updated 3 months ago
- Multiplexer for AI protocols, agents, models, and tools ☆24 · Updated 6 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆203 · Updated 4 months ago
- A massively parallel, optimal functional runtime in Rust ☆31 · Updated last year
- Iterate quickly with llama.cpp hot reloading. Use the llama.cpp bindings with bun.sh ☆50 · Updated 2 years ago
- Fun with wgpu: Simulating slime mold ☆24 · Updated last year
- mHC kernels implemented in CUDA ☆249 · Updated 3 weeks ago
- Inference of RWKV v7 in pure C. ☆44 · Updated 4 months ago
- 🛰 Uplink cluster management for Instellar ☆13 · Updated last year
- Inference of Mamba and Mamba2 models in pure C ☆196 · Updated 2 weeks ago
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ☆45 · Updated last year
- High-Performance FP32 GEMM on CUDA devices ☆117 · Updated last year
- An LLM inference engine, written in C++ ☆18 · Updated this week
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Updated last year
- Because it's there. ☆16 · Updated last year