A collection of experiments related to LLM inference with llama.cpp/mlx
☆40Apr 11, 2026Updated last week
Alternatives and similar repositories for llama-sandbox
Users interested in llama-sandbox are comparing it to the libraries listed below.
- Go language bindings for the ggwave C++ library☆14Apr 9, 2025Updated last year
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- Testing LLM reasoning abilities with family relationship quizzes.☆63Jan 28, 2025Updated last year
- Stable Diffusion in TensorRT 8.5+☆15Mar 19, 2023Updated 3 years ago
- ☆31Dec 23, 2024Updated last year
- HunyuanDiT with TensorRT and libtorch☆18May 22, 2024Updated last year
- ☆18Dec 7, 2023Updated 2 years ago
- Inference of Mamba, Mamba2 and Mamba3 models in pure C☆200Mar 18, 2026Updated last month
- Recording models☆12Sep 19, 2023Updated 2 years ago
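- Deploys the NanoDet detection algorithm under the OpenVINO inference framework, with rewritten pre-processing and post-processing for very high performance and much faster detection on Intel CPU platforms. The model is also quantized (PTQ) to int8 precision with the NNCF and PPQ tools for even faster inference.☆16Jun 14, 2023Updated 2 years ago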
- Guess the Hacker News titles☆12Mar 24, 2022Updated 4 years ago
- ☆22Apr 10, 2024Updated 2 years ago
- monodepth running in Android by ncnn☆23Oct 12, 2021Updated 4 years ago
- ☆20Dec 29, 2023Updated 2 years ago
- High-level, optionally asynchronous Rust bindings to llama.cpp☆245Jun 5, 2024Updated last year
- Downsampling array of intervals☆26Dec 11, 2019Updated 6 years ago
- Examples of AI models running on boards such as Horizon and Rockchip.☆21Jul 10, 2023Updated 2 years ago
- A live multiplayer trivia game where users can bid for the subject of the next question☆29Jan 9, 2026Updated 3 months ago
- Semantic emoji finder. Python/dash UI. Uses sentence transformer embeddings and duckdb☆19Sep 15, 2025Updated 7 months ago
- LLM-powered lossless compression tool☆307Jan 2, 2026Updated 3 months ago
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces☆142Jul 9, 2024Updated last year
- Llama3 Streaming Chat Sample☆22Apr 24, 2024Updated last year
- segment-anything based mnn☆37Dec 13, 2023Updated 2 years ago
- ☆16Jul 11, 2025Updated 9 months ago
- Gaussian blur for ImGui in Dx12☆33Mar 6, 2025Updated last year
- LightNet is an optimized deep learning framework based on the popular darknet platform. It is optimized to create efficient and high-spee…☆38Sep 17, 2023Updated 2 years ago
- WebAssembly (Wasm) Build and Bindings for llama.cpp☆288Jul 23, 2024Updated last year
- [ICML 2022] Official implementation of "Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems".☆12Jul 19, 2022Updated 3 years ago
- Port of Microsoft's BioGPT in C/C++ using ggml☆86Feb 21, 2024Updated 2 years ago
- A sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated 2 years ago
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp☆45May 16, 2024Updated last year
- A tiny, didactical implementation of LLAMA 3☆42Dec 2, 2024Updated last year
- ☆30Nov 16, 2024Updated last year
- YOLOv12 TensorRT end-to-end model inference acceleration and INT8 quantization implementation☆13Mar 5, 2025Updated last year
- Converts CLIP models to ONNX☆11Jan 17, 2023Updated 3 years ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Nov 11, 2024Updated last year
- Using OpenAI's Whisper via whisper.cpp with SFML☆14Dec 2, 2025Updated 4 months ago
- Web App to transcribe memos using Whisper AI.☆18Oct 23, 2022Updated 3 years ago
- print_f64 implementation purely in assembly without using any 3rd party dependencies including libc, libm, etc.☆12Feb 1, 2021Updated 5 years ago