asynchronous/distributed speculative evaluation for llama3
☆40Aug 8, 2024Updated last year
Alternatives and similar repositories for llama_duo
Users who are interested in llama_duo are comparing it to the libraries listed below
- Go language bindings for the ggwave C++ library☆14Apr 9, 2025Updated 11 months ago
- ☆30Dec 23, 2024Updated last year
- ffmpeg+cuvid+tensorrt+multicamera☆12Dec 31, 2024Updated last year
- Recording models☆12Sep 19, 2023Updated 2 years ago
- ☆16Dec 7, 2023Updated 2 years ago
- Port of Suno AI's Bark in C/C++ for fast inference☆55Apr 15, 2024Updated last year
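- This repository deploys the NanoDet detection algorithm with the OpenVINO inference framework and rewrites the pre-processing and post-processing for very high performance, making detection on Intel CPU platforms take off. It also quantizes the model (PTQ) to int8 precision with the NNCF and PPQ tools for even faster inference.☆15Jun 14, 2023Updated 2 years ago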
- 📈Implementing the ADAM optimizer from the ground up with PyTorch and comparing its performance on six 3-D objective functions (each prog…☆22Jul 2, 2022Updated 3 years ago
- Deployment of BiSeNetV1 and BiSeNetV2 with TensorRT☆20Apr 14, 2022Updated 3 years ago
- monodepth running in Android by ncnn☆23Oct 12, 2021Updated 4 years ago
- HunyuanDiT with TensorRT and libtorch☆17May 22, 2024Updated last year
- minimal C implementation of speculative decoding based on llama2.c☆28Jul 15, 2024Updated last year
- ☆22Apr 10, 2024Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Nov 11, 2024Updated last year
- Testing LLM reasoning abilities with family relationship quizzes.☆63Jan 28, 2025Updated last year
- Examples of AI model running on the board, such as horizon/rockchip and so on.☆21Jul 10, 2023Updated 2 years ago
- Gaussian blur for ImGui in Dx12☆33Mar 6, 2025Updated last year
- A friendly Zig launcher and toolchain manager.☆22Dec 2, 2025Updated 3 months ago
- Generate visual podcasts about novels using open source models☆26Feb 15, 2023Updated 3 years ago
- A live multiplayer trivia game where users can bid for the subject of the next question☆29Jan 9, 2026Updated 2 months ago
- Inference of Mamba and Mamba2 models in pure C☆197Jan 22, 2026Updated last month
- A fast and tiny replay format for Geometry Dash bots.☆10Jan 5, 2026Updated 2 months ago
- A SapientML plugin of SapientMLGenerator☆11Dec 23, 2025Updated 2 months ago
- yolov7-pose end-to-end TensorRT implementation☆27Sep 8, 2022Updated 3 years ago
- ☆28Feb 9, 2024Updated 2 years ago
- ☆16Apr 20, 2025Updated 10 months ago
- segment-anything based mnn☆35Dec 13, 2023Updated 2 years ago
- Spotify clone with webSDK; you can do almost everything with this project that you do in Spotify☆10May 3, 2023Updated 2 years ago
- 4D Miner C++ Modding Headers / 4D-Modding API Headers☆12Dec 31, 2025Updated 2 months ago
- Chaucha functions for usage with Github Actions☆11Sep 18, 2020Updated 5 years ago
- This project is intended to build and deploy an SNPE model on Qualcomm Devices, which are having unsupported layers which are not part of…☆10Oct 4, 2021Updated 4 years ago
- OpenGL Planet Renderer☆35Jul 31, 2021Updated 4 years ago
- ppstructure deploy by ncnn☆36Jul 16, 2024Updated last year
- Port of Microsoft's BioGPT in C/C++ using ggml☆86Feb 21, 2024Updated 2 years ago
- A tiny, didactical implementation of LLAMA 3☆42Dec 2, 2024Updated last year
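- Introduction to learning CUDA programming☆38Jul 22, 2024Updated last year
- Deploys LDC, a lightweight dense convolutional neural network for edge detection, with ONNXRuntime; includes both C++ and Python versions☆11Apr 24, 2023Updated 2 years ago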
- Reinforcement learning with VizDoom platform☆11Apr 18, 2022Updated 3 years ago
- A platform aimed at creating websites that perform self-optimization☆12May 4, 2024Updated last year