luchangli03 / onnxsim_large_model
simplify >2GB large onnx model
☆71 · Nov 30, 2024 · Updated last year
Alternatives and similar repositories for onnxsim_large_model
Users that are interested in onnxsim_large_model are comparing it to the libraries listed below
- export llama to onnx ☆136 · Dec 28, 2024 · Updated last year
- run ChatGLM2-6B in BM1684X ☆49 · Mar 1, 2024 · Updated last year
- Sophgo AI chips driver and runtime library. ☆24 · Feb 5, 2026 · Updated last week
- How to export Hugging Face's 🤗 NLP Transformers models to ONNX and use the exported model with the appropriate Transformers pipeline. ☆25 · Apr 19, 2022 · Updated 3 years ago
- A whisper repo for TPU ☆10 · Jun 4, 2024 · Updated last year
- A gesture recognition module trained from scratch using Pytorch, deployed with ncnn and TensorRT. ☆13 · May 1, 2022 · Updated 3 years ago
- An EXPERIMENTAL implementation of Stable Diffusion in .NET, ported from Python libraries by Huggingface ☆15 · Oct 30, 2023 · Updated 2 years ago
- Recording models ☆12 · Sep 19, 2023 · Updated 2 years ago
- This is a simple C# demo for stable-diffusion.cpp with safe code only. ☆16 · Mar 25, 2024 · Updated last year
- ☆17 · Nov 14, 2023 · Updated 2 years ago
- Run Chinese MobileBert model on SNPE. ☆15 · May 19, 2023 · Updated 2 years ago
- LLaMa/RWKV onnx models, quantization and testcase ☆366 · Jul 6, 2023 · Updated 2 years ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference. ☆46 · Jun 11, 2025 · Updated 8 months ago
- Stable Diffusion model v1.5 for TorchSharp ☆19 · Aug 6, 2024 · Updated last year
- 🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022 ☆20 · May 14, 2024 · Updated last year
- ☆44 · Jul 5, 2024 · Updated last year
- ASIC simulation of Multi-ported Memory Module. And it can offer SRAM-based dual-port basic building block to support multiple read/write … ☆22 · May 30, 2016 · Updated 9 years ago
- .NET application for stable diffusion, Leveraging OnnxStack, Amuse seamlessly integrates many StableDiffusion capabilities all within the… ☆22 · Dec 29, 2023 · Updated 2 years ago
- llm deploy project based onnx. ☆49 · Oct 9, 2024 · Updated last year
- Contextual Position Encoding but with some custom CUDA Kernels https://arxiv.org/abs/2405.18719 ☆22 · Jun 5, 2024 · Updated last year
- llm-export can export llm model to onnx. ☆344 · Oct 24, 2025 · Updated 3 months ago
- Low-Rank Llama Custom Training ☆23 · Mar 27, 2024 · Updated last year
- minimal C implementation of speculative decoding based on llama2.c ☆25 · Jul 15, 2024 · Updated last year
- A tutorial for CUDA&PyTorch ☆253 · Feb 3, 2026 · Updated last week
- RISCV C and Triton AI-Benchmark ☆23 · Jan 28, 2026 · Updated 2 weeks ago
- Quantize transformers to any learned arbitrary 4-bit numeric format ☆51 · Jan 25, 2026 · Updated 3 weeks ago
- This repository includes the source-code and dataset used in our CIKM2022 paper titled 'Commonsense Knowledge Base Completion with Relati… ☆28 · Nov 13, 2022 · Updated 3 years ago
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core. ☆72 · Sep 8, 2024 · Updated last year
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily. ☆184 · Apr 2, 2025 · Updated 10 months ago
- A library enabling easy transfer and handling of PyTorch models between .NET and Python environments ☆30 · Dec 2, 2024 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Aug 14, 2024 · Updated last year
- ☆32 · Jul 2, 2025 · Updated 7 months ago
- ☆38 · Updated this week
- ☆31 · Nov 11, 2024 · Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆273 · Aug 6, 2025 · Updated 6 months ago
- Text-to-Speech (TTS) engine for the Armenian language ☆12 · Sep 29, 2024 · Updated last year
- C++ implementations for various tokenizers (sentencepiece, tiktoken etc). ☆48 · Feb 4, 2026 · Updated last week
- Our 2nd-gen LMM ☆34 · May 22, 2024 · Updated last year
- An onnx-based quantitation tool. ☆71 · Jan 8, 2024 · Updated 2 years ago