mi-optimize is a tool for quantizing and evaluating large language models (LLMs). The library integrates several quantization methods and evaluation techniques, so users can tailor their approach to specific requirements and constraints.
☆25 · Nov 28, 2024 · Updated last year
Alternatives and similar repositories for MI-optimize
Users that are interested in MI-optimize are comparing it to the libraries listed below.
- ☆45 · Mar 4, 2026 · Updated last month
- Pulp virtual platform ☆24 · Jul 16, 2025 · Updated 9 months ago
- Any-Precision Deep Neural Networks (AAAI 2021) ☆62 · May 2, 2020 · Updated 5 years ago
- Optimizing the Deployment of Tiny Transformers on Low-Power MCUs ☆35 · Sep 2, 2024 · Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron` ☆12 · Jun 22, 2022 · Updated 3 years ago
- Improved the performance of 8-bit PTQ4DM, especially on FID. ☆11 · Aug 30, 2023 · Updated 2 years ago
- [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization ☆14 · Nov 27, 2024 · Updated last year
- CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices ☆49 · Mar 19, 2020 · Updated 6 years ago
- ☆42 · Dec 15, 2022 · Updated 3 years ago
- C rewrite of a minimal Python JPEG decoder ☆12 · Jan 2, 2019 · Updated 7 years ago
- BNG Image Format Implementation ☆12 · Sep 19, 2020 · Updated 5 years ago
- Binary translation in Rust ☆12 · Jun 22, 2020 · Updated 5 years ago
- This is an implementation of YOLO using LSQ network quantization method. ☆22 · Apr 13, 2022 · Updated 4 years ago
- ☆12 · Nov 17, 2023 · Updated 2 years ago
- ☆53 · Jul 18, 2024 · Updated last year
- ☆16 · Nov 25, 2022 · Updated 3 years ago
- 3D reconstruction and Plane detection using plane-to-plane homography constraints for uncalibrated image pair under Manhattan World Assum… ☆16 · Dec 2, 2019 · Updated 6 years ago
- INT-Q Extension of the CMSIS-NN library for ARM Cortex-M target ☆18 · Jan 10, 2020 · Updated 6 years ago
- 🖥️ a toy riscv emulator ☆14 · Oct 20, 2021 · Updated 4 years ago
- A simple C++17 header-only library for generating SVG plots ☆10 · Mar 17, 2024 · Updated 2 years ago
- Machine Learning Function Approximation: This code implements the fully-connected Deep Neural Network (DNN) architectures considered in t… ☆20 · Oct 27, 2020 · Updated 5 years ago
- ☆20 · Mar 6, 2022 · Updated 4 years ago
- Codebase for the Progressive Mixed-Precision Decoding paper. ☆19 · Jul 15, 2025 · Updated 9 months ago
- RISC-V instruction encoding/decoding ☆13 · Mar 22, 2023 · Updated 3 years ago
- RISC-V Static Binary Translator ☆18 · Mar 6, 2019 · Updated 7 years ago
- Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023… ☆13 · Dec 5, 2023 · Updated 2 years ago
- (ICLR 2025) BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models ☆25 · Oct 4, 2024 · Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆68 · Mar 27, 2025 · Updated last year
- Parallel Spectral Clustering ☆17 · Aug 14, 2017 · Updated 8 years ago
- Hinton's Forward-Forward Algorithm for Deep Learning ☆10 · Feb 6, 2023 · Updated 3 years ago
- Asynchronous I/O framework for C with coroutine scheduling ☆16 · Jul 6, 2025 · Updated 9 months ago
- Simple C library for safely handling utf8 strings ☆16 · Nov 30, 2014 · Updated 11 years ago
- Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models. ☆501 · Nov 26, 2024 · Updated last year
- ☆10 · Jul 16, 2016 · Updated 9 years ago
- ☆29 · Sep 3, 2025 · Updated 7 months ago
- Lightweight C plotting library without special dependencies for Linux and Win ☆12 · Feb 19, 2021 · Updated 5 years ago
- ☆25 · Mar 20, 2021 · Updated 5 years ago
- ☆12 · Aug 26, 2022 · Updated 3 years ago
- Methods of Self Calibration ☆20 · Jun 10, 2019 · Updated 6 years ago