☆19 · Nov 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for Quantize-Watermark
Users that are interested in Quantize-Watermark are comparing it to the libraries listed below
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning ☆15 · Apr 29, 2024 · Updated last year
- ☆25 · Oct 31, 2024 · Updated last year
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆26 · Jun 16, 2025 · Updated 8 months ago
- ☆38 · Oct 2, 2024 · Updated last year
- ☆13 · Jun 22, 2025 · Updated 8 months ago
- The official code for "Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation" | [MM2… ☆14 · Dec 7, 2024 · Updated last year
- Information Extraction related tools and models ☆10 · Mar 16, 2023 · Updated 2 years ago
- ☆12 · Jul 30, 2025 · Updated 7 months ago
- Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM ☆14 · Dec 27, 2023 · Updated 2 years ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆69 · Mar 7, 2024 · Updated last year
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆37 · Aug 29, 2023 · Updated 2 years ago
- ☆17 · Apr 7, 2025 · Updated 10 months ago
- [ICLR 2025] BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments ☆39 · Feb 17, 2025 · Updated last year
- SeqXGPT: An advanced method for sentence-level AI-generated text detection. ☆100 · Oct 16, 2023 · Updated 2 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆39 · Mar 11, 2024 · Updated last year
- PyTorch implementation of "Deep Transferring Quantization" (ECCV2020) ☆18 · Jun 22, 2022 · Updated 3 years ago
- Composite Backdoor Attacks Against Large Language Models ☆22 · Apr 12, 2024 · Updated last year
- [COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models" ☆71 · Jul 8, 2025 · Updated 7 months ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration ☆20 · Jun 27, 2025 · Updated 8 months ago
- torch_quantizer is an out-of-box quantization tool for PyTorch models on CUDA backend, specially optimized for Diffusion Models. ☆23 · Mar 29, 2024 · Updated last year
- ☆21 · Feb 11, 2022 · Updated 4 years ago
- Implementation of CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation ☆25 · Feb 18, 2025 · Updated last year
- PB-LLM: Partially Binarized Large Language Models ☆156 · Nov 20, 2023 · Updated 2 years ago
- AFPQ code implementation ☆23 · Nov 6, 2023 · Updated 2 years ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models" ☆79 · Mar 17, 2025 · Updated 11 months ago
- FusedChat is a dialogue dataset. It contains dialogue sessions fusing task-oriented dialogues and open-domain dialogues. ☆29 · Jul 20, 2022 · Updated 3 years ago
- Faster PyTorch bitsandbytes 4bit fp4 nn.Linear ops ☆30 · Mar 16, 2024 · Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆28 · Jul 13, 2023 · Updated 2 years ago
- ☆30 · Jul 22, 2024 · Updated last year
- Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings) ☆28 · Jun 28, 2023 · Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection" ☆37 · Aug 20, 2024 · Updated last year
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆134 · May 16, 2024 · Updated last year
- ☆32 · Mar 31, 2025 · Updated 11 months ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination ☆13 · Apr 29, 2025 · Updated 10 months ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆41 · Sep 9, 2025 · Updated 5 months ago
- FFNet: MetaMixer-based Efficient Convolutional Mixer Design ☆31 · Mar 11, 2025 · Updated 11 months ago
- Implementation of ICLR 2018 paper "Loss-aware Weight Quantization of Deep Networks" ☆27 · Oct 24, 2019 · Updated 6 years ago
- ☆36 · Jul 25, 2022 · Updated 3 years ago