fire / pytorch-nncp
☆10 · Updated 2 years ago
Alternatives and similar repositories for pytorch-nncp
Users interested in pytorch-nncp are comparing it to the libraries listed below.
- This repository contains the source code and dataset link mentioned in the WWW 2022 accepted paper "TRACE: A Fast Transformer-based General-Pu…" ☆30 · Updated 3 years ago
- An implementation of LLMzip using GPT-2 ☆12 · Updated last year
- Dzip: improved general-purpose lossless compression based on novel neural network modeling ☆71 · Updated 3 years ago
- ☆50 · Updated 4 months ago
- QuIP quantization ☆52 · Updated last year
- High-speed, easy-to-use LLM serving framework for local deployment ☆108 · Updated 2 months ago
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆365 · Updated last year
- A collection of tools for neural compression enthusiasts ☆551 · Updated 8 months ago
- [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration ☆211 · Updated 6 months ago
- ☆69 · Updated last year
- ☆130 · Updated 2 months ago
- A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton ☆67 · Updated 10 months ago
- The implementation for the MLSys 2023 paper "Cuttlefish: Low-rank Model Training without All The Tuning" ☆44 · Updated 2 years ago
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs) ☆63 · Updated 11 months ago
- ☆144 · Updated 2 years ago
- ☆68 · Updated 10 months ago
- Advanced ultra-low-bitrate compression techniques for the LLaMA family of LLMs ☆110 · Updated last year
- Generating Captions via Perceiver-Resampler Cross-Attention Networks ☆16 · Updated 2 years ago
- Generic image compressor for machine learning; PyTorch code for the paper "Lossy Compression for Lossless Prediction" ☆117 · Updated 2 years ago
- Fork of llama.cpp, extended for GPT-NeoX, RWKV-v4, and Falcon models ☆29 · Updated last year
- ☆26 · Updated last year
- Bamboo-7B Large Language Model ☆93 · Updated last year
- The official code for Dropping Backward Propagation (DropBP) ☆30 · Updated 7 months ago
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆163 · Updated 10 months ago
- ☆137 · Updated 9 months ago
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large-scale image-text dataset ☆30 · Updated 2 years ago
- Code repo for the paper "BiT: Robustly Binarized Multi-distilled Transformer" ☆108 · Updated last year
- Structural Pruning for LLaMA ☆54 · Updated 2 years ago
- State-of-the-art lossless audio compression ☆56 · Updated last week
- Work in progress ☆67 · Updated last week