fire / pytorch-nncp
☆10 · Updated 2 years ago
Alternatives and similar repositories for pytorch-nncp:
Users interested in pytorch-nncp are comparing it to the libraries listed below.
- Dzip: improved general-purpose lossless compression based on novel neural network modeling ☆70 · Updated 3 years ago
- This repository contains the source code and dataset link mentioned in WWW 2022 accepted paper "TRACE: A Fast Transformer-based General-Pu… ☆28 · Updated 3 years ago
- An implementation of LLMzip using GPT-2 ☆12 · Updated last year
- ☆50 · Updated 3 months ago
- Hutter Prize Submission ☆26 · Updated 6 months ago
- ☆13 · Updated last year
- Hutter Prize Submission ☆14 · Updated 3 years ago
- QuIP quantization ☆51 · Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆26 · Updated last year
- Fraunhofer Neural Network Encoder/Decoder (NNCodec) ☆77 · Updated last year
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization ☆340 · Updated 8 months ago
- ☆132 · Updated 7 months ago
- [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration ☆208 · Updated 5 months ago
- ☆219 · Updated 10 months ago
- Code for paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees", adapted for Llama models ☆35 · Updated last year
- Explorations into some recent techniques surrounding speculative decoding ☆259 · Updated 4 months ago
- Low-bit optimizers for PyTorch ☆128 · Updated last year
- Here we will test various linear attention designs. ☆60 · Updated last year
- The implementation for MLSys 2023 paper "Cuttlefish: Low-rank Model Training without All The Tuning" ☆44 · Updated last year
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory-efficient Transformers ☆47 · Updated last year
- Code for paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆363 · Updated last year
- Fast and low-memory attention layer written in CUDA ☆17 · Updated last year
- PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind ☆123 · Updated 8 months ago
- ☆126 · Updated last month
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper "Toeplitz Neural Network for Sequence Modeling" ☆79 · Updated last year
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆159 · Updated 9 months ago
- The official code for Dropping Backward Propagation (DropBP) ☆30 · Updated 5 months ago
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆25 · Updated 3 weeks ago
- Fast Hadamard transform in CUDA, with a PyTorch interface ☆174 · Updated 11 months ago
- A block-oriented training approach for inference-time optimization ☆32 · Updated 8 months ago