catid / cuda_float_compressLinks

Python package for compressing floating-point PyTorch tensors

☆12

Alternatives and similar repositories for cuda_float_compress

Users that are interested in cuda_float_compress are comparing it to the libraries listed below

Sorting:

NolanoOrg / SpectraSuite
☆52Updated last year
catid / lllm
Latent Large Language Models
☆19Updated last year
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆130Updated 11 months ago
tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆67Updated last week
RiddleHe / llm-interp
A collection of lightweight interpretability scripts to understand how LLMs think
☆66Updated last week
yandex-research / swarm
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
☆145Updated last year
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆98Updated 6 months ago
google-deepmind / asyncdiloco
☆47Updated last year
RAIVNLab / AdANNS
Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"
☆65Updated 2 years ago
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆136Updated 5 months ago
Aleph-Alpha-Research / trigrams
☆58Updated this week
xaedes / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆21Updated 2 years ago
N8python / n8loom
A tree-based prefix cache library that allows rapid creation of looms: hierarchal branching pathways of LLM generations.
☆72Updated 9 months ago
PrimeIntellect-ai / pccl
PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP
☆138Updated 2 months ago
eth-easl / deltazip
Compression for Foundation Models
☆35Updated 3 months ago
cchan / nanoGPT-fp8
☆13Updated 2 years ago
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆38Updated 5 months ago
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆127Updated last year
NousResearch / StripedHyenaTrainer
☆62Updated last year
okarthikb / state-space-models
☆28Updated last year
nisten / grokadamw
new optimizer
☆20Updated last year
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62Updated 2 years ago
sekstini / basedxl
☆18Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆19Updated this week
eth-easl / fmengine
Utilities for Training Very Large Models
☆58Updated last year
Leeroo-AI / leeroo_orchestrator
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
☆55Updated last year
Zyphra / zcookbook
Training hybrid models for dummies.
☆29Updated 2 weeks ago
euclaise / supertrainer2000
☆50Updated last year
lucidrains / mind-evolution
Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind
☆57Updated 5 months ago
softmax1 / Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
☆73Updated last year