andrewargatkiny / dense-attentionLinks

This is the repo for DenseAttention and DANet - fast and conceptually simple modification of standard attention and Transformer

☆19

Alternatives and similar repositories for dense-attention

Users that are interested in dense-attention are comparing it to the libraries listed below

Sorting:

skolai / fewbit
Compression schema for gradients of activations in backward pass
☆44Updated 2 years ago
ai-forever / LIBRA
☆20Updated 8 months ago
FusionBrainLab / LLM-Microscope
☆70Updated last year
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆62Updated 2 weeks ago
google-research / jestimator
Amos optimizer with JEstimator lib.
☆82Updated last year
alvarobartt / safejax
Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors`
☆47Updated last year
mgmalek / efficient_cross_entropy
☆121Updated last year
warner-benjamin / optimi
Fast, Modern, and Low Precision PyTorch Optimizers
☆116Updated 3 months ago
glassroom / heinsen_sequence
Code implementing "Efficient Parallelization of a Ubiquitious Sequential Computation" (Heinsen, 2023)
☆97Updated last year
stat-ml / ncvis
Noise-Contrastive Visualization
☆55Updated 2 years ago
graphcore-research / out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆46Updated last year
HomebrewML / HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
☆68Updated 3 years ago
LIONS-EPFL / scion
☆49Updated last month
HomebrewML / Olmax
HomebrewNLP in JAX flavour for maintable TPU-Training
☆51Updated last year
davisyoshida / abnormal-floats
Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)
☆21Updated 2 years ago
AnswerDotAI / fastkmeans
☆86Updated 5 months ago
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆188Updated this week
hal-314 / fastai-batch-size-finder
Implementation of OpenAI paper with Simple Noise Scale on Fastai V2
☆20Updated 4 years ago
goodevening13 / aquakv
☆18Updated last month
meta-pytorch / torchfix
TorchFix - a linter for PyTorch-using code with autofix support
☆151Updated 3 months ago
srush / triton-autodiff
Experiment of using Tangent to autodiff triton
☆80Updated last year
MadryLab / platinum-benchmarks
☆38Updated 7 months ago
PgLoLo / optiacts
☆20Updated last year
thecharlieblake / lovely-llama
An implementation of the Llama architecture, to instruct and delight
☆21Updated 6 months ago
lianakoleva / no-libtorch-compile
☆21Updated 9 months ago
ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆84Updated last year
cloneofsimo / min-fsdp
☆91Updated last year
arogozhnikov / adamw_bfloat16
AdamW optimizer for bfloat16 models in pytorch 🔥.
☆38Updated last year
cene555 / ru-clip-tiny
RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).
☆35Updated 3 years ago
NX-AI / mlstm_kernels
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
☆77Updated 2 weeks ago