Algomancer / VCReg

Minimal implementation of VCReg (2024) for collapse prevention.

☆17

Alternatives and similar repositories for VCReg

Users who are interested in VCReg are comparing it to the repositories listed below.

lucidrains / holodeck-pytorch
Implementation of a holodeck, written in Pytorch
☆18 Updated last year
cloneofsimo / zeroshampoo
☆34 Updated 10 months ago
crowsonkb / torch-dist-utils
Utilities for PyTorch distributed
☆24 Updated 4 months ago
lucidrains / quartic-transformer
Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)
☆52 Updated 3 months ago
ClashLuke / SOAP
☆21 Updated 8 months ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 5 months ago
ethansmith2000 / TransformerExperiments
☆19 Updated 2 months ago
smearle / autoverse
Generative cellular automaton-like learning environments for RL.
☆19 Updated 5 months ago
Z-T-WANG / LaProp-Optimizer
Codes accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient"
☆29 Updated 4 years ago
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆36 Updated 2 years ago
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆89 Updated last year
BlinkDL / SmallInitEmb
LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆59 Updated 3 years ago
crowsonkb / LDLM
Latent Diffusion Language Models
☆68 Updated last year
lucaslingle / mu_transformer
Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.
☆31 Updated last month
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆65 Updated 11 months ago
HomebrewML / TrueGrad
PyTorch interface for TrueGrad Optimizers
☆42 Updated last year
layer6ai-labs / calo-forest
A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.
☆18 Updated 8 months ago
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆101 Updated 6 months ago
matthias-wright / jax-fid
FID computation in Jax/Flax.
☆28 Updated last year
codekansas / rwkv
RWKV model implementation
☆38 Updated 2 years ago
main-horse / hnet
H-Net Dynamic Hierarchical Architecture
☆22 Updated this week
okarthikb / state-space-models
☆27 Updated last year
njwfish / DistributionEmbeddings
☆25 Updated last month
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated last year
google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…
☆71 Updated 2 weeks ago
crowsonkb / dice-mc
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
☆31 Updated last year
dvruette / barrel-rec-pytorch
☆53 Updated last year
NousResearch / StripedHyenaTrainer
☆61 Updated last year
borisdayma / clip-jax
Train vision models using JAX and 🤗 transformers
☆98 Updated 3 months ago
yk / llmvm
☆30 Updated last year