gumran / language-diffusion
A quick implementation of diffusion language models.
☆47Updated 3 months ago
Alternatives and similar repositories for language-diffusion
Users that are interested in language-diffusion are comparing it to the libraries listed below
- Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).☆98Updated 6 months ago
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule☆64Updated 2 years ago
- Code for minimum-entropy coupling.☆32Updated last month
- ☆62Updated last year
- Simplified implementation of a UMAP-like dimensionality reduction algorithm☆53Updated last year
- ☆35Updated last year
- Implementations of growing and pruning in neural networks☆22Updated 2 years ago
- A Python package for generating concise, high-quality summaries of a probability distribution☆57Updated 2 weeks ago
- Understanding how features learned by neural networks evolve throughout training☆41Updated last year
- Automatically take good care of your preemptible TPUs☆37Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks☆64Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…☆40Updated 2 years ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023☆20Updated 2 years ago
- ☆33Updated last year
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆92Updated 2 years ago
- Deep Networks Grok All the Time and Here is Why☆38Updated last year
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single-machine microbatches, in PyTorch☆25Updated last year
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)☆127Updated 2 years ago
- JAX-like function transformation engine, but micro: microjax☆34Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆61Updated 3 years ago
- ☆60Updated 3 years ago
- Latent Diffusion Language Models☆70Updated 2 years ago
- ☆18Updated last year
- ☆39Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors`☆47Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆53Updated 2 years ago
- ☆111Updated 6 months ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"☆83Updated 3 years ago
- ☆13Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training☆51Updated 2 years ago