nate-gillman / fourier-head
Official implementation of "Fourier Head: Helping Large Language Models Learn Complex Probability Distributions"
☆47 Updated this week
Related projects
Alternatives and complementary repositories for fourier-head
- Flow-matching algorithms in JAX ☆74 Updated 3 months ago
- ☆89 Updated this week
- Visualizations of the theory behind diffusion models. ☆74 Updated 6 months ago
- Cellular Automata Accelerated in JAX ☆68 Updated last week
- Graph neural networks in JAX. ☆67 Updated 4 months ago
- Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆227 Updated this week
- Repository for code used in the xVal paper ☆121 Updated 7 months ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆75 Updated this week
- ☆122 Updated this week
- Implementation of a framework for GameNGen in PyTorch ☆90 Updated last month
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆84 Updated 2 months ago
- ☆53 Updated 9 months ago
- Neural Optimal Transport with Lagrangian Costs ☆47 Updated 3 months ago
- Code repository for Trajectory Flow Matching ☆23 Updated last week
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- 3D Gaussian Splatting in JAX ☆54 Updated 5 months ago
- ☆46 Updated 4 months ago
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊 ☆110 Updated this week
- ☆32 Updated this week
- Diffusion models in PyTorch ☆86 Updated 3 weeks ago
- PyTorch-like dataloaders in JAX. ☆59 Updated 3 weeks ago
- σ-GPT: A New Approach to Autoregressive Models ☆59 Updated 2 months ago
- Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation) ☆104 Updated 2 weeks ago
- A State-Space Model with Rational Transfer Function Representation. ☆70 Updated 5 months ago
- ☆76 Updated 6 months ago
- Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold ☆39 Updated 2 months ago
- ☆40 Updated 4 months ago
- RS-IMLE ☆35 Updated last month
- Scalable neural net training via automatic normalization in the modular norm. ☆119 Updated 2 months ago
- Simplified Masked Diffusion Language Model ☆202 Updated this week