nikopj / FlashAttention.jl
Julia implementation of the flash-attention operation for neural networks.
☆11 · Updated 2 years ago
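For context, flash attention computes exact softmax attention by streaming over key/value blocks while maintaining a running ("online") softmax, so the full N×N score matrix is never materialized. The sketch below is a minimal plain-Julia illustration of that tiling scheme under those assumptions; it is not the FlashAttention.jl API, and the names (`flash_attention`, `block`) are illustrative only.

```julia
# Minimal sketch of the tiled "online softmax" computation behind flash
# attention, in plain Julia. Illustrative only: not the FlashAttention.jl API.

# Q, K, V are (d, N) matrices: feature dimension d, sequence length N.
function flash_attention(Q::AbstractMatrix, K::AbstractMatrix, V::AbstractMatrix; block::Int=64)
    d, N = size(Q)
    scale = eltype(Q)(1 / sqrt(d))
    O = zeros(eltype(Q), d, N)       # unnormalized output accumulator
    m = fill(typemin(eltype(Q)), N)  # running per-query score maxima
    l = zeros(eltype(Q), N)          # running softmax denominators
    for j in 1:block:N               # stream over key/value blocks
        cols = j:min(j + block - 1, N)
        S = (Q' * K[:, cols]) .* scale           # (N, |cols|) score tile
        m_new = max.(m, vec(maximum(S, dims=2))) # updated maxima
        P = exp.(S .- m_new)                     # tile weights, shifted by new maxima
        c = exp.(m .- m_new)                     # rescales previously accumulated sums
        l = l .* c .+ vec(sum(P, dims=2))
        O = O .* c' .+ V[:, cols] * P'
        m = m_new
    end
    return O ./ l'                   # final softmax normalization
end

# Usage: exact attention output without forming the full N×N score matrix.
Q, K, V = randn(Float32, 64, 256), randn(Float32, 64, 256), randn(Float32, 64, 256)
O = flash_attention(Q, K, V; block=32)   # (64, 256)
```

Real implementations fuse these steps into a single GPU kernel and tile over queries as well; the point here is only the running-maximum and running-denominator bookkeeping that keeps memory use linear in sequence length.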
Alternatives and similar repositories for FlashAttention.jl
Users interested in FlashAttention.jl are comparing it to the libraries listed below.
- Julia implementation of the Flash Attention algorithm ☆19 · Updated 2 years ago
- Simple, blazing-fast transformer components. ☆23 · Updated 2 years ago
- Distributed Data Parallel Training of Deep Neural Networks ☆57 · Updated last year
- Differentiable matrix factorizations using ImplicitDifferentiation.jl. ☆30 · Updated last year
- ☆19 · Updated last year
- Integrating Neural Ordinary Differential Equations, the Method of Lines, and Graph Neural Networks ☆18 · Updated last year
- Implicit Layer Machine Learning via Deep Equilibrium Networks, O(1) backpropagation with accelerated convergence. ☆57 · Updated 3 weeks ago
- "Maybe we have our own magic." ☆47 · Updated 5 years ago
- Differentiate Python calls from Julia ☆55 · Updated 3 years ago
- Data structures for graph neural networks ☆18 · Updated last year
- GPU integrations for Dagger.jl ☆54 · Updated 2 months ago
- Programming GEMM kernels on NVIDIA GPUs with Tensor Cores in Julia ☆42 · Updated 3 weeks ago
- ☆22 · Updated 2 years ago
- Machine learning from scratch in Julia ☆32 · Updated 6 months ago
- A Julia wrapper for the NVIDIA Collective Communications Library. ☆28 · Updated 2 weeks ago
- Code for the paper https://arxiv.org/abs/2306.07961 ☆53 · Updated last year
- Structure Preserving Machine Learning Models in Julia ☆50 · Updated 2 weeks ago
- Accelerate your ML research using pre-built Deep Learning Models with Lux ☆42 · Updated this week
- Optimisers.jl defines many standard optimisers and utilities for learning loops. ☆89 · Updated 5 months ago
- Julia implementation of stochastic optimization algorithms for large-scale optimal transport. ☆18 · Updated 4 years ago
- Physics-Enhanced Regression for Initial Value Problems ☆20 · Updated last year
- ☆28 · Updated 3 years ago
- Curated list of high-quality operators for deep learning in Julia ☆40 · Updated 3 years ago
- Reusable functionality for defining custom attention/transformer layers. ☆53 · Updated 3 weeks ago
- Implementations of Infinitesimal Continuous Normalizing Flows Algorithms in Julia ☆27 · Updated this week
- Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl) ☆57 · Updated 2 years ago
- Immutables as mutables, mutables as immutables. ☆22 · Updated 6 months ago
- Cellular automata creation and analysis tools