davidar / eigenGPT
Minimal C++ implementation of GPT2
☆40 · Updated last year
Alternatives and similar repositories for eigenGPT
Users who are interested in eigenGPT compare it to the repositories listed below.
- Make triton easier ☆47 · Updated 11 months ago
- Exploration into the Firefly algorithm in Pytorch ☆39 · Updated 3 months ago
- OMNI: Open-endedness via Models of human Notions of Interestingness ☆49 · Updated 4 months ago
- ☆53 · Updated last year
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆46 · Updated 2 months ago
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 · Updated 2 years ago
- FlexAttention w/ FlashAttention3 Support ☆26 · Updated 8 months ago
- Code for "Meta Learning Backpropagation And Improving It" @ NeurIPS 2021 https://arxiv.org/abs/2012.14905 ☆32 · Updated 3 years ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 · Updated last year
- Fast reinforcement learning 💨 ☆24 · Updated 2 months ago
- CUDA and Triton implementations of Flash Attention with SoftmaxN. ☆70 · Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 · Updated last week
- RWKV model implementation ☆38 · Updated last year
- Utilities for Training Very Large Models ☆58 · Updated 8 months ago
- Loop Nest - Linear algebra compiler and code generator. ☆22 · Updated 2 years ago
- throwaway GPT inference ☆139 · Updated last year
- ☆18 · Updated 2 years ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆44 · Updated 10 months ago
- This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration." ☆28 · Updated 7 months ago
- A tiny deep learning library written in Java ☆25 · Updated 2 years ago
- JAX implementations of RWKV ☆19 · Updated last year
- Solve puzzles. Learn CUDA. ☆64 · Updated last year
- Implementation of Hyena Hierarchy in JAX ☆10 · Updated 2 years ago
- Standalone commandline CLI tool for compiling Triton kernels ☆18 · Updated 8 months ago
- A really tiny autograd engine ☆94 · Updated last week
- ☆29 · Updated 6 months ago
- Token Omission Via Attention ☆126 · Updated 7 months ago
- Experiments with BitNet inference on CPU ☆55 · Updated last year
- ☆78 · Updated 11 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 · Updated last year