davidar / eigenGPT
Minimal C++ implementation of GPT2
☆40 Updated last year
Alternatives and similar repositories for eigenGPT
Users who are interested in eigenGPT are comparing it to the repositories listed below.
- Make triton easier ☆46 Updated last year
- A fork of llama3.c used to do some R&D on inferencing ☆22 Updated 6 months ago
- A thin, highly portable toolkit for efficiently compiling dense loop-based computation. ☆148 Updated 2 years ago
- Code for "Meta Learning Backpropagation And Improving It" @ NeurIPS 2021 https://arxiv.org/abs/2012.14905 ☆32 Updated 3 years ago
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 8 months ago
- Exploration into the Firefly algorithm in Pytorch ☆40 Updated 4 months ago
- throwaway GPT inference ☆140 Updated last year
- CUDA and Triton implementations of Flash Attention with SoftmaxN. ☆70 Updated last year
- A really tiny autograd engine ☆94 Updated last month
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆47 Updated 3 months ago
- Standalone commandline CLI tool for compiling Triton kernels ☆17 Updated 9 months ago
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 Updated 2 years ago
- ☆53 Updated last year
- ☆32 Updated last year
- Inference of Mamba models in pure C ☆187 Updated last year
- RWKV model implementation ☆38 Updated last year
- ☆78 Updated 11 months ago
- RWKV in nanoGPT style ☆191 Updated last year
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated last year
- OMNI: Open-endedness via Models of human Notions of Interestingness ☆50 Updated 4 months ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆59 Updated 3 years ago
- Fast modular code to create and train cutting edge LLMs ☆67 Updated last year
- ☆18 Updated 2 years ago
- Multi-agent simulator in Jax for research and teaching in AI & ALife ☆29 Updated 3 weeks ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆37 Updated last year
- Experiments with BitNet inference on CPU ☆54 Updated last year
- Using FlexAttention to compute attention with different masking patterns ☆44 Updated 9 months ago
- ☆39 Updated last month
- Personal solutions to the Triton Puzzles ☆19 Updated 11 months ago
- Inference Llama 2 in one file of pure C++ ☆83 Updated last year