davidar / eigenGPT
Minimal C++ implementation of GPT2
☆40 Updated last year
Alternatives and similar repositories for eigenGPT:
Users who are interested in eigenGPT are comparing it to the libraries listed below
- ☆32 Updated 11 months ago
- throwaway GPT inference ☆139 Updated 11 months ago
- Make triton easier ☆47 Updated 10 months ago
- A JAX-like function transformation engine, but micro (microjax) ☆31 Updated 6 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆47 Updated last month
- Standalone command-line tool for compiling Triton kernels ☆17 Updated 7 months ago
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 Updated 2 years ago
- Utilities for Training Very Large Models ☆58 Updated 7 months ago
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 7 months ago
- LLM training in simple, raw C/CUDA ☆18 Updated last year
- Code for "Meta Learning Backpropagation And Improving It" @ NeurIPS 2021 https://arxiv.org/abs/2012.14905 ☆31 Updated 3 years ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated last year
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents" ☆31 Updated last year
- OMNI: Open-endedness via Models of human Notions of Interestingness ☆45 Updated 3 months ago
- High quality implementations of imitation and inverse reinforcement learning algorithms ☆14 Updated last month
- Exploration into the Firefly algorithm in PyTorch ☆38 Updated 2 months ago
- ☆17 Updated last week
- Learn online intrinsic rewards from LLM feedback ☆37 Updated 4 months ago
- Latent Large Language Models ☆18 Updated 8 months ago
- ☆64 Updated 10 months ago
- Fast reinforcement learning 💨 ☆24 Updated last month
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 Updated this week
- Fast and memory efficient PyTorch implementation of the Perceiver with FlashAttention. ☆26 Updated 6 months ago
- Clover: Quantized 4-bit Linear Algebra Library ☆112 Updated 6 years ago
- ☆37 Updated last month
- GPU benchmark ☆60 Updated 3 months ago
- ☆18 Updated 2 years ago
- asynchronous/distributed speculative evaluation for llama3 ☆39 Updated 8 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆56 Updated last year
- Efficiently send large arrays across machines ☆16 Updated 9 months ago