vincent-163 / transformer-arithmetic
☆13 Updated last year
Related projects
Alternatives and complementary repositories for transformer-arithmetic
- ☆26 Updated 2 years ago
- ☆8 Updated 3 months ago
- One stop shop for all things carp ☆59 Updated 2 years ago
- Latent Diffusion Language Models ☆67 Updated last year
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 Updated 2 years ago
- minGPT in JAX ☆46 Updated 2 years ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 Updated last year
- ☆54 Updated 2 years ago
- Mechanistic Interpretability for Transformer Models ☆49 Updated 2 years ago
- A dataset of alignment research and code to reproduce it ☆69 Updated last year
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 2 years ago
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 2 years ago
- RWKV model implementation ☆38 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆72 Updated 3 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆37 Updated last year
- Hidden Engrams: Long Term Memory for Transformer Model Inference ☆34 Updated 3 years ago
- Scratchpad/Chain-of-Thought Prompts ☆12 Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆79 Updated 9 months ago
- RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details. ☆66 Updated 2 years ago
- Documentation for dynamic machine learning systems. ☆27 Updated 2 months ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated 2 years ago
- ☆12 Updated this week
- ☆18 Updated last year
- Implementation of deep implicit attention in PyTorch ☆63 Updated 3 years ago
- ☆43 Updated 2 months ago
- Diffusion-based markup-to-image generation ☆78 Updated last year
- Language-annotated Abstraction and Reasoning Corpus ☆78 Updated last year
- BigKnow2022: Bringing Language Models Up to Speed ☆14 Updated last year