COLVERTYETY / GraphWriter
A wrapper for TensorBoard SummaryWriter with real-time terminal visualization using the Rich library.
☆18 · Updated last year
Related projects
Alternatives and complementary repositories for GraphWriter
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 · Updated 2 years ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆53 · Updated 6 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 · Updated last year
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆52 · Updated last year
- A convolution-free, transformer-only version of the CycleGAN framework ☆32 · Updated 2 years ago
- ☆29 · Updated 2 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆49 · Updated 2 years ago
- AdaCat ☆49 · Updated 2 years ago
- ☆31 · Updated 2 months ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆17 · Updated 2 weeks ago
- Another attempt at a long-context / efficient transformer by me ☆37 · Updated 2 years ago
- Code for the paper PermuteFormer ☆42 · Updated 3 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆23 · Updated 7 months ago
- Local Attention - Flax module for Jax ☆20 · Updated 3 years ago
- Describe the format of image/text datasets ☆11 · Updated 2 years ago
- High performance pytorch modules ☆18 · Updated last year
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 2 years ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Updated 5 months ago
- Scaling Sparse Fine-Tuning to Large Language Models ☆17 · Updated 9 months ago
- Hacks for PyTorch ☆17 · Updated last year
- Official Repository for Efficient Linear-Time Attention Transformers. ☆18 · Updated 5 months ago
- A simple Transformer where the softmax has been replaced with normalization ☆18 · Updated 4 years ago
- Blog post ☆16 · Updated 9 months ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 · Updated 2 years ago