loeweX / Custom-ConvLayers-Pytorch
A reimplementation of 2D Convolutional and Transposed Convolutional Layers in PyTorch, designed for easy modifications and analysis. Includes comprehensive explanations and testing.
☆19 Updated last year
Alternatives and similar repositories for Custom-ConvLayers-Pytorch:
Users who are interested in Custom-ConvLayers-Pytorch are comparing it to the repositories listed below:
- ☆30 Updated 5 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆18 Updated last month
- A case study of efficient training of large language models using commodity hardware. ☆69 Updated 2 years ago
- Embedding Recycling for Language models ☆38 Updated last year
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- reproduces experiments from "Grounding inductive biases in natural images: invariance stems from variations in data" ☆17 Updated 7 months ago
- Repository for the PopulAtion Parameter Averaging (PAPA) paper ☆26 Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 Updated this week
- Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch ☆24 Updated 4 years ago
- Triton Implementation of HyperAttention Algorithm ☆47 Updated last year
- ☆31 Updated last week
- QLoRA for Masked Language Modeling ☆22 Updated last year
- Cyclemoid implementation for PyTorch ☆89 Updated 3 years ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated last year
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated last year
- ☆52 Updated 6 months ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆48 Updated 3 years ago
- ☆51 Updated 10 months ago
- Official code for the paper: "Metadata Archaeology" ☆19 Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆49 Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated 11 months ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 Updated 2 years ago
- PyTorch implementation of GLOM ☆22 Updated 3 years ago
- ☆73 Updated 2 years ago
- PyTorch implementation for MRL ☆18 Updated last year
- ☆20 Updated last year
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 Updated 3 years ago
- Automatically take good care of your preemptible TPUs ☆36 Updated last year
- AdaCat ☆49 Updated 2 years ago