loeweX / Custom-ConvLayers-Pytorch
A reimplementation of 2D Convolutional and Transposed Convolutional Layers in PyTorch, designed for easy modifications and analysis. Includes comprehensive explanations and testing.
☆23 Updated 2 years ago
Alternatives and similar repositories for Custom-ConvLayers-Pytorch
Users that are interested in Custom-ConvLayers-Pytorch are comparing it to the libraries listed below
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 3 years ago
- ☆68 Updated 10 months ago
- Functional local implementations of main model parallelism approaches ☆95 Updated 2 years ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆61 Updated 3 years ago
- ☆75 Updated 3 years ago
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆51 Updated 2 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆87 Updated 2 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆70 Updated 2 weeks ago
- Triton Implementation of HyperAttention Algorithm ☆48 Updated 2 years ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆75 Updated 6 months ago
- ☆22 Updated last year
- Supercharge huggingface transformers with model parallelism. ☆77 Updated 6 months ago
- ☆56 Updated last year
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper ☆81 Updated 4 years ago
- Various transformers for FSDP research ☆38 Updated 3 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 Updated 2 years ago
- ☆82 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated 2 years ago
- Experiments for efforts to train a new and improved t5 ☆76 Updated last year
- JAX/Flax implementation of the Hyena Hierarchy ☆34 Updated 2 years ago
- Explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective (https://arxiv.org/abs/1912.02757) by Stanislav Fort, Huiyi … ☆66 Updated 5 years ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated last year
- Fastai community entry to 2020 Reproducibility Challenge ☆17 Updated 3 years ago
- nanoGPT-like codebase for LLM training ☆113 Updated 2 months ago
- ☆68 Updated last year
- An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain ☆34 Updated 5 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆59 Updated 2 years ago
- ☆31 Updated last week