loeweX / Custom-ConvLayers-Pytorch
A reimplementation of 2D Convolutional and Transposed Convolutional Layers in PyTorch, designed for easy modifications and analysis. Includes comprehensive explanations and testing.
☆23 · Updated 2 years ago
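For context, reimplementations like this typically express the convolution as an unfold (im2col) followed by a matrix multiply, so every intermediate tensor can be inspected or modified. Below is a minimal sketch of that general technique; it is not code from the repository, and `CustomConv2d` and its parameters are illustrative.

```python
# Hypothetical sketch: 2D convolution as unfold (im2col) + matmul,
# verified against PyTorch's built-in nn.Conv2d.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, kernel_size, kernel_size) * 0.01
        )
        self.bias = nn.Parameter(torch.zeros(out_channels))

    def forward(self, x):
        n, c, h, w = x.shape
        # Extract sliding local blocks: (N, C*K*K, L), L = number of patches.
        patches = F.unfold(x, self.kernel_size, stride=self.stride, padding=self.padding)
        # Flatten the kernel and multiply: (O, C*K*K) @ (N, C*K*K, L) -> (N, O, L).
        out = self.weight.view(self.weight.size(0), -1) @ patches
        out = out + self.bias.view(1, -1, 1)
        h_out = (h + 2 * self.padding - self.kernel_size) // self.stride + 1
        w_out = (w + 2 * self.padding - self.kernel_size) // self.stride + 1
        return out.view(n, -1, h_out, w_out)

# Sanity check against the built-in layer.
x = torch.randn(2, 3, 8, 8)
conv = CustomConv2d(3, 4, 3, padding=1)
ref = nn.Conv2d(3, 4, 3, padding=1)
ref.weight.data.copy_(conv.weight)
ref.bias.data.copy_(conv.bias)
assert torch.allclose(conv(x), ref(x), atol=1e-5)
```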
Alternatives and similar repositories for Custom-ConvLayers-Pytorch
Users interested in Custom-ConvLayers-Pytorch are comparing it to the libraries listed below.
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 3 years ago
- ☆56 · Updated last year
- ☆68 · Updated 10 months ago
- Triton implementation of the HyperAttention algorithm ☆48 · Updated 2 years ago
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆87 · Updated 3 years ago
- Transformer with Mu-Parameterization, implemented in JAX/Flax. Supports FSDP on TPU pods. ☆32 · Updated 8 months ago
- Neural Turing Machines in PyTorch ☆49 · Updated 4 years ago
- ☆167 · Updated 2 years ago
- ☆22 · Updated last year
- Various transformers for FSDP research ☆38 · Updated 3 years ago
- JAX/Flax implementation of the Hyena Hierarchy ☆34 · Updated 2 years ago
- nanoGPT-like codebase for LLM training ☆113 · Updated 3 months ago
- Explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective (https://arxiv.org/abs/1912.02757) by Stanislav Fort, Huiyi … ☆66 · Updated 5 years ago
- A lightweight PyTorch implementation of the Transformer-XL architecture proposed by Dai et al. (2019) ☆37 · Updated 3 years ago
- ☆39 · Updated last year
- ☆62 · Updated last year
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 · Updated 3 years ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆72 · Updated 3 weeks ago
- Machine Learning eXperiment Utilities ☆48 · Updated 6 months ago
- Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023) ☆57 · Updated last year
- ☆68 · Updated last year
- ☆82 · Updated last year
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆88 · Updated last year
- Sparse and discrete interpretability tool for neural networks ☆64 · Updated last year
- Functional local implementations of main model parallelism approaches ☆95 · Updated 2 years ago
- HomebrewNLP in JAX flavour for maintainable TPU training ☆51 · Updated 2 years ago
- Code release for the "Broken Neural Scaling Laws" (BNSL) paper ☆59 · Updated 2 years ago
- ☆35 · Updated last year
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 · Updated last year
- ☆75 · Updated 3 years ago