Minimal pretraining script for language modeling in PyTorch, with support for torch compilation and DDP. It includes a model implementation and a data preprocessing script.
☆46Mar 16, 2026Updated 3 weeks ago
Alternatives and similar repositories for plainLM
Users that are interested in plainLM are comparing it to the libraries listed below.
- Linear Attention for Efficient Bidirectional Sequence Modeling☆16May 13, 2025Updated 10 months ago
- Gradient-based Hyperparameter Optimization Over Long Horizons☆14Sep 29, 2021Updated 4 years ago
- This repository is for setting-up cuda-9/8, nvidia-396/387/384 driver, OpenCV-3.3, ROS Kinetic, Tensorflow-1.11/1.7/1.4/1.2.1, Pytorch-0.…☆30Jul 7, 2022Updated 3 years ago
- ☆17Oct 25, 2022Updated 3 years ago
- Artistic style transfer has been part of the quickly growing AI Art community in recent times. Pioneered by Gatys et al this class of met…☆30Mar 14, 2022Updated 4 years ago
- Official implementation for the paper "Controlled Sparsity via Constrained Optimization"☆12Aug 10, 2022Updated 3 years ago
- Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)☆13Jun 11, 2025Updated 9 months ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective☆64Mar 11, 2025Updated last year
- A Signal Propagation Perspective for Pruning Neural Networks at Initialization☆14Jun 23, 2020Updated 5 years ago
- ☆13Mar 10, 2026Updated 3 weeks ago
- Things I care about☆13Jul 10, 2022Updated 3 years ago
- A set of solutions to ETHZ ROS lectures☆13Jul 19, 2017Updated 8 years ago
- PyTorch implementation of the paper The Lottery Ticket Hypothesis for Object Recognition☆23Apr 22, 2021Updated 4 years ago
- Convert CVXPY expressions to PyTorch expressions☆18Jul 8, 2025Updated 9 months ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination☆14Apr 29, 2025Updated 11 months ago
- MoMo: Momentum Models for Adaptive Learning Rates☆19Jun 12, 2024Updated last year
- Free subscriptions for students.☆15Mar 4, 2020Updated 6 years ago
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR)☆23Oct 15, 2024Updated last year
- Pytorch2Jax is a small Python library that provides functions that wraps PyTorch models into Jax functions and Flax modules.☆21Feb 20, 2023Updated 3 years ago
- ☆15Jan 12, 2026Updated 2 months ago
- Notes and code for Programming Massively Parallel Processors☆13Mar 29, 2025Updated last year
- Benchmarking Optimizers for LLM Pretraining☆57Dec 30, 2025Updated 3 months ago
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution"☆12Apr 14, 2023Updated 2 years ago
- ☆15Jan 12, 2025Updated last year
- ☆16Dec 9, 2023Updated 2 years ago
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning"☆14Nov 25, 2025Updated 4 months ago
- Implementation of the Prioritized Option-Critic on the Four-Rooms Environment☆17Dec 24, 2017Updated 8 years ago
- Source code for my PhD thesis: Backpropagation Beyond the Gradient☆21Feb 25, 2023Updated 3 years ago
- Hyperparameter search and metric visualization tool for personal research.☆18Mar 15, 2026Updated 3 weeks ago
- ☆10Apr 24, 2024Updated last year
- Non official implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023)☆62Sep 3, 2025Updated 7 months ago
- ☆57Jun 23, 2025Updated 9 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆14Jun 26, 2025Updated 9 months ago
- Keras dataset reader (Sequence) using buckets for RNNs☆25May 30, 2018Updated 7 years ago
- ☆15Aug 19, 2024Updated last year
- ☆23Dec 16, 2025Updated 3 months ago
- ☆12Jul 30, 2025Updated 8 months ago
- ☆29Nov 29, 2023Updated 2 years ago
- ☆11Sep 20, 2024Updated last year