Minimal pretraining script for language modeling in PyTorch, with support for torch compilation and DDP. It includes a model implementation and a data preprocessing script.
☆47 · Mar 16, 2026 · Updated last month
Alternatives and similar repositories for plainLM
Users that are interested in plainLM are comparing it to the libraries listed below.
- Linear Attention for Efficient Bidirectional Sequence Modeling ☆16 · May 13, 2025 · Updated 11 months ago
- Code for implementing central flows ☆44 · Sep 5, 2025 · Updated 7 months ago
- Official implementation for the paper "Controlled Sparsity via Constrained Optimization" ☆12 · Aug 10, 2022 · Updated 3 years ago
- Hessian trace estimation using PyTorch and Hutch++ ☆20 · Oct 29, 2020 · Updated 5 years ago
- Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024) ☆13 · Jun 11, 2025 · Updated 10 months ago
- ☆180 · Updated this week
- ☆13 · Jun 26, 2025 · Updated 10 months ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆65 · Mar 11, 2025 · Updated last year
- Implementation of the psquare algorithm for quantile value estimation ☆10 · Apr 21, 2024 · Updated 2 years ago
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆48 · Jul 24, 2025 · Updated 9 months ago
- A Signal Propagation Perspective for Pruning Neural Networks at Initialization ☆14 · Jun 23, 2020 · Updated 5 years ago
- Optimize nonsmooth functions with gradient sampling, (ns) BFGS... ☆10 · Feb 11, 2025 · Updated last year
- ☆18 · Jun 23, 2023 · Updated 2 years ago
- Things I care about ☆13 · Jul 10, 2022 · Updated 3 years ago
- A set of solutions to ETHZ ROS lectures ☆13 · Jul 19, 2017 · Updated 8 years ago
- PyTorch implementation of the paper The Lottery Ticket Hypothesis for Object Recognition ☆23 · Apr 22, 2021 · Updated 5 years ago
- A Python Package for Portfolio Optimization using the Critical Line Algorithm ☆27 · Aug 1, 2023 · Updated 2 years ago
- This repository demonstrates the application of our proposed task-free continual learning method on a synthetic experiment. ☆13 · Jun 24, 2019 · Updated 6 years ago
- Free subscriptions for students. ☆15 · Mar 4, 2020 · Updated 6 years ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆28 · Jun 16, 2025 · Updated 10 months ago
- SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro… ☆11 · Jul 9, 2025 · Updated 9 months ago
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR) ☆23 · Oct 15, 2024 · Updated last year
- [CVPR 2025] QuartDepth ☆17 · Mar 24, 2025 · Updated last year
- Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox. ☆25 · Sep 29, 2024 · Updated last year
- ☆70 · Mar 2, 2026 · Updated last month
- Finetune Google's pre-trained ViT models from HuggingFace's model hub. ☆19 · Apr 4, 2021 · Updated 5 years ago
- Benchmarking Optimizers for LLM Pretraining ☆57 · Dec 30, 2025 · Updated 3 months ago
- ☆15 · Jan 12, 2025 · Updated last year
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution" ☆12 · Apr 14, 2023 · Updated 3 years ago
- ☆16 · Dec 9, 2023 · Updated 2 years ago
- ☆14 · Jul 14, 2025 · Updated 9 months ago
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning" ☆14 · Nov 25, 2025 · Updated 5 months ago
- ☆15 · Apr 11, 2024 · Updated 2 years ago
- Implementation of the Prioritized Option-Critic on the Four-Rooms Environment ☆17 · Dec 24, 2017 · Updated 8 years ago
- Source code for my PhD thesis: Backpropagation Beyond the Gradient ☆21 · Feb 25, 2023 · Updated 3 years ago
- Mastodoner is a command line tool (and Python library) for archiving Mastodon, a decentralized micro-blogging social network. ☆13 · Oct 21, 2024 · Updated last year
- ☆29 · Mar 10, 2026 · Updated last month
- Non official implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023) ☆62 · Sep 3, 2025 · Updated 7 months ago
- ☆17 · Mar 10, 2025 · Updated last year