☆70 · Jun 9, 2026 · Updated last week
Alternatives and similar repositories for assignment3-scaling
Users that are interested in assignment3-scaling are comparing it to the libraries listed below.
- ☆55 · May 7, 2026 · Updated last month
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes. ☆22 · Feb 19, 2026 · Updated 3 months ago
- ☆34 · Nov 30, 2025 · Updated 6 months ago
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size ☆19 · May 19, 2019 · Updated 7 years ago
- Benchmarking Optimizers for LLM Pretraining ☆60 · May 3, 2026 · Updated last month
- ☆35 · Jul 5, 2023 · Updated 2 years ago
- This nanoGPT-lecture code git, including Andrej Karpathy's nanoGPT, ng-vedio-lecture, gpt_dev.ipynb and my learning notes. Welcome to lik… ☆19 · May 21, 2024 · Updated 2 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation ☆12 · May 24, 2021 · Updated 5 years ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ☆74 · Apr 15, 2026 · Updated 2 months ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Oct 11, 2024 · Updated last year
- ☆61 · Sep 17, 2025 · Updated 9 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆13 · Nov 27, 2023 · Updated 2 years ago
- The "CoT-ICL Lab" framework for meta-training transformers ☆11 · Jun 3, 2026 · Updated 2 weeks ago
- OpenAI 2025 ICPC Submissions ☆61 · Sep 17, 2025 · Updated 9 months ago
- ☆14 · Dec 20, 2021 · Updated 4 years ago
- ☆33 · Mar 17, 2026 · Updated 3 months ago
- Official Inspect Implementation for "ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases" ☆40 · Dec 1, 2025 · Updated 6 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated 2 years ago
- Code for the paper: Proving Theorems Recursively ☆12 · May 23, 2024 · Updated 2 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 · Dec 30, 2024 · Updated last year
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆28 · Oct 14, 2025 · Updated 8 months ago
- Pytorch routines for (Ker)nel (Mac)hines ☆12 · Oct 10, 2025 · Updated 8 months ago
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones ☆68 · Jan 26, 2026 · Updated 4 months ago
- ☆10 · Mar 8, 2025 · Updated last year
- exercise for transformers-benchmarks, add 3090 benchmark ☆13 · Feb 3, 2023 · Updated 3 years ago
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network" ☆29 · Jun 4, 2024 · Updated 2 years ago
- Exploring the minimal architecture required for coherent English language generation. ☆13 · Jun 11, 2026 · Updated last week
- Library that provides metrics to assess representation quality ☆28 · Feb 5, 2025 · Updated last year
- A Jupyter-style custom node for executing Python code and plotting within ComfyUI workflows. ☆40 · Mar 18, 2026 · Updated 3 months ago
- A Template Repository for a Swift Package-based Stanford Byers Center for Biodesign Digital Health Project ☆18 · Apr 1, 2026 · Updated 2 months ago
- ☆84 · Aug 31, 2023 · Updated 2 years ago
- Reproducing GPT on the TinyStories dataset ☆19 · Jan 18, 2024 · Updated 2 years ago
- ☆17 · Feb 4, 2025 · Updated last year
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022 ☆11 · Sep 17, 2024 · Updated last year
- Modern utility library and typescript typings for building JSON Schema documents ☆14 · Nov 28, 2025 · Updated 6 months ago
- This is a repository for RM2021 Software tutorial ☆11 · Nov 4, 2020 · Updated 5 years ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 · Mar 7, 2025 · Updated last year
- Dark Patterns in Chatbot Design ☆20 · Jun 15, 2024 · Updated 2 years ago