☆34Nov 30, 2025Updated 6 months ago
Alternatives and similar repositories for inductive-bias-probes
Users that are interested in inductive-bias-probes are comparing it to the libraries listed below.
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated 4 months ago
- ☆26Feb 20, 2026Updated 4 months ago
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size☆19May 19, 2019Updated 7 years ago
- Benchmarking Optimizers for LLM Pretraining☆60May 3, 2026Updated last month
- ☆16Mar 22, 2025Updated last year
- Code for "What really matters in matrix-whitening optimizers?"☆24Oct 31, 2025Updated 7 months ago
- ☆77Jun 9, 2026Updated 2 weeks ago
- ☆16Mar 3, 2023Updated 3 years ago
- ☆35Jul 5, 2023Updated 2 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation☆12May 24, 2021Updated 5 years ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively.☆74Apr 15, 2026Updated 2 months ago
- ☆61Sep 17, 2025Updated 9 months ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆11Dec 30, 2024Updated last year
- Pytorch routines for (Ker)nel (Mac)hines☆12Oct 10, 2025Updated 8 months ago
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"☆29Jun 4, 2024Updated 2 years ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Exploring the minimal architecture required for coherent English language generation.☆13Jun 11, 2026Updated 2 weeks ago
- Experiments on the impact of depth in transformers and SSMs.☆41Oct 23, 2025Updated 8 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation☆29Feb 25, 2025Updated last year
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning (ICML 2019)☆24May 30, 2019Updated 7 years ago
- A python package to design and debug RL agents.☆33Apr 2, 2026Updated 2 months ago
- A Jupyter-style custom node for executing Python code and plotting within ComfyUI workflows.☆41Mar 18, 2026Updated 3 months ago
- ☆84Aug 31, 2023Updated 2 years ago
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…☆88Jan 12, 2025Updated last year
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension☆20Feb 21, 2025Updated last year
- ☆17Feb 4, 2025Updated last year
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022☆11Sep 17, 2024Updated last year
- This is a repository for RM2021 Software tutorial☆11Nov 4, 2020Updated 5 years ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…☆12Mar 7, 2025Updated last year
- HyperPose☆13Nov 6, 2025Updated 7 months ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆18May 29, 2023Updated 3 years ago
- C++-Animation-(Standard-Template-Library)-Engine, or CASTLE for short, is a C++ plotting and animation engine created by BiliBili uploader …☆11Jan 17, 2021Updated 5 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated last year
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf☆21Jul 29, 2024Updated last year
- Design and analyze optimal deep learning models.☆31Aug 2, 2025Updated 10 months ago
- This repository contains example code to build models on TPUs☆30Feb 17, 2023Updated 3 years ago
- Computing the greatest common divisor with transformers, source code for the paper https://arxiv.org/abs/2308.15594☆14Aug 11, 2025Updated 10 months ago
- IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025☆30Oct 1, 2025Updated 8 months ago