allenbai01 / transformers-as-statisticians
☆35 · Jul 5, 2023 · Updated 2 years ago
Alternatives and similar repositories for transformers-as-statisticians
Users that are interested in transformers-as-statisticians are comparing it to the libraries listed below
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 9 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- ☆10Mar 6, 2022Updated 3 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- ☆11Mar 13, 2023Updated 2 years ago
- ☆240May 10, 2024Updated last year
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…☆12Mar 7, 2025Updated 11 months ago
- Implementation of Nonparametric Hamiltonian Monte Carlo☆13Feb 13, 2023Updated 3 years ago
- Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)☆13Jun 11, 2025Updated 8 months ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated last year
- csl: PyTorch-based Constrained Learning☆12Jun 1, 2022Updated 3 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021☆29Sep 25, 2021Updated 4 years ago
- Gradient-based Hyperparameter Optimization Over Long Horizons☆14Sep 29, 2021Updated 4 years ago
- ☆12Nov 3, 2021Updated 4 years ago
- Post-processing for fair classification☆16Jun 30, 2025Updated 7 months ago
- Code for the ICLR 2021 Paper "In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness"☆13Oct 23, 2021Updated 4 years ago
- Reproducible code for Augmentation paper☆17Jan 23, 2019Updated 7 years ago
- ☆29Nov 30, 2025Updated 2 months ago
- MDL Complexity computations and experiments from the paper "Revisiting complexity and the bias-variance tradeoff".☆18Jun 12, 2023Updated 2 years ago
- Variational Reinforcement Learning☆17Jul 25, 2024Updated last year
- Repository for reproducing `Model-Based Robust Deep Learning`☆16Jan 22, 2021Updated 5 years ago
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes"☆19Jun 10, 2023Updated 2 years ago
- Generalised UDRL☆37May 12, 2022Updated 3 years ago
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…☆105Nov 10, 2023Updated 2 years ago
- Blog post☆17Feb 16, 2024Updated last year
- Representation Learning in RL☆13Jun 1, 2022Updated 3 years ago
- The code to reproduce CVPR 2021 paper "Towards Robust Classification Model by Counterfactual and Invariant Data Generation"☆17Jul 29, 2021Updated 4 years ago
- A set of kernel-based (Un)conditional independence tests including SDCIT (Lee and Honavar, UAI 2017)☆17Feb 6, 2020Updated 6 years ago
- Code for paper: End-to-end Stochastic Optimization with Energy-based Model☆16Feb 14, 2023Updated 3 years ago
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning"☆20Oct 6, 2021Updated 4 years ago
- Implementation of paper "Probabilistic Active Meta-Learning" (NeurIPS 2020).☆20Dec 2, 2020Updated 5 years ago
- Generators for linear programming instances with controllable difficulty and solution properties.☆15Apr 26, 2021Updated 4 years ago
- ☆20Nov 4, 2025Updated 3 months ago
- Pytorch implementation of SuperPolyak subgradient method.☆43Nov 18, 2022Updated 3 years ago
- Experiments for distributed optimization algorithms☆85May 24, 2023Updated 2 years ago
- ☆44Jul 21, 2025Updated 6 months ago
- CUDA 12.2 HMM demos☆20Jul 26, 2024Updated last year
- ☆25Apr 18, 2025Updated 9 months ago
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size☆19May 19, 2019Updated 6 years ago