bergen / EdgeTransformer
☆21 Updated 2 years ago
Related projects:
- Implementation of ICML 22 Paper: Scaling Structured Inference with Randomization ☆14 Updated 2 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆25 Updated 2 years ago
- Code for Residual Energy-Based Models for Text Generation in PyTorch. ☆22 Updated 3 years ago
- ☆15 Updated this week
- ☆12 Updated 4 months ago
- Code to reproduce the results for Compositional Attention ☆60 Updated last year
- ☆32 Updated 3 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆32 Updated 2 years ago
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆43 Updated last year
- ☆25 Updated 9 months ago
- ☆40 Updated 2 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆15 Updated 10 months ago
- ☆48 Updated last year
- This repository contains some of the code used in the paper "Training Language Models with Language Feedback at Scale" ☆26 Updated last year
- ☆22 Updated 2 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆68 Updated last year
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability ☆21 Updated 2 years ago
- [ICML 2022] Latent Diffusion Energy-Based Model for Interpretable Text Modeling ☆63 Updated 2 years ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 Updated 6 years ago
- ☆18 Updated 3 months ago
- ☆16 Updated 3 years ago
- ☆49 Updated 2 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆17 Updated this week
- ☆44 Updated 11 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆18 Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆24 Updated 5 months ago
- ☆49 Updated 3 years ago
- ☆30 Updated 8 months ago
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ☆21 Updated last year
- Gradient Estimation with Discrete Stein Operators (NeurIPS 2022) ☆17 Updated 10 months ago