google-research / causallm_icl
☆10Updated 8 months ago
Related projects:
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Updated last year
- ☆12Updated 9 months ago
- Minimum Description Length probing for neural network representations☆15Updated 11 months ago
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆10Updated 8 months ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…☆18Updated 10 months ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆15Updated 10 months ago
- ☆24Updated last year
- Official code for the paper: "Metadata Archaeology"☆18Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases"☆15Updated 2 years ago
- ☆30Updated 8 months ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion, from our EMNLP 2023 paper - Accelerating Toeplitz…☆14Updated 11 months ago
- Repository for Skill Set Optimization☆12Updated last month
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆10Updated 3 weeks ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆34Updated 10 months ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆17Updated 10 months ago
- Official repository for the paper "Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules" (ICLR 2023)☆12Updated last year
- ☆25Updated 9 months ago
- Project for SNARE benchmark☆10Updated 3 months ago
- Official code for the paper "Attention as a Hypernetwork"☆20Updated 2 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023☆18Updated last year
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]☆13Updated last year
- ☆44Updated 11 months ago
- ☆19Updated last month
- Source code for a LoRA-based continual relation extraction method.☆10Updated 11 months ago
- ☆14Updated this week
- ☆19Updated last year
- ☆12Updated 8 months ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling☆34Updated 9 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆24Updated 5 months ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆17Updated this week