Zhang-Yihao / Transfomer2DFA
Implementation for the paper "Automata Extraction from Transformers".
☆11 Updated last year
Alternatives and similar repositories for Transfomer2DFA
Users who are interested in Transfomer2DFA are comparing it to the repositories listed below.
- This is the official code implementation of Bongard-OpenWorld (ICLR 2024). ☆14 Updated 11 months ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 Updated last year
- Just a repository that will house some MLPs and their variants, so to avoid having to reimplement them again and again for different proj… ☆40 Updated 2 weeks ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆22 Updated 2 years ago
- [CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model ☆24 Updated last year
- Titans - Learning to Memorize at Test Time ☆53 Updated 11 months ago
- ☆10 Updated last year
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆14 Updated last year
- Revisiting Hierarchical Text Classification: Inference and Metrics ☆15 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆46 Updated last year
- ☆13 Updated last year
- implementation of dualformer ☆24 Updated 9 months ago
- Code and data for paper "(How) do Language Models Track State?" ☆21 Updated 8 months ago
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers" ☆23 Updated 2 years ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models. ☆19 Updated 11 months ago
- Implementation and explorations into Blackbox Gradient Sensing (BGS), an evolutionary strategies approach proposed in a Google Deepmind p… ☆20 Updated 5 months ago
- ☆21 Updated last year
- Applies ROME and MEMIT on Mamba-S4 models ☆14 Updated last year
- [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting ☆67 Updated 6 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 7 months ago
- ☆13 Updated last year
- [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆41 Updated 5 months ago
- Hierarchical State Space Models ☆48 Updated last year
- ☆13 Updated 10 months ago
- RS-IMLE ☆43 Updated last year
- The official repo of continuous speculative decoding ☆31 Updated 8 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆17 Updated 9 months ago
- Resa: Transparent Reasoning Models via SAEs ☆46 Updated 3 months ago
- ☆25 Updated 6 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆56 Updated last year