Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics
☆72 · Mar 26, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for EFLA
Users that are interested in EFLA are comparing it to the libraries listed below.
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆16 · Jan 7, 2025 · Updated last year
- Combining SOAP and MUON ☆19 · Feb 11, 2025 · Updated last year
- ☆17 · Dec 19, 2024 · Updated last year
- ☆48 · Jun 16, 2025 · Updated 9 months ago
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- Parallel Associative Scan for Language Models ☆18 · Jan 8, 2024 · Updated 2 years ago
- ☆19 · Dec 4, 2025 · Updated 4 months ago
- Official implementation of Browse-Master, a tool-augmented web-search agent. ☆28 · Aug 22, 2025 · Updated 7 months ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated 2 years ago
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆150 · Feb 25, 2026 · Updated last month
- ☆39 · Apr 27, 2024 · Updated last year
- Efficient PScan implementation in PyTorch ☆17 · Jan 2, 2024 · Updated 2 years ago
- ☆15 · Mar 2, 2025 · Updated last year
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention" ☆12 · May 24, 2023 · Updated 2 years ago
- Domain Adaptation and Adapters ☆16 · Feb 28, 2023 · Updated 3 years ago
- A Wikipedia-based summarization dataset ☆14 · Mar 27, 2023 · Updated 3 years ago
- Continuous batching and parallel acceleration for RWKV6 ☆22 · Jun 28, 2024 · Updated last year
- [AAAI 2026] Official repository of Circulant Attention ☆46 · Jan 12, 2026 · Updated 3 months ago
- RADLADS training code ☆39 · May 7, 2025 · Updated 11 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- Our EMNLP 2022 paper on VIP-Based Prompting for Parameter-Efficient Learning ☆10 · Oct 22, 2022 · Updated 3 years ago
- Code for the paper "Function-Space Learning Rates" ☆25 · Jun 3, 2025 · Updated 10 months ago
- Minimal, standalone, fast Python OpenEXR reader for single-part, uncompressed scan-line files as produced by Blender. ☆18 · Jun 14, 2022 · Updated 3 years ago
- A ComfyUI and ComfyScript Gradio-based app for generating characters using a multi-step process. ☆19 · Nov 5, 2025 · Updated 5 months ago
- ☆32 · Apr 12, 2024 · Updated 2 years ago
- Code for "Distribution-based Emotion Recognition in Conversation" ☆19 · Feb 6, 2023 · Updated 3 years ago
- Official codes for COLING 2024 paper "Robust and Scalable Model Editing for Large Language Models": https://arxiv.org/abs/2403.17431v1 ☆14 · Mar 27, 2024 · Updated 2 years ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14 · Jan 4, 2024 · Updated 2 years ago
- Event Relation in Text-to-Audio (TTA) Generation ☆20 · Feb 26, 2025 · Updated last year
- ☆20 · May 30, 2024 · Updated last year
- ☆43 · Sep 15, 2025 · Updated 6 months ago
- olive-cli: a minimal llm-based operating system for engineers packaged as a terminal app. ☆20 · Jun 13, 2025 · Updated 9 months ago
- This repository contains the dataset and baselines explained in the paper: M2H2: A Multimodal Multiparty Hindi Dataset For HumorRecogniti… ☆19 · Mar 14, 2023 · Updated 3 years ago
- One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Model ☆29 · Mar 15, 2026 · Updated 3 weeks ago
- Mamba support for transformer lens ☆19 · Sep 17, 2024 · Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 · Dec 4, 2024 · Updated last year
- ☆11 · Jul 21, 2024 · Updated last year
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- SimKO: Simple Pass@K Policy Optimization ☆28 · Oct 24, 2025 · Updated 5 months ago