Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
☆11 · Jun 20, 2025 · Updated 9 months ago
Alternatives and similar repositories for sea-attention
Users that are interested in sea-attention are comparing it to the libraries listed below.
- [AAAI 2025] Official Implementation of "HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting" ☆16 · Feb 17, 2025 · Updated last year
- ☆25 · May 14, 2019 · Updated 6 years ago
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 7 months ago
- Here are some implementations of basic hardware units in RTL language (Verilog for now), which can be used for area/power evaluation and … ☆14 · Aug 25, 2023 · Updated 2 years ago
- ☆32 · Aug 24, 2022 · Updated 3 years ago
- Implementation of the paper "Direct Optimization through argmax for discrete Variational Auto-Encoder" ☆15 · Sep 15, 2020 · Updated 5 years ago
- Official Code Repository for the paper "KALA: Knowledge-Augmented Language Model Adaptation" (NAACL 2022) ☆35 · Oct 17, 2023 · Updated 2 years ago
- An intelligent matrix format designer for SpMV ☆10 · Oct 10, 2023 · Updated 2 years ago
- ☆13 · Jan 7, 2025 · Updated last year
- A Seq2Seq with attention and copy mechanism for sentence summarization ☆13 · Mar 11, 2019 · Updated 7 years ago
- An attempt to use financial news to predict the stock market ☆16 · Nov 17, 2018 · Updated 7 years ago
- ☆12 · Sep 25, 2024 · Updated last year
- Implementing Weight Agnostic Neural Networks to Spiking Neural Networks ☆10 · Jan 26, 2021 · Updated 5 years ago
- Frequently updated list of dLLM (Diffusion Large Language Models) papers, models, and other resources ☆24 · Jan 30, 2026 · Updated 2 months ago
- DOSA: Differentiable Model-Based One-Loop Search for DNN Accelerators ☆19 · Oct 10, 2024 · Updated last year
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 · Mar 5, 2024 · Updated 2 years ago
- Fast Semantic Feature Extraction using Superpixels for Soft Segmentation (CVIP-2019) ☆12 · Jan 3, 2020 · Updated 6 years ago
- ☆16 · Nov 22, 2022 · Updated 3 years ago
- ☆12 · Nov 25, 2018 · Updated 7 years ago
- Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021) ☆64 · Aug 5, 2024 · Updated last year
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆150 · Nov 3, 2025 · Updated 4 months ago
- Keyformer proposes KV Cache reduction through key tokens identification and without the need for fine-tuning ☆57 · Mar 26, 2024 · Updated 2 years ago
- [ICLR 2023] RC-MAE ☆52 · Dec 18, 2023 · Updated 2 years ago
- ☆43 · Feb 21, 2022 · Updated 4 years ago
- Code base of In-Context Learning for Dialogue State tracking ☆45 · Sep 24, 2023 · Updated 2 years ago
- ☆80 · Dec 27, 2024 · Updated last year
- A simple behavior that can be attached to a Page to display a custom TitleBar with a Full Screen Mode toggle. UWP only. ☆12 · Aug 5, 2015 · Updated 10 years ago
- Residual vector quantization for KV cache compression in large language models ☆12 · Oct 22, 2024 · Updated last year
- This is a PyTorch implementation of the paper: "Processing Megapixel Images with Deep Attention-Sampling Models". ☆41 · Oct 3, 2023 · Updated 2 years ago
- PyTorch implementation for "Mutual Information Neural Estimation" ☆11 · Dec 13, 2019 · Updated 6 years ago
- Multimodal Hashtag Prediction with Instagram data & PyTorch (2nd Place on OpenResource Hackathon 2019) ☆47 · Jun 12, 2023 · Updated 2 years ago
- ☆15 · Jan 12, 2026 · Updated 2 months ago
- A tutorial and example of Rust for C++ programmers ☆17 · Sep 21, 2021 · Updated 4 years ago
- ☆18 · Oct 12, 2022 · Updated 3 years ago
- Tactical Observation of RF GNSS Interference ☆14 · Jun 25, 2020 · Updated 5 years ago
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models ☆13 · Mar 6, 2025 · Updated last year
- ☆16 · Dec 11, 2022 · Updated 3 years ago
- ☆10 · Nov 21, 2023 · Updated 2 years ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆44 · Nov 24, 2024 · Updated last year