FranxYao / Retrieval-Head-with-Flash-Attention
Efficient retrieval head analysis with Triton flash attention that supports top-k probability
☆13 · Jun 15, 2024 · Updated last year
Alternatives and similar repositories for Retrieval-Head-with-Flash-Attention
Users that are interested in Retrieval-Head-with-Flash-Attention are comparing it to the libraries listed below
- This repo demonstrates the concept of lossless compression with Transformers as encoder and decoder. ☆14 · May 2, 2024 · Updated last year
- ☆11 · Jun 15, 2019 · Updated 6 years ago
- ☆12 · Jun 30, 2024 · Updated last year
- ☆14 · Nov 29, 2023 · Updated 2 years ago
- The code for the paper *The Sensitivity of Counterfactual Fairness to Unmeasured Confounding* @ UAI 2019 ☆14 · Apr 4, 2020 · Updated 5 years ago
- References for Papers at the Intersection of Causality and Fairness ☆18 · Dec 3, 2018 · Updated 7 years ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆24 · Sep 26, 2024 · Updated last year
- [ICLR2025] Code and data for paper: Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasonin… ☆40 · Mar 10, 2025 · Updated 11 months ago
- Variational Graph Convolutional Networks ☆23 · Oct 16, 2020 · Updated 5 years ago
- [WWW2022] Geometric Graph Representation Learning via Maximizing Rate Reduction ☆26 · May 27, 2022 · Updated 3 years ago
- Code for Graph Neural Networks Exponentially Lose Expressive Power for Node Classification. ☆30 · May 20, 2020 · Updated 5 years ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism ☆30 · Jul 17, 2024 · Updated last year
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 · Apr 8, 2024 · Updated last year
- ☆39 · May 2, 2024 · Updated last year
- ☆12 · Nov 7, 2025 · Updated 3 months ago
- PyTorch implementation of differentiable group normalization (NeurIPS 2020) ☆38 · Oct 21, 2020 · Updated 5 years ago
- Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoen… ☆21 · Jun 23, 2014 · Updated 11 years ago
- Train I3D on NTU-RGB+D dataset in Keras ☆12 · Feb 5, 2019 · Updated 7 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 · Mar 2, 2025 · Updated 11 months ago
- ☆16 · Jan 30, 2026 · Updated 2 weeks ago
- Simple implementation of Expected Gradients and Integrated Gradients in PyTorch ☆12 · May 11, 2022 · Updated 3 years ago
- ☆10 · Feb 17, 2019 · Updated 7 years ago
- Tracking Of Agent (actions and belief) and Spatio-TEmporal Reasoning ☆14 · Feb 7, 2020 · Updated 6 years ago
- Active Learning with Partial Feedback, ICLR 2019 ☆11 · Apr 27, 2020 · Updated 5 years ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆53 · Aug 9, 2025 · Updated 6 months ago
- Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting ☆12 · Mar 24, 2023 · Updated 2 years ago
- ACL24 ☆11 · Jun 7, 2024 · Updated last year
- Code for Conformal Counterfactual Inference under Hidden Confounding (KDD’24) ☆11 · Aug 30, 2024 · Updated last year
- [ICML'25] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents ☆20 · Jul 31, 2025 · Updated 6 months ago
- [ICML 2022] "Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness" by Tianlong Chen*, Huan Zhang*, Zhenyu Zhang, Shiyu… ☆17 · Jun 22, 2022 · Updated 3 years ago
- Enhances Overleaf by allowing article searches and BibTeX retrieval from DBLP and Google Scholar … ☆45 · Apr 14, 2025 · Updated 10 months ago
- ☆10 · May 1, 2023 · Updated 2 years ago
- This repository contains several automated feature selection methods for CTR prediction. ☆10 · Dec 18, 2022 · Updated 3 years ago
- Deep Counterfactual Prediction with Categorical Backward Variables ☆12 · Feb 8, 2023 · Updated 3 years ago
- ☆11 · Oct 29, 2022 · Updated 3 years ago
- Codebase accompanying the paper 'Widening the Representation Bottleneck in Neural Machine Translation with Lexical Shortcuts', (Emelin, D… ☆11 · Feb 14, 2023 · Updated 3 years ago
- Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoen… ☆10 · Jul 10, 2013 · Updated 12 years ago
- LaunchPad is a lightweight Slurm job launcher designed for hyper-parameter search. ☆11 · Aug 2, 2024 · Updated last year
- ☆14 · Dec 25, 2024 · Updated last year