AlgonetLabs / Cable
Context-aware Biases for Length Extrapolation
☆22 · Updated 7 months ago
Alternatives and similar repositories for Cable
Users that are interested in Cable are comparing it to the libraries listed below
- A Comprehensive Survey on Knowledge Distillation ☆57 · Updated 2 weeks ago
- KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation ☆19 · Updated last year
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆69 · Updated 6 months ago
- A curated list of Computer Vision related conferences with dates and paper registration deadlines. ☆46 · Updated 2 months ago
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆178 · Updated 2 years ago
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning ☆234 · Updated last year
- Awesome-Low-Rank-Adaptation ☆126 · Updated last year
- Official PyTorch implementation of Agglomerative Token Clustering presented at ECCV 2024 ☆19 · Updated last year
- [NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models. ☆13 · Updated 6 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆232 · Updated 2 months ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆85 · Updated last year
- ☆11 · Updated 2 years ago
- Official repository of our work "Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning" accepted at CVPR 20… ☆26 · Updated 10 months ago
- ☆13 · Updated 11 months ago
- [NeurIPS 2025 Spotlight] SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering [Pytorch repository] ☆36 · Updated this week
- The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" ☆52 · Updated last year
- [CVPR 2025] "Towards Universal Soccer Video Understanding". ☆206 · Updated 4 months ago
- X-VARS is a multi-modal large language model designed for understanding football videos from the point of view of a referee. X-VARS can p… ☆23 · Updated last year
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆111 · Updated last month
- Transformer based Decision Networks for MOT ☆24 · Updated last year
- ☆152 · Updated last year
- [ICML 2024] VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling ☆10 · Updated last year
- Task Singular Vectors: Reducing Task Interference in Model Merging. Merge models avoiding task interference through separable models. ☆45 · Updated 3 weeks ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆135 · Updated this week
- Best Papers of Top Venues like CVPR, NeurIPS, ICLR, ICML, ICCV, ECCV, ... ☆266 · Updated 3 weeks ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆98 · Updated last year
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model… ☆15 · Updated 2 years ago
- [ICCV23] Robust Mixture-of-Expert Training for Convolutional Neural Networks by Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Hua… ☆66 · Updated 2 years ago
- Awesome Low-Rank Adaptation ☆59 · Updated 5 months ago
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation ☆90 · Updated last year