Adamdad / rational_kat_cu
☆42Updated last month
Alternatives and similar repositories for rational_kat_cu:
Users who are interested in rational_kat_cu are comparing it to the libraries listed below
- State Space Models☆64Updated 8 months ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆118Updated 5 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆60Updated 3 weeks ago
- A More Fair and Comprehensive Comparison between KAN and MLP☆155Updated 5 months ago
- A repository for DenseSSMs☆87Updated 9 months ago
- Awesome list of papers that extend Mamba to various applications.☆129Updated 3 weeks ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆209Updated 7 months ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆43Updated last month
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆38Updated 8 months ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision☆68Updated 7 months ago
- More dimensions = More fun☆21Updated 5 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆52Updated 4 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆33Updated 3 months ago
- ☆137Updated 4 months ago
- ☆45Updated 9 months ago
- Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule☆73Updated 2 weeks ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…☆55Updated 7 months ago
- An efficient pytorch implementation of selective scan in one file, works with both cpu and gpu, with corresponding mathematical derivatio…☆78Updated 10 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆51Updated 2 months ago
- [ICLR 2024] Improving Convergence and Generalization Using Parameter Symmetries☆29Updated 7 months ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts"☆49Updated last year
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch☆30Updated 2 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…☆60Updated 9 months ago
- [ECCV 2024] Official pytorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts"☆34Updated 6 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆92Updated 4 months ago
- Official Implementation Of The Paper: "DeciMamba: Exploring the Length Extrapolation Potential of Mamba"☆22Updated 5 months ago
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023.☆22Updated last year
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆35Updated 7 months ago
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).☆36Updated 3 months ago
- ☆36Updated 8 months ago