lewisjs4 / csce311Links
☆11 Updated 2 weeks ago
Alternatives and similar repositories for csce311
Users who are interested in csce311 are comparing it to the repositories listed below
- An approximate implementation of the OpenAI paper - An Empirical Model of Large-Batch Training for MNIST ☆11 Updated 3 years ago
- Training diffusion model with CIFAR10 dataset (insight from 13 papers) ☆15 Updated 6 months ago
- Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.) ☆47 Updated 2 years ago
- [NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu,… ☆18 Updated last year
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters ☆45 Updated 6 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆231 Updated 3 months ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆86 Updated last year
- [NeurIPS 2023] Structural Pruning for Diffusion Models ☆216 Updated last year
- CVPR2023: Vector Quantization with Self-Attention for Quality-Independent Representation Learning. ☆14 Updated last year
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆179 Updated 2 years ago
- ☆63 Updated last year
- List of AI Internships ☆131 Updated 2 years ago
- Minimal Mamba-2 implementation in PyTorch ☆243 Updated last year
- CKA (Centered Kernel Alignment) implemented in PyTorch ☆57 Updated last month
- Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch ☆67 Updated 2 years ago
- Reading list for research topics in state-space models ☆345 Updated 8 months ago
- [ICCV 2025] QuEST: Efficient Finetuning for Low-bit Diffusion Models ☆55 Updated 7 months ago
- ☆35 Updated last year
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆32 Updated 10 months ago
- [ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Di… ☆70 Updated last year
- ☆26 Updated last year
- Fine-tuning Vision Transformers on various classification datasets ☆114 Updated last year
- Curated list of methods that focuses on improving the efficiency of diffusion models ☆44 Updated last year
- Awesome-Low-Rank-Adaptation ☆128 Updated last year
- Learnable Semi-structured Sparsity for Vision Transformers and Diffusion Transformers ☆14 Updated last year
- [NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models ☆39 Updated last year
- A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http… ☆106 Updated 2 years ago
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di… ☆68 Updated last year
- Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence). ☆130 Updated last year
- ☆291 Updated last year