josehoras / Knowledge-Distillation
☆11Updated 5 years ago
Alternatives and similar repositories for Knowledge-Distillation
Users who are interested in Knowledge-Distillation are comparing it to the libraries listed below
- LoRA and DoRA from Scratch Implementations☆215Updated last year
- ☆133Updated 2 years ago
- From scratch implementation of a vision language model in pure PyTorch☆252Updated last year
- Distributed training (multi-node) of a Transformer model☆90Updated last year
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023!☆120Updated 2 years ago
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆119Updated 2 years ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆181Updated 5 months ago
- several types of attention modules written in PyTorch for learning purposes☆52Updated last week
- Notes on quantization in neural networks☆114Updated 2 years ago
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)☆178Updated 2 years ago
- Notebooks for fine-tuning PaliGemma☆117Updated 8 months ago
- Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.☆48Updated 2 years ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT)☆227Updated last year
- Just some miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new…☆126Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆49Updated last year
- Naively combining transformers and Kolmogorov-Arnold Networks to learn and experiment☆37Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆203Updated last week
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆97Updated last year
- Making of a CUDA kernel☆17Updated 7 months ago
- Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch☆67Updated 2 years ago
- [ICCV25] Official Implementation of LeGrad☆86Updated last year
- ☆304Updated 8 months ago
- Basic implementation of ResNet 50, 101, 152 in PyTorch☆124Updated 3 years ago
- Contrastive Reinforcement Learning☆55Updated last week
- Documentation, notes, links, etc for streams.☆84Updated last year
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆118Updated 2 months ago
- Implementation of DoRA☆307Updated last year
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆56Updated 2 months ago
- Awesome list of papers that extend Mamba to various applications.☆139Updated 6 months ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆77Updated 5 months ago