josehoras / Knowledge-Distillation
☆11 · Updated 5 years ago
Alternatives and similar repositories for Knowledge-Distillation
Users who are interested in Knowledge-Distillation are comparing it to the repositories listed below
- From scratch implementation of a vision language model in pure PyTorch ☆254 · Updated last year
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆179 · Updated 2 years ago
- LoRA and DoRA from Scratch Implementations ☆215 · Updated last year
- Distributed training (multi-node) of a Transformer model ☆93 · Updated last year
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023! ☆120 · Updated 2 years ago
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆122 · Updated 2 years ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting… ☆181 · Updated 6 months ago
- ☆132 · Updated 2 years ago
- ☆46 · Updated 8 months ago
- Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch. ☆48 · Updated 2 years ago
- Several types of attention modules written in PyTorch for learning purposes ☆53 · Updated last month
- Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch ☆67 · Updated 2 years ago
- [ICCV25] Official Implementation of LeGrad ☆87 · Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆206 · Updated 3 weeks ago
- PyTorch implementation of the xLSTM model by Beck et al. (2024) ☆181 · Updated last year
- ☆307 · Updated 9 months ago
- This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google. ☆192 · Updated 4 years ago
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆235 · Updated last year
- ☆50 · Updated last year
- Implementation of DoRA ☆306 · Updated last year
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence). ☆130 · Updated last year
- ☆48 · Updated 7 months ago
- Playground for Transformers ☆53 · Updated 2 years ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆213 · Updated last week
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆98 · Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆49 · Updated last year
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆77 · Updated 6 months ago
- ☆75 · Updated 9 months ago
- A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http… ☆106 · Updated 2 years ago
- Collection of tests performed during the study of the new Kolmogorov-Arnold Neural Networks (KAN) ☆41 · Updated 11 months ago