josehoras / Knowledge-DistillationLinks

☆11

Alternatives and similar repositories for Knowledge-Distillation

Users that are interested in Knowledge-Distillation are comparing it to the libraries listed below

Sorting:

rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆214Updated last year
AviSoori1x / seemore
From scratch implementation of a vision language model in pure PyTorch
☆250Updated last year
hkproj / mamba-notes
Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
☆173Updated last year
hkproj / quantization-notes
Notes on quantization in neural networks
☆105Updated last year
hkproj / pytorch-lora
LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch
☆117Updated 2 years ago
jacobmarks / awesome-neurips-2023
Conference schedule, top papers, and analysis of the data for NeurIPS 2023!
☆121Updated last year
shreydan / VisionGPT2
Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.
☆46Updated 2 years ago
kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆195Updated 3 weeks ago
rasbt / cvpr2023
☆134Updated 2 years ago
catid / dora
Implementation of DoRA
☆307Updated last year
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆86Updated last year
knotgrass / attention
several types of attention modules written in PyTorch for learning purposes
☆52Updated last year
Jaykef / ai-algorithms
First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…
☆179Updated 4 months ago
huggingface / transformers-research-projects
Research projects built on top of Transformers
☆100Updated 8 months ago
melisa-writer / short-transformers
Prune transformer layers
☆74Updated last year
PeaBrane / mamba-tiny
Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence).
☆127Updated last year
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48Updated last year
hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆84Updated last year
tintn / vision-transformer-from-scratch
A Simplified PyTorch Implementation of Vision Transformer (ViT)
☆220Updated last year
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113Updated 10 months ago
fkodom / yet-another-retnet
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http…
☆106Updated last year
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆210Updated last month
kyegomez / Python-Package-Template
A easy, reliable, fluid template for python packages complete with docs, testing suites, readme's, github workflows, linting and much muc…
☆194Updated last month
lucidrains / pytorch-custom-utils
Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new…
☆123Updated last year
WalBouss / LeGrad
[ICCV25] Official Implementation of LeGrad
☆82Updated last year
lucidrains / agent-attention-pytorch
Implementation of Agent Attention in Pytorch
☆92Updated last year
ariG23498 / quantized-diffusion-inference
Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs
☆38Updated last year
fangyuan-ksgk / Mini-LLaVA
A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.
☆96Updated 11 months ago
hkproj / multi-latent-attention
☆45Updated 5 months ago
fkodom / grouped-query-attention-pytorch
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from …
☆182Updated last year