sachinhosmani / torchvista
Interactive PyTorch forward pass visualization in notebooks
☆586Updated this week
Alternatives and similar repositories for torchvista
Users who are interested in torchvista are comparing it to the libraries listed below.
- Code release for DynamicTanh (DyT)☆1,012Updated 5 months ago
- This repository provides a Python script to fetch and summarize research papers from arXiv using the free Gemini API☆242Updated 6 months ago
- Official repository of my book "A Hands-On Guide to Fine-Tuning LLMs with PyTorch and Hugging Face"☆509Updated last week
- Interactively inspect module inputs, outputs, parameters, and gradients.☆350Updated 4 months ago
- Implementation of Stable Diffusion with PyTorch☆350Updated 7 months ago
- Model Activity Visualiser☆520Updated 5 months ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆178Updated 2 months ago
- Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw☆541Updated 9 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python.☆297Updated last month
- Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch☆1,455Updated 3 months ago
- A comprehensive book on neural networks and large language models in NLP☆278Updated this week
- Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function: A Minmax Perspective" (NCFM) in C…☆382Updated 2 weeks ago
- Open-source Multi-agent Poster Generation from Papers☆2,578Updated 2 weeks ago
- Learning Deep Representations of Data Distributions☆289Updated this week
- Train a Language Model with GRPO to create a schedule from a list of events and priorities☆231Updated 4 months ago
- Implementation of all RL algorithms in a simpler way☆1,124Updated 3 weeks ago
- Muon is an optimizer for hidden layers in neural networks☆1,735Updated 2 months ago
- ☆367Updated 5 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation☆443Updated last month
- From scratch implementation of a vision language model in pure PyTorch☆239Updated last year
- Minimal and annotated implementations of key ideas from modern deep learning research.☆1,138Updated 2 months ago
- Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper☆744Updated last month
- Official implementation of TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)☆386Updated this week
- Building DeepSeek R1 from Scratch☆698Updated 6 months ago
- When it comes to optimizers, it's always better to be safe than sorry☆372Updated 3 weeks ago
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning☆3,277Updated 2 months ago
- A collection of tricks and tools to speed up transformer models☆178Updated 2 weeks ago
- Simple project page template for your research paper, built with Astro and Tailwind CSS☆385Updated 2 weeks ago
- Generate a comprehensive review from an arXiv paper, then turn it into a blog post. This project powers the website below for the Hugging…☆791Updated 7 months ago
- Static suckless single batch CUDA-only qwen3-0.6B mini inference engine☆468Updated last week