sachinhosmani / torchvista
Interactive PyTorch forward pass visualization in notebooks
☆602 Updated last week
Alternatives and similar repositories for torchvista
Users who are interested in torchvista are comparing it to the libraries listed below.
- Code release for DynamicTanh (DyT) ☆1,021 Updated 7 months ago
- This repository provides a Python script to fetch and summarize research papers from arXiv using the free Gemini API ☆246 Updated 7 months ago
- Official repository of my book "A Hands-On Guide to Fine-Tuning LLMs with PyTorch and Hugging Face" ☆540 Updated 3 weeks ago
- Interactively inspect module inputs, outputs, parameters, and gradients. ☆352 Updated 5 months ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting… ☆178 Updated 3 months ago
- Model Activity Visualiser ☆519 Updated 6 months ago
- Labs for MIT 6.S184/6.S975, IAP 2025 ☆245 Updated 5 months ago
- A comprehensive book on neural networks and large language models in NLP ☆403 Updated 2 weeks ago
- A straightforward method for training your LLM, from downloading data to generating text. ☆459 Updated 3 months ago
- [NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers ☆2,795 Updated 2 weeks ago
- Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper ☆775 Updated 2 months ago
- A minimal, easy-to-read PyTorch reimplementation of the Qwen3 and Qwen2.5 VL with a fancy CLI ☆179 Updated last month
- Muon is an optimizer for hidden layers in neural networks ☆1,960 Updated 3 months ago
- a way to SSH into Kaggle! ☆74 Updated last month
- Generate a comprehensive review from an arXiv paper, then turn it into a blog post. This project powers the website below for the Hugging… ☆802 Updated 8 months ago
- Train a Language Model with GRPO to create a schedule from a list of events and priorities ☆242 Updated 6 months ago
- Implementation of Stable Diffusion with PyTorch ☆353 Updated 8 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆493 Updated last month
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆308 Updated this week
- [NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425) ☆418 Updated last week
- A collection of tricks and tools to speed up transformer models ☆182 Updated last week
- From scratch implementation of a vision language model in pure PyTorch ☆246 Updated last year
- Machine Learning Q and AI book ☆676 Updated 2 months ago
- Automatic Video Generation from Scientific Papers ☆1,360 Updated 2 weeks ago
- Textbook on reinforcement learning from human feedback ☆1,279 Updated last week
- H-Net: Hierarchical Network with Dynamic Chunking ☆764 Updated last month
- When it comes to optimizers, it's always better to be safe than sorry ☆376 Updated last month
- Contains the public resources of Hands on GenAI book ☆202 Updated 9 months ago
- A summary of all lucidrains repositories and links to training / research approaches by LAION or other communities. ☆316 Updated 2 years ago
- A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba. ☆656 Updated 2 months ago