NanoTorch is a deep learning library built from scratch using NumPy and math.
☆21 · Jul 8, 2024 · Updated last year
Alternatives and similar repositories for NanoTorch
Users who are interested in NanoTorch are comparing it to the libraries listed below.
- Source code to accompany research paper on training multi token prediction language models using self-distillation. ☆24 · Feb 21, 2026 · Updated last week
- Materials and Information for Ethics of Artificial Intelligence and Machine Learning ☆11 · Sep 29, 2023 · Updated 2 years ago
- brute but stronger ☆11 · Aug 4, 2022 · Updated 3 years ago
- ☆18 · Oct 12, 2022 · Updated 3 years ago
- ☆31 · Feb 26, 2026 · Updated last week
- Contains the notebooks that I created for implementing probabilistic matrix factorization in PyTorch for a music recommendation engine. ☆17 · Mar 21, 2019 · Updated 6 years ago
- Custom Triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- Algorithms for approximate attention in LLMs ☆21 · Apr 14, 2025 · Updated 10 months ago
- Python experiment management toolset ☆16 · Sep 23, 2019 · Updated 6 years ago
- Official repository for Beyond Binary Rewards: Training LMs to Reason about Their Uncertainty ☆54 · Aug 20, 2025 · Updated 6 months ago
- Save, load, host, and share AI model checkpoints without slowing down training. Host on Lightning AI or your own cloud with enterprise-gr… ☆42 · Feb 3, 2026 · Updated last month
- Adversarial attacks in consensus-based multi-agent reinforcement learning ☆25 · Feb 1, 2023 · Updated 3 years ago
- More about the exploration-exploitation tradeoff with harder bandits ☆24 · May 12, 2019 · Updated 6 years ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) ☆33 · Sep 28, 2025 · Updated 5 months ago
- ☆33 · Nov 27, 2023 · Updated 2 years ago
- ☆28 · Sep 13, 2021 · Updated 4 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆39 · Mar 2, 2023 · Updated 3 years ago
- Code for the paper "Few Shot Network Compression via Cross Distillation", AAAI 2020. ☆31 · Jan 31, 2020 · Updated 6 years ago
- Training vision models with full-batch gradient descent and regularization ☆39 · Feb 14, 2023 · Updated 3 years ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective ☆40 · Oct 17, 2023 · Updated 2 years ago
- Official implementation of GOAT model (ICML2023) ☆38 · Jul 3, 2023 · Updated 2 years ago
- A Matlab library for solving optimization problems with forward-backward splitting ☆39 · Dec 25, 2018 · Updated 7 years ago
- ☆41 · Jan 3, 2025 · Updated last year
- ☆76 · Nov 22, 2025 · Updated 3 months ago
- ☆40 · Jul 17, 2022 · Updated 3 years ago
- KitanaQA: Adversarial training and data augmentation for neural question-answering models ☆56 · Jul 23, 2023 · Updated 2 years ago
- What do we learn from inverting CLIP models? ☆58 · Mar 6, 2024 · Updated last year
- Code for the paper "Understanding Generalization through Visualizations" ☆65 · Jan 15, 2021 · Updated 5 years ago
- Configuration with Dataclasses+YAML+Argparse. Fork of Pyrallis ☆79 · Feb 12, 2026 · Updated 3 weeks ago
- A fusion of a linear layer and a cross entropy loss, written for PyTorch in Triton. ☆75 · Aug 2, 2024 · Updated last year
- Implementation of Double DQN reinforcement learning for OpenAI Gym environments with PyTorch. ☆71 · May 30, 2025 · Updated 9 months ago
- ☆77 · Dec 19, 2024 · Updated last year
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆152 · Sep 19, 2025 · Updated 5 months ago
- Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024) ☆78 · Apr 3, 2024 · Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆94 · Nov 17, 2024 · Updated last year
- federated-learning ☆86 · Jan 10, 2023 · Updated 3 years ago
- A PyTorch implementation of the multi-agent deep deterministic policy gradients (MADDPG) algorithm ☆380 · Apr 8, 2021 · Updated 4 years ago
- We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts… ☆95 · Jul 25, 2024 · Updated last year
- Code for ACL 2022 paper "BERT Learns to Teach: Knowledge Distillation with Meta Learning". ☆87 · Aug 4, 2022 · Updated 3 years ago