This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog post.
☆93Jul 14, 2023Updated 2 years ago
Alternatives and similar repositories for pytorch-memory-optim
Users that are interested in pytorch-memory-optim are comparing it to the libraries listed below.
- Gzip and nearest neighbors for text classification☆57Aug 1, 2023Updated 2 years ago
- ☆17Jun 19, 2023Updated 2 years ago
- RAPIDS Deployment Documentation☆15May 13, 2026Updated last week
- ☆10Nov 6, 2024Updated last year
- ☆130Oct 25, 2023Updated 2 years ago
- My study notes and hands-on projects for CUDA-based GPU programming☆12Dec 11, 2025Updated 5 months ago
- Plan✕ is a platform for creating and publishing digital planning services☆18Updated this week
- Distilling key points, reorganizing, and modestly augmenting the points from books and lectures.☆12Updated this week
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation.☆26Jan 31, 2022Updated 4 years ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models☆24Oct 5, 2024Updated last year
- AI-powered browser extension to chat with any webpage☆11Aug 12, 2025Updated 9 months ago
- ☆14Nov 21, 2025Updated 6 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆37Jul 6, 2023Updated 2 years ago
- Python interface for evaluation on ChatGPT models☆19Feb 13, 2024Updated 2 years ago
- Various R lang related material for teaching.☆24Oct 16, 2020Updated 5 years ago
- ☆53Jul 18, 2024Updated last year
- a version of baby agi using dspy and typed predictors☆16Mar 9, 2024Updated 2 years ago
- ☆249Nov 24, 2025Updated 5 months ago
- Discover, analyze and present data from the web and mobile in meaningful ways☆83Jul 16, 2013Updated 12 years ago
- Source code for COLING 2022 paper "Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models"☆24Sep 21, 2022Updated 3 years ago
- ☆28Apr 26, 2023Updated 3 years ago
- A tiny server to run local inference on MLX model in the style of OpenAI☆13Jan 31, 2024Updated 2 years ago
- Finetuning BLOOM on a single GPU using gradient-accumulation☆32Mar 29, 2023Updated 3 years ago
- ☆11Aug 22, 2023Updated 2 years ago
- Identify the unused properties in your CSS☆15Jan 5, 2023Updated 3 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆33May 1, 2025Updated last year
- ☆13Dec 3, 2021Updated 4 years ago
- ☆42Mar 28, 2024Updated 2 years ago
- An implementation of several unsupervised object discovery models (Slot Attention, SLATE, GNM) in PyTorch with pre-trained models.☆15May 26, 2025Updated 11 months ago
- Object-Centric-Representation Library (OCRL): This repo is to explore OCR on various downstream tasks from supervised learning tasks to R…☆12Feb 23, 2024Updated 2 years ago
- Machine learning for molecular dynamics☆13Jan 9, 2025Updated last year
- Testing paligemma2 finetuning on reasoning dataset☆18Dec 28, 2024Updated last year
- Handy list of network visualisation libraries for R☆12Nov 11, 2019Updated 6 years ago
- ☆75Apr 18, 2026Updated last month
- ☆15Mar 11, 2021Updated 5 years ago
- Advanced Analytics data collection for M365 usage☆24May 11, 2026Updated last week
- All my experiments with the various transformers and various transformer frameworks available☆14Apr 30, 2021Updated 5 years ago
- Implementation from scratch in CUDA C++ of image processing algorithms.☆22Oct 26, 2020Updated 5 years ago
- FIWARE 401: IDM - Managing Users and Organizations☆10May 1, 2026Updated 2 weeks ago