Distributed training (multi-node) of a Transformer model
☆96 · Apr 10, 2024 · Updated 2 years ago
Alternatives and similar repositories for pytorch-transformer-distributed
Users that are interested in pytorch-transformer-distributed are comparing it to the libraries listed below.
- Notes on Direct Preference Optimization ☆25 · Apr 14, 2024 · Updated 2 years ago
- Notes on the Mistral AI model ☆20 · Dec 27, 2023 · Updated 2 years ago
- Notes and commented code for RLHF (PPO) ☆129 · Feb 27, 2024 · Updated 2 years ago
- ML algorithms implementations that are good for learning the underlying principles ☆28 · Dec 7, 2024 · Updated last year
- ☆15 · Feb 23, 2025 · Updated last year
- Anything I read, whether it's a paper, a book, or an article, I'll post here. ☆11 · Feb 13, 2025 · Updated last year
- Minimal fastai code needed for working with pytorch ☆15 · Aug 25, 2021 · Updated 4 years ago
- Attention is all you need implementation ☆1,199 · Jun 8, 2024 · Updated last year
- This repository contains comprehensive pricing and configuration data for LLMs. It powers cost attribution for 200+ enterprises running 4… ☆72 · Apr 12, 2026 · Updated last week
- A walkthrough of essential DVC features (including tutorial text as well as a working environment). ☆17 · Nov 22, 2022 · Updated 3 years ago
- ☆19 · Sep 9, 2024 · Updated last year
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models ☆41 · Jan 5, 2026 · Updated 3 months ago
- image retrieval/tagging with CLIP ☆13 · Jul 13, 2024 · Updated last year
- LLaMA 2 implemented from scratch in PyTorch ☆369 · Sep 25, 2023 · Updated 2 years ago
- [NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning ☆53 · Oct 23, 2025 · Updated 5 months ago
- Code, documentation, and tutorials for the DGD model trained on bulk RNA-Seq data. ☆13 · Aug 27, 2024 · Updated last year
- ☆29 · Oct 2, 2025 · Updated 6 months ago
- Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw ☆602 · Dec 6, 2024 · Updated last year
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement) ☆50 · Dec 15, 2023 · Updated 2 years ago
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆181 · Jan 7, 2024 · Updated 2 years ago
- ☆27 · Jun 6, 2024 · Updated last year
- Chinese edition of the Llama3 open-source model: comprehensive evaluation based on the SuperCLUE benchmark | Llama3 Chinese Evaluation with SuperCLUE ☆16 · Apr 21, 2024 · Updated last year
- ☆14 · Mar 9, 2023 · Updated 3 years ago
- JapaneseArabic Dictionary (日本語・アラビア語辞書) قاموس اللغة اليابانية والعربية (Yomitan) ☆19 · May 20, 2025 · Updated 10 months ago
- An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library. ☆14 · Jan 9, 2024 · Updated 2 years ago
- A comprehensive Model Context Protocol (MCP) server providing advanced access to the UniProt protein database. ☆19 · Dec 21, 2025 · Updated 3 months ago
- Multi-agent system for booking appointments and generating PDF invoices ☆13 · Jul 16, 2025 · Updated 9 months ago
- A Catalog lists instruction sets, models available for Indic language ☆10 · Mar 14, 2024 · Updated 2 years ago
- ☆12 · Mar 28, 2023 · Updated 3 years ago
- minimal diffusion transformer in pytorch. ☆17 · Oct 6, 2024 · Updated last year
- ☆39 · Apr 5, 2024 · Updated 2 years ago
- ReMe: A Personalized Cognitive Training Framework Based on an LLM Voice Chatbot for Research ☆18 · Jul 3, 2025 · Updated 9 months ago
- Pretraining summarization models using a corpus of nonsense ☆13 · Sep 28, 2021 · Updated 4 years ago
- ☆10 · Jan 28, 2024 · Updated 2 years ago
- ☆11 · Mar 5, 2025 · Updated last year
- Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired… ☆10 · Apr 14, 2023 · Updated 3 years ago
- ☆15 · Apr 1, 2024 · Updated 2 years ago
- Let ChatGPT answer your Gmail for you ☆15 · Feb 12, 2024 · Updated 2 years ago
- A personal AI therapist to help you with your mental health ☆24 · Nov 29, 2025 · Updated 4 months ago