Research code for various data-parallel pretraining approaches, based on PyTorch GPT-2.
☆11Dec 16, 2022Updated 3 years ago
Alternatives and similar repositories for DistributedTrainingGPT2
Users who are interested in DistributedTrainingGPT2 are comparing it to the libraries listed below.
- Optimizing loading training data from cloud bucket storage for cloud-based distributed deep learning. Official repository for Quantifying…☆11Jan 1, 2022Updated 4 years ago
- Compiled from Lean's Lede source. Uses Flippy's OpenWrt packaging source, mainly to build OpenWrt firmware for the Phicomm N1 and Amlogic S905x3, as well as CR660X firmware.☆12Oct 4, 2025Updated 8 months ago
- d3 plugin for web interfaces☆14Jul 2, 2020Updated 5 years ago
- ☆10Oct 8, 2021Updated 4 years ago
- A very hacky set of functions for getting plotly to do what I want when doing mech interp research, designed to be compatible with PyTorc…☆14Jun 16, 2023Updated 3 years ago
- It was may. A tiny OS.☆10Apr 13, 2023Updated 3 years ago
- ☆14Oct 7, 2023Updated 2 years ago
- ☆20Oct 8, 2024Updated last year
- This is an Adobe AMF library for Golang; only AMF3 is supported for now.☆29May 15, 2013Updated 13 years ago
- ☆17Jun 8, 2019Updated 7 years ago
- ElasticSearch in depth☆17Mar 8, 2016Updated 10 years ago
- TREC QA dataset for question answering cleaned for usage in Question Answering☆14Aug 26, 2019Updated 6 years ago
- Automatic daemon for Smartdns on OpenWrt; place it in the /etc/init.d directory and run service smartdnsprocd enable to start it on boot☆19Jun 21, 2021Updated 5 years ago
- Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard☆25Dec 14, 2024Updated last year
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)☆58May 12, 2025Updated last year
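- Re-identification of campus bicycles at surveillance-camera image quality, including REID model recognition, vector database retrieval, and a UI display☆11Feb 13, 2024Updated 2 years ago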
- Official Pytorch implementation of "DBS: Dynamic Batch Size for Distributed Deep Neural Network Training"☆23Sep 30, 2021Updated 4 years ago
- My Assignment for CSE 599w http://dlsys.cs.washington.edu/☆15Dec 2, 2019Updated 6 years ago
- Compact and Agent-Native MoE Training System☆209Jun 24, 2026Updated last week
- ☆11Dec 15, 2025Updated 6 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 11 months ago
- ☆12Jun 15, 2023Updated 3 years ago
- An efficient front-end and back-end integration framework based on Vite, Vue, Webpack, and Node.js. One-click startup, ready to use out of the box.☆18May 8, 2026Updated last month
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- ☆16May 22, 2023Updated 3 years ago
- Understanding deep networks and large models.☆30Jan 23, 2026Updated 5 months ago
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- A tool for managing PHP multi-process asynchronous tasks☆18Aug 14, 2018Updated 7 years ago
- Implementation for POET and POET-X for LLM pretraining☆37Jun 9, 2026Updated 3 weeks ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆30Aug 19, 2025Updated 10 months ago
- Examples of using Cilium for chaos testing and fault injection☆28Sep 12, 2024Updated last year
- ☆29Aug 6, 2020Updated 5 years ago
- A python implementation of exposure fusion☆19Jan 29, 2022Updated 4 years ago
- ☆25Jun 12, 2023Updated 3 years ago
- JD.com widget☆24Jan 28, 2022Updated 4 years ago
- OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors☆11Apr 14, 2022Updated 4 years ago
- The Go language implementation of gRPC over QUIC.☆30Dec 2, 2021Updated 4 years ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.☆18Apr 22, 2025Updated last year
- LCA-on-the-line (ICML 2024 Oral)☆14Feb 13, 2025Updated last year