LydiaXiaohongLi / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆19 Jul 20, 2023 Updated 2 years ago
Alternatives and similar repositories for Megatron-DeepSpeed
Users who are interested in Megatron-DeepSpeed are comparing it to the repositories listed below
- ☆84 Sep 9, 2023 Updated 2 years ago
- Instruction fine-tuning for BLOOM models ☆24 Jun 15, 2023 Updated 2 years ago
- This is the official repo for the paper "LLM-FE" ☆55 Feb 3, 2026 Updated last week
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation ☆12 Feb 16, 2025 Updated 11 months ago
- ☆22 Dec 11, 2025 Updated 2 months ago
- Debug DeepSpeed-Chat step by step in an IDE ☆10 Apr 17, 2023 Updated 2 years ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr… ☆34 Jul 3, 2025 Updated 7 months ago
- Image Text Segmentation using FAST corner detection and DBSCAN clustering with k-d tree data structure ☆13 Feb 27, 2019 Updated 6 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation. ☆15 Mar 22, 2023 Updated 2 years ago
- This iOS app demonstrates how to read PCM samples from large wave files into a circular buffer, so that they can be processed and playe… ☆18 Feb 8, 2013 Updated 13 years ago
- ☆13 May 17, 2025 Updated 8 months ago
- Automated detection of exudates from fundus images plays an important role in diabetic retinopathy (DR) screening and evaluation, for whi… ☆10 Dec 11, 2020 Updated 5 years ago
- Chinese wwm masking and n-gram masking based on jieba ☆11 Jul 25, 2019 Updated 6 years ago
- ☆14 Sep 6, 2024 Updated last year
- The Levenberg-Marquardt method for solving nonlinear least-squares problems ☆12 Dec 3, 2019 Updated 6 years ago
- Open ChatGLM Eyes to See the World ☆13 Mar 30, 2023 Updated 2 years ago
- ☆10 Jun 7, 2025 Updated 8 months ago
- [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral) ☆42 Oct 6, 2023 Updated 2 years ago
- Using Q-learning to beat a Pong game program ☆12 Nov 9, 2022 Updated 3 years ago
- ☆12 May 2, 2022 Updated 3 years ago
- Medical named entity recognition using the extractive UIE open-sourced by PaddleNLP (torch implementation) ☆44 Aug 5, 2022 Updated 3 years ago
- A toy poker simulator with a pluggable Player interface to implement Agents that play using both rule-based strategies and LLMs. ☆11 Jan 30, 2026 Updated 2 weeks ago
- [ICLR 2023] PyTorch code for DFPC: Data flow driven pruning of coupled channels without data. ☆15 Aug 25, 2023 Updated 2 years ago
- ☆11 Jan 8, 2025 Updated last year
- Latest Tianjin University graduation project LaTeX template for the class of 2021. ☆12 May 25, 2021 Updated 4 years ago
- frame is an open source portfolio builder for developers where developers can add and manage their information, projects, articles and mo… ☆14 Dec 24, 2024 Updated last year
- ☆13 Feb 18, 2023 Updated 2 years ago
- Multi-Figurative Language Generation (COLING 2022) ☆12 Jan 30, 2023 Updated 3 years ago
- answer's web client by vue webpack ☆11 Jan 12, 2023 Updated 3 years ago
- MXNet finetune baseline (res152) for challenger.ai/competition/scene ☆11 Sep 24, 2017 Updated 8 years ago
- Complementary-Similarity Learning using Quadruplet Network ☆13 Mar 2, 2020 Updated 5 years ago
- A Mac App to find unused source files in an Xcode project. ☆13 Sep 24, 2018 Updated 7 years ago
- Research code for various data-parallel pretraining approaches, based on PyTorch GPT-2. ☆11 Dec 16, 2022 Updated 3 years ago
- Baidu migration index and outflow destinations, at the granularity of all prefecture-level cities nationwide ☆10 May 30, 2020 Updated 5 years ago
- This project aims to convert video files into different encrypted pieces and play them only via the built-in player, just like offline download videos… ☆13 Jan 28, 2019 Updated 7 years ago
- Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning ☆41 Dec 17, 2025 Updated last month
- Implementation of Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems ☆14 Nov 11, 2023 Updated 2 years ago
- A Scrapy-based scraper that extracts image and video URLs from Shotdeck, which is the largest collection of fully searchable high-d… ☆11 Jun 18, 2022 Updated 3 years ago
- Retrieves parquet files from Hugging Face, identifies and quantifies junky data, duplication, contamination, and biased content in datase… ☆53 Jul 6, 2023 Updated 2 years ago