hpcaitech / GPT-Demo
GPT Demo with hybrid distributed training
☆10 Updated last year
Related projects
Alternatives and complementary repositories for GPT-Demo
- Virtual Adversarial Training (VAT) techniques in PyTorch ☆16 Updated 2 years ago
- ☆23 Updated 3 years ago
- A collection of models built with ColossalAI ☆32 Updated 2 years ago
- ☆23 Updated 2 years ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆29 Updated last week
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. ☆33 Updated last year
- Transformers at any scale ☆41 Updated 10 months ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆14 Updated 8 months ago
- ☆49 Updated last year
- ☆12 Updated last year
- A collection of papers tackling automatic fact-checking (particularly of AI-generated content) ☆14 Updated last year
- possibly useful materials for learning RWKV language model. ☆25 Updated last year
- Implementation of autoregressive language model using improved Transformer and DeepSpeed pipeline parallelism. ☆32 Updated 2 years ago
- BigKnow2022: Bringing Language Models Up to Speed ☆14 Updated last year
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆58 Updated last year
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆39 Updated 10 months ago
- Repository for Skill Set Optimization ☆12 Updated 3 months ago
- Modified version of T5-DST for Dialogue State Tracking. ☆18 Updated 2 years ago
- CogNetX is an advanced, multimodal neural network architecture inspired by human cognition. It integrates speech, vision, and video proce… ☆12 Updated last week
- ☆17 Updated 2 years ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆33 Updated last year
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir… ☆31 Updated 3 weeks ago
- ☆18 Updated 5 months ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- ☆27 Updated last year
- A server powering LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆13 Updated last year
- Repository for "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages" ☆13 Updated last month
- ECIR'21: Simplified TinyBERT: Knowledge Distillation for Document Retrieval ☆15 Updated 3 years ago