Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆20 Oct 23, 2023 Updated 2 years ago
Alternatives and similar repositories for Megatron-DeepSpeed
Users who are interested in Megatron-DeepSpeed are comparing it to the libraries listed below
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 Jan 23, 2024 Updated 2 years ago
- Lab exercises for the DL4MT winter school at DCU ☆15 Oct 21, 2015 Updated 10 years ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆26 Feb 16, 2026 Updated last week
- ☆21 Mar 3, 2025 Updated 11 months ago
- Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in … ☆30 Nov 25, 2021 Updated 4 years ago
- Platform API Project seed ☆12 Nov 8, 2023 Updated 2 years ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆10 Dec 24, 2023 Updated 2 years ago
- A Model Agnostic function to directly remove specified layers from the LLM ☆10 May 23, 2024 Updated last year
- ☆12 Dec 8, 2022 Updated 3 years ago
- [ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati… ☆10 Sep 23, 2023 Updated 2 years ago
- Evaluation of Oasis Platform - simple install, UI and API ☆14 Feb 9, 2026 Updated 2 weeks ago
- Application for Agent re-engineering for better and reliable Gen AI workflows. ☆10 Jul 20, 2025 Updated 7 months ago
- A RAG that can scale 🧑🏻‍💻 ☆11 May 28, 2024 Updated last year
- Automatic Thief Detection via CCTV with Alarm System and Perpetrator Image Capture using YOLOv5 + ROI. This project utilizes computer vis… ☆14 Oct 21, 2024 Updated last year
- Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization ☆12 Dec 3, 2024 Updated last year
- Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search ☆14 Jun 18, 2025 Updated 8 months ago
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models ☆10 Oct 27, 2023 Updated 2 years ago
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 May 8, 2024 Updated last year
- Full List of Bad Words and Top Swear Words Banned by Google. As they closed the api ☆12 Sep 26, 2018 Updated 7 years ago
- Script for using Bing chat like a meal delivery service. ☆12 Mar 15, 2023 Updated 2 years ago
- Cuda extensions for PyTorch ☆12 Dec 2, 2025 Updated 2 months ago
- ☆11 Jul 7, 2023 Updated 2 years ago
- ☆10 Jul 13, 2024 Updated last year
- ☆11 Jan 10, 2020 Updated 6 years ago
- ☆11 Aug 15, 2024 Updated last year
- Automate Checkmarx Scanning and Onboarding Plus AWS Access ☆12 Jan 5, 2023 Updated 3 years ago
- BFloat16 Fused Adam Operator for PyTorch ☆16 Nov 16, 2024 Updated last year
- First steps in Machine Learning ☆12 Mar 18, 2015 Updated 10 years ago
- alternative remote for Lego Boost with Pythonista and iOS ☆10 Aug 27, 2017 Updated 8 years ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 Sep 21, 2025 Updated 5 months ago
- Effortlessly process invoices with AI! This project uses the Llama3.2 Vision Model for OCR, converting invoice images into structured, ma… ☆10 Feb 5, 2025 Updated last year
- A tool to explore ideas generated from artificial intelligence chats. ☆10 Apr 3, 2023 Updated 2 years ago
- Deep Learning with Multiple Objectives: 2021 edition ☆10 May 27, 2021 Updated 4 years ago
- ☆10 Oct 28, 2020 Updated 5 years ago
- Python library for Synthetic Data Generation ☆52 Feb 16, 2026 Updated last week
- A list where most values will be None (or default) ☆10 Jul 19, 2023 Updated 2 years ago
- [NAACL 2018] Robust Sequence Labeling with Adversarial Training ☆10 Sep 30, 2019 Updated 6 years ago
- Fine-tuning llama2 using the chain-of-thought approach ☆10 Nov 18, 2023 Updated 2 years ago
- Trying Tigerbeetle transactional database. ☆11 Jul 14, 2024 Updated last year