Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆19Jul 20, 2023Updated 2 years ago
Alternatives and similar repositories for Megatron-DeepSpeed
Users who are interested in Megatron-DeepSpeed are comparing it to the libraries listed below
- ☆84Sep 9, 2023Updated 2 years ago
- Control 3f robotiq gripper using python and modbus client☆13Jun 27, 2024Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆12Nov 14, 2025Updated 3 months ago
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation☆12Feb 16, 2025Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆224Nov 21, 2023Updated 2 years ago
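- A collection of technical approaches, datasets, and other resources for building a Chinese-language ChatGPT☆35Jul 12, 2023Updated 2 years ago
- (Work in progress) This repository is tutorial-oriented: it uses "adapting a model to Chinese" as a typical model-training scenario to guide readers through hands-on LLM fine-tuning.☆37Aug 5, 2024Updated last year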
- ☆23Jun 19, 2025Updated 8 months ago
- Awesome Entity Alignment is a collection of EA techniques, including papers, codes, and datasets.☆10Oct 27, 2022Updated 3 years ago
- ☆22Dec 23, 2025Updated 2 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- Open ChatGLM Eyes to See the World☆13Mar 30, 2023Updated 2 years ago
- This iOS app demonstrates how to read PCM samples from large wave files into a circular buffer, so that they can be processed and playe…☆18Feb 8, 2013Updated 13 years ago
- Large-scale translation using Google Translate, resistant to blocking☆10Aug 1, 2019Updated 6 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- ☆10Jun 7, 2025Updated 8 months ago
- Using Q-learning to beat a Pong game program☆12Nov 9, 2022Updated 3 years ago
- Remote sensing labwork☆12Feb 27, 2018Updated 8 years ago
- ☆13May 25, 2023Updated 2 years ago
- ☆13May 17, 2025Updated 9 months ago
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning☆19Nov 28, 2025Updated 3 months ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆12Mar 6, 2025Updated last year
- [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)☆42Oct 6, 2023Updated 2 years ago
- Search, download Vimeo videos and retrieve metadata in Go.☆11Feb 10, 2022Updated 4 years ago
- Image Text Segmentation using FAST corner detection and DBSCAN clustering with k-d tree data structure☆14Feb 27, 2019Updated 7 years ago
- Finetuning LLaMA with DeepSpeed☆10Apr 14, 2023Updated 2 years ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation.☆15Mar 22, 2023Updated 2 years ago
- Memory experiments with LLMs☆10Mar 31, 2023Updated 2 years ago
- ☆14Sep 6, 2024Updated last year
- flow mirror models from JZX AI Labs☆43Sep 30, 2024Updated last year
- Implementation of Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems☆14Nov 11, 2023Updated 2 years ago
- A Scrapy-based scraper that extracts image and video URLs from Shotdeck, which is the largest collection of fully searchable high-d…☆11Jun 18, 2022Updated 3 years ago
- Expose a server running on your local machine to the internet, like Ngrok, based on Netty☆14Jun 1, 2021Updated 4 years ago
- Multi-Figurative Language Generation (COLING 2022)☆12Jan 30, 2023Updated 3 years ago
- ☆11Jan 8, 2025Updated last year
- ☆13Feb 18, 2023Updated 3 years ago
- [EMNLP'2024 Findings] Explore generated documents for enhanced IR with LLMs. We enhance BM25 to surpass strong dense retriever on many da…☆15Mar 28, 2025Updated 11 months ago
- [ICLR 2023] PyTorch code for DFPC: Data flow driven pruning of coupled channels without data.☆15Aug 25, 2023Updated 2 years ago
- answer's web client by vue webpack☆11Jan 12, 2023Updated 3 years ago