Teacher-student distillation using DeepSpeed
☆19 Oct 7, 2022 Updated 3 years ago
Alternatives and similar repositories for distill-bloom-deepspeed
Users that are interested in distill-bloom-deepspeed are comparing it to the libraries listed below.
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models ☆15 Mar 8, 2023 Updated 3 years ago
- Leaderboard of Frontier Models for Program Repair https://repairbench.github.io/ ☆11 Oct 26, 2025 Updated 5 months ago
- Code search model based on self-attention ☆12 Oct 16, 2020 Updated 5 years ago
- Techniques used to run BLOOM at inference in parallel ☆37 Oct 21, 2022 Updated 3 years ago
- ☆47 Aug 5, 2025 Updated 7 months ago
- Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptat… ☆15 Feb 27, 2025 Updated last year
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 Jul 21, 2025 Updated 8 months ago
- Efficient Finetuning for OpenAI GPT-OSS ☆23 Oct 2, 2025 Updated 5 months ago
- Training BART from scratch ☆12 Dec 31, 2021 Updated 4 years ago
- Directed masked autoencoders ☆14 Mar 17, 2026 Updated last week
- Train your own GPT2! ☆14 Apr 11, 2023 Updated 2 years ago
- ☆14 Sep 7, 2022 Updated 3 years ago
- ☆15 Apr 10, 2023 Updated 2 years ago
- ☆16 Dec 14, 2022 Updated 3 years ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Mar 31, 2025 Updated 11 months ago
- Set of macros that support type-wide and per-function logging with the ability to customize how logs are handled ☆16 Sep 8, 2025 Updated 6 months ago
- ☆15 Mar 12, 2024 Updated 2 years ago
- Making of cuda kernel ☆16 May 27, 2025 Updated 10 months ago
- ☆48 Jan 3, 2026 Updated 2 months ago
- Scaling scaling laws with board games. ☆53 Jul 17, 2023 Updated 2 years ago
- Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃 ☆117 Oct 27, 2022 Updated 3 years ago
- Data for evaluating GPT-4V ☆11 Oct 26, 2023 Updated 2 years ago
- NamuwikiExtractor for obtaining cleaned text from Namuwiki dumps ☆19 Feb 27, 2022 Updated 4 years ago
- [PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min… ☆10 Aug 13, 2024 Updated last year
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? ☆13 Aug 16, 2023 Updated 2 years ago
- A Qt5 app that plots timestamped MQTT data – status: unfinished alpha software. ☆10 May 7, 2022 Updated 3 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆32 Mar 10, 2026 Updated 2 weeks ago
- ☆10 Apr 29, 2023 Updated 2 years ago
- ☆11 Feb 2, 2026 Updated last month
- ☆17 Jul 10, 2022 Updated 3 years ago
- Spline interpolation with FITPACK for xtensor. ☆14 Apr 10, 2018 Updated 7 years ago
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ? ☆18 Jan 31, 2025 Updated last year
- Code for the paper "CoS: Enhancing Personalization and Mitigating Bias with Context Steering" ☆20 Dec 13, 2024 Updated last year
- ☆21 Jun 1, 2025 Updated 9 months ago
- ☆14 Jun 20, 2022 Updated 3 years ago
- Open-source AI acceleration on FPGA: from ONNX to RTL ☆52 Updated this week
- ☆14 Aug 3, 2024 Updated last year
- Rubik ESP32 esp-idf Device driver library. ☆12 Jul 3, 2021 Updated 4 years ago
- ☆13 May 25, 2023 Updated 2 years ago