Teacher-student distillation using DeepSpeed
☆19 Oct 7, 2022 Updated 3 years ago
Alternatives and similar repositories for distill-bloom-deepspeed
Users that are interested in distill-bloom-deepspeed are comparing it to the libraries listed below.
- Implementation of the algorithm described in "Multi-sentence compression: Finding shortest paths in word graphs" by Katja Filippova. ☆12 Apr 27, 2015 Updated 11 years ago
- Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training. ☆22 Feb 7, 2025 Updated last year
- Code search model based on self-attention ☆12 Oct 16, 2020 Updated 5 years ago
- Techniques used to run BLOOM at inference in parallel ☆37 Oct 21, 2022 Updated 3 years ago
- [ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization ☆16 Apr 21, 2025 Updated last year
- Efficient Finetuning for OpenAI GPT-OSS ☆24 Oct 2, 2025 Updated 8 months ago
- Contains the code for my Imperial College London Master's thesis on text summarization ☆11 Oct 25, 2022 Updated 3 years ago
- On-the-fly Definition Augmentation of LLMs for Biomedical NER ☆14 Apr 14, 2025 Updated last year
- Train your own GPT2! ☆14 Apr 11, 2023 Updated 3 years ago
- Directed masked autoencoders ☆15 Mar 25, 2026 Updated 2 months ago
- LLM for solidity smart contract automated program repair ☆18 Mar 5, 2025 Updated last year
- ☆14 Sep 7, 2022 Updated 3 years ago
- ☆16 Dec 14, 2022 Updated 3 years ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 Mar 31, 2025 Updated last year
- A model implementation of sessions for koa using postgres as the backend ☆10 Oct 16, 2017 Updated 8 years ago
- Making of cuda kernel ☆17 May 27, 2025 Updated last year
- Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃 ☆117 Oct 27, 2022 Updated 3 years ago
- NamuwikiExtractor, for obtaining cleaned text from Namuwiki dumps ☆19 Feb 27, 2022 Updated 4 years ago
- Data for evaluating GPT-4V ☆11 Oct 26, 2023 Updated 2 years ago
- ☆17 Oct 30, 2022 Updated 3 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆37 Mar 10, 2026 Updated 3 months ago
- ☆16 Mar 12, 2024 Updated 2 years ago
- ☆12 Feb 2, 2026 Updated 4 months ago
- ☆17 Jul 10, 2022 Updated 3 years ago
- Open Source + Multilingual MLLM + Fine-tuning + Distillation + More efficient models and learning + ? ☆18 Jan 31, 2025 Updated last year
- Code for the paper "CoS: Enhancing Personalization and Mitigating Bias with Context Steering" ☆20 Dec 13, 2024 Updated last year
- ☆14 Jun 20, 2022 Updated 3 years ago
- Rubik ESP32 esp-idf Device driver library. ☆12 Jul 3, 2021 Updated 4 years ago
- Structured argument extraction for Korean ☆22 Feb 17, 2022 Updated 4 years ago
- Verilog code for a low power RFID chip that will communicate with I2C sensors. ☆13 Apr 18, 2014 Updated 12 years ago
- Calculating FLOPs of Pre-trained Models in NLP ☆18 Mar 29, 2021 Updated 5 years ago
- ☆16 Sep 4, 2025 Updated 9 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit. ☆10 Feb 29, 2024 Updated 2 years ago
- Word2vec Model Reader for Node.js Client ☆13 May 8, 2019 Updated 7 years ago
- EBAZ4205 Board FPGA project ☆15 May 14, 2026 Updated last month
- Qwen2 VL Fine Tuning using Llama Factory ☆19 Sep 7, 2024 Updated last year
- ☆11 Nov 27, 2022 Updated 3 years ago
- ☆11 Sep 25, 2020 Updated 5 years ago
- [WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt… ☆12 Feb 24, 2023 Updated 3 years ago