mbalesni/deepspeed_llama

Finetuning LLaMA with DeepSpeed

☆10

Alternatives and similar repositories for deepspeed_llama

Users who are interested in deepspeed_llama are comparing it to the repositories listed below.

rioyokotalab / Megatron-Llama2
2023 ABCI Llama-2 continual learning project
☆14 · Jan 22, 2024 · Updated 2 years ago
AmpereComputingAI / ampere_model_library
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)
☆23 · Feb 26, 2026 · Updated 5 months ago
IronySuzumiya / NiuDianNao
A simple cycle-accurate DaDianNao simulator
☆13 · Mar 27, 2019 · Updated 7 years ago
Ryu1845 / hyena-jax
Implementation of Hyena Hierarchy in JAX
☆10 · Apr 30, 2023 · Updated 3 years ago
dwfault / CollAFLplusplus
Implement CollAFL using LLVM LTO pass on afl++.
☆12 · Sep 24, 2020 · Updated 5 years ago
AdrianBZG / LLM-distributed-finetune
Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the …
☆60 · Jun 20, 2023 · Updated 3 years ago
upward-spiral-research / x-proxy
An API for simplifying X requests for a single authenticated account
☆26 · Dec 20, 2024 · Updated last year
wenmin92 / NCE2
GitBook of "New Concept English 2". The book contains only the lesson texts and serves as a corpus, packaged as an e-book for easy searching.
☆14 · May 10, 2017 · Updated 9 years ago
gccnlp / Light-PEFT
[ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
☆13 · Sep 2, 2024 · Updated last year
M-Taghizadeh / flan-t5-base-imdb-text-classification
In this implementation, using the Flan T5 large language model, we performed the Text Classification task on the IMDB dataset and obtaine…
☆23 · May 12, 2023 · Updated 3 years ago
johnpzh / parallel_ANNS
Parallel Approximate Nearest Neighbor Search
☆14 · Nov 12, 2022 · Updated 3 years ago
Crypt0knights / OpenTrack
An Efficient Supply Chain Management System using Blockchain & Machine Learning.
☆10 · Nov 27, 2019 · Updated 6 years ago
ycao5602 / KAFAL
Code for the paper "Knowledge-Aware Federated Active Learning with Non-IID Data", ICCV2023
☆10 · Sep 8, 2023 · Updated 2 years ago
apuaaChen / gcnLib
☆10 · Aug 2, 2021 · Updated 4 years ago
Viibrant / MineGen
🏛️ Generating Minecraft Schematics
☆17 · Jun 10, 2025 · Updated last year
SuDIS-ZJU / rookies
Rookie's guide
☆14 · Aug 10, 2024 · Updated last year
philschmid / deep-learning-habana-huggingface
☆33 · Dec 9, 2022 · Updated 3 years ago
PKUZHOU / PetS-ATC-2022
☆10 · Sep 14, 2023 · Updated 2 years ago
awasthiabhijeet / Error-Driven-ASR-Personalization
Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021
☆11 · Jun 13, 2021 · Updated 5 years ago
HuangOwen / Quantization-Variation
[TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…
☆50 · Sep 27, 2024 · Updated last year
Daftstone / TrialAttack
Tensorflow implementation of TrialAttack (Triple Adversarial Learning for Influence based Poisoning Attack in Recommender Systems. KDD 20…
☆12 · Sep 2, 2021 · Updated 4 years ago
qgwang-hust / GraSU
A Fast Graph Update Library for FPGA-based Dynamic Graph Processing
☆10 · Dec 20, 2021 · Updated 4 years ago
harvard-cns / Harvard-CNS-Seminar
Reading seminar in Harvard Cloud Networking and Systems Group
☆16 · Aug 29, 2022 · Updated 3 years ago
mlsysAE2022 / ae_mlsys_gnn
☆11 · Mar 9, 2022 · Updated 4 years ago
harvard-edge / Gables
☆15 · Apr 3, 2020 · Updated 6 years ago
wangqinsi1 / CoreInfer
This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act…
☆18 · Oct 25, 2024 · Updated last year
UlugbekSalaev / UzTransliterator
UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language
☆13 · Jan 6, 2026 · Updated 6 months ago
wuch15 / FedAttack
Source code of FedAttack.
☆11 · Feb 9, 2022 · Updated 4 years ago
KyberNetwork / bridge_eth_smart_contracts
☆16 · Feb 17, 2019 · Updated 7 years ago
mustard-seed / SparseDNNAccelerator
Sparse CNN Accelerator targeting Intel FPGA
☆15 · Aug 26, 2021 · Updated 4 years ago
microideax / T-DLA
☆20 · Dec 3, 2019 · Updated 6 years ago
YangCao28 / nano-SGLang
Nano SGLang
☆16 · Jul 21, 2025 · Updated last year
Haskely / gsm8k-rft-llama7b-u13b_evaluation
Tests the GSM8K score of https://huggingface.co/OFA-Sys/gsm8k-rft-llama7b-u13b
☆15 · Aug 10, 2023 · Updated 2 years ago
R-Stefano / Remote-Sensing-Analysis
Implementing a remote sensing object detector using Tensorflow object detection API
☆21 · Jul 31, 2019 · Updated 6 years ago
CodeByPinar / Diabetes_Health_Prediction_and_Analysis
A comprehensive project to predict and analyze diabetes health data using advanced machine learning models, including Logistic Regression…
☆10 · Jun 12, 2024 · Updated 2 years ago
YaelBenShalom / Objects-Recognition-and-Classification
Objects recognition and classification using machine learning, computer vision and real-time object detection algorithm
☆14 · Sep 14, 2022 · Updated 3 years ago
UCLA-SEAL / HeteroGen
HeteroGen: transpiling C to heterogeneous HLS code with automated test generation and program repair (ASPLOS 2022)
☆16 · Sep 25, 2024 · Updated last year
SuDIS-ZJU / nlcTables
☆15 · Jan 27, 2026 · Updated 6 months ago
goodreasonai / praetor-data
Praetor is a lightweight finetuning data and prompt management tool
☆67 · Nov 16, 2024 · Updated last year