This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).
☆21 · Jul 2, 2024 · Updated last year
Alternatives and similar repositories for DeepSpeed-Chat-Extension
Users that are interested in DeepSpeed-Chat-Extension are comparing it to the libraries listed below
- This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi… ☆117 · Jun 18, 2025 · Updated 8 months ago
- Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation ☆28 · Jun 30, 2025 · Updated 8 months ago
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers. ☆11 · Aug 30, 2024 · Updated last year
- Concurrency library ☆17 · Oct 13, 2024 · Updated last year
- ☆11 · Dec 23, 2024 · Updated last year
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models". ☆12 · Oct 14, 2024 · Updated last year
- Python Inference Script(PyIS) ☆19 · Aug 30, 2022 · Updated 3 years ago
- ☆10 · Apr 7, 2024 · Updated last year
- Models for packages and the resources they contain. ☆14 · Mar 10, 2024 · Updated last year
- Material parsers and other tools, scripts Initially developed for Grobid Superconductor ☆13 · Feb 21, 2025 · Updated last year
- An active inference model of Lacanian psychoanalysis ☆15 · Jun 7, 2025 · Updated 8 months ago
- Develop C++/CUDA extensions with PyTorch like Python scripts ☆10 · Jan 7, 2026 · Updated last month
- Original VinVL visual backbone with simplified APIs to easily extract features, boxes, object detections, in a few lines of Python code. ☆11 · Nov 27, 2022 · Updated 3 years ago
- [AAAI2024] An official pytorch implement of the paper: Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Underst… ☆13 · Dec 8, 2024 · Updated last year
- CANdle - a library for using USB-FDCAN dongle and communicating with md80 drives ☆15 · Sep 15, 2025 · Updated 5 months ago
- This library implements functions and classes for mesh registration, data augmentation, and data normalisation. ☆11 · Oct 7, 2024 · Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆11 · Sep 21, 2024 · Updated last year
- ☆11 · Apr 6, 2024 · Updated last year
- Smallest ellipse covering a finite set of points ☆14 · Jan 3, 2025 · Updated last year
- 🧩 Design-Information-Modeling for Kit-of-Parts 🏘️ ☆16 · Updated this week
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models ☆11 · Apr 9, 2024 · Updated last year
- TiC: Exploring Vision Transformer in Convolution ☆11 · Oct 24, 2023 · Updated 2 years ago
- ☆11 · Jan 19, 2025 · Updated last year
- Artifact for TOSEM Submission: GiantRepair ☆13 · Jun 26, 2024 · Updated last year
- Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor… ☆14 · Oct 26, 2025 · Updated 4 months ago
- f-PO: Generalizing Preference Optimization with f-divergence Minimization ☆13 · Apr 2, 2025 · Updated 11 months ago
- A dependency injection library for python, aimed for the least amount of magic. ☆12 · Feb 23, 2022 · Updated 4 years ago
- The project for speech translation ☆12 · Sep 28, 2023 · Updated 2 years ago
- [IROS2025]Adjacent-view Transformers for Supervised Surround-view Depth Estimation ☆14 · Nov 14, 2025 · Updated 3 months ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025) ☆11 · Aug 11, 2025 · Updated 6 months ago
- Label shift estimation for transfer difficulty with Familiarity. ☆10 · Feb 4, 2025 · Updated last year
- Official Repository for paper "Ontology-Free General-Domain Knowledge Graph-to-Text Generation Dataset Synthesis using Large Language Mod… ☆14 · Nov 25, 2024 · Updated last year
- R package for metabolic enzyme enrichment analysis ☆13 · Oct 24, 2025 · Updated 4 months ago
- A thread-safe vector database for model inference inside LMDB. ☆15 · Feb 18, 2026 · Updated last week
- ☆11 · Jan 3, 2024 · Updated 2 years ago
- Official Implementation of "The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thou… ☆14 · Jul 2, 2025 · Updated 8 months ago
- Ghostfolio-feeder extends Ghostfolio by adding market data via internal APIs ☆18 · Jul 24, 2025 · Updated 7 months ago
- CRISPR, faster, better – The Crackling method for whole-genome target detection ☆10 · Jan 11, 2024 · Updated 2 years ago
- text-only training or language-free training for multimodal tasks (image/audio/video caption, retrieval, text2image) ☆12 · Oct 15, 2024 · Updated last year