pacman100 / accelerate-deepspeed-test
Testing DeepSpeed integration in 🤗 Accelerate
☆11 · Updated 3 years ago
Alternatives and similar repositories for accelerate-deepspeed-test
Users that are interested in accelerate-deepspeed-test are comparing it to the libraries listed below
- Evolve LLM training instructions, from English instructions to any language. ☆119 · Updated 2 years ago
- This hands-on lab aims to alleviate some of that headache by demonstrating how to create/augment a QnA dataset from complex unstructured … ☆64 · Updated 9 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆95 · Updated last year
- An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with implementation and experiment details. ☆14 · Updated last year
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 · Updated 2 years ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆96 · Updated 2 years ago
- ☆16 · Updated last year
- [NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search" ☆23 · Updated last year
- ☆32 · Updated 2 years ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e… ☆11 · Updated last year
- ☆129 · Updated last year
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)" ☆53 · Updated 2 years ago
- PyTorch reimplementation of the paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization" ☆16 · Updated 4 years ago
- Continue pretraining T5 on a custom dataset, starting from available pretrained model checkpoints ☆38 · Updated 4 years ago
- ACL 2023 short: Balancing Lexical and Semantic Quality in Abstractive Summarization ☆16 · Updated 2 years ago
- Official repository for ListT5 ☆48 · Updated 2 months ago
- Train 🤗 Transformers with DeepSpeed: ZeRO-2, ZeRO-3 ☆23 · Updated 4 years ago
- ☆20 · Updated last year
- The Git repository of the Modular Prompted Chatbot paper ☆35 · Updated 2 years ago
- Alpaca-LoRA implementation for Hugging Face using DeepSpeed and FullyShardedDataParallel ☆24 · Updated 2 years ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆78 · Updated last year
- ☆57 · Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆52 · Updated 5 months ago
- SECOM: On Memory Construction and Retrieval for Personalized Conversational Agents, ICLR 2025 ☆53 · Updated 11 months ago
- Benchmarking library for RAG ☆255 · Updated last week
- Finetune mistral-7b-instruct for sentence embeddings ☆88 · Updated last year
- Data Toolkit for Sailor Language Models ☆95 · Updated 11 months ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 · Updated 7 months ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆68 · Updated last year
- Keep Me Updated! Memory Management in Long-term Conversations (Findings of EMNLP 2022) ☆33 · Updated 3 years ago