OFA-Sys/DiverseEvol


Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

☆88

Alternatives and similar repositories for DiverseEvol

Users who are interested in DiverseEvol are comparing it to the repositories listed below.

AAAI-DISIM-UnivAQ / DALI
DALI Multi Agent System Framework
☆43 · Mar 24, 2026 · Updated 3 months ago
kevinscaria / TarGEN
Targeted Data Generation with Large Language Models
☆19 · Jun 25, 2024 · Updated 2 years ago
Blue-Raincoat / SelectIT
☆24 · Oct 14, 2024 · Updated last year
ysh-1998 / CoWPiRec
The official implementation for Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation.
☆25 · Jan 30, 2024 · Updated 2 years ago
IronBeliever / CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
☆91 · Nov 13, 2024 · Updated last year
lunyiliu / CoachLM
Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.
☆60 · Mar 20, 2024 · Updated 2 years ago
aitor-martinez-seras / SNN-Automotive-Object-Detection
Code of the paper "Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analys…
☆27 · Dec 13, 2023 · Updated 2 years ago
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
liuxy1103 / GRDBIS
☆21 · Nov 9, 2025 · Updated 8 months ago
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 · Dec 9, 2024 · Updated last year
giangdip2410 / HyperRouter
Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"
☆33 · Nov 29, 2023 · Updated 2 years ago
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,411 · Apr 11, 2024 · Updated 2 years ago
RUCAIBox / ChainLM
☆31 · Mar 23, 2024 · Updated 2 years ago
zjukg / KnowPAT
[Paper][ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
☆193 · Jun 10, 2024 · Updated 2 years ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta
☆13 · Nov 11, 2024 · Updated last year
ANTONIOPSD / CaptionIMG
Simple program to manually caption your images (or any other file types) so you can use them for AI training
☆37 · Mar 20, 2023 · Updated 3 years ago
0nutation / SpeechAgents
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems
☆87 · Jan 9, 2024 · Updated 2 years ago
NVlabs / STL
Official Pytorch Implementation of Self-emerging Token Labeling
☆35 · Mar 27, 2024 · Updated 2 years ago
lucy3 / whos_filtered
☆15 · Oct 4, 2024 · Updated last year
EvanZhuang / MetaTree
Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers
☆115 · Sep 13, 2024 · Updated last year
foreverlasting1202 / QuestA
☆22 · Jan 2, 2026 · Updated 6 months ago
IST-DASLab / SparseFinetuning
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆43 · Jan 15, 2024 · Updated 2 years ago
cxcscmu / Montessori-Instruct
Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]
☆51 · Jan 24, 2025 · Updated last year
locuslab / scaling_laws_data_filtering
☆64 · Apr 9, 2024 · Updated 2 years ago
UKPLab / arxiv2025-inherent-limits-plms
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…
☆14 · Jan 16, 2025 · Updated last year
zorazrw / filco
[Preprint] Learning to Filter Context for Retrieval-Augmented Generation
☆198 · Apr 6, 2024 · Updated 2 years ago
xypan0 / G-DIG
☆12 · Jun 30, 2024 · Updated 2 years ago
cxcscmu / MATES
Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]
☆80 · Nov 14, 2024 · Updated last year
PlusLabNLP / Active-IT
Code for our EMNLP-2023 paper: "Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks"
☆26 · Nov 16, 2023 · Updated 2 years ago
Gahyeonkim09 / AAPL
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024)
☆34 · May 8, 2024 · Updated 2 years ago
sunlicai / HiCMAE
[Information Fusion 2024] HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
☆121 · Aug 29, 2025 · Updated 10 months ago
GeneZC / MiniMA
Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"
☆102 · Jul 9, 2024 · Updated 2 years ago
heyblackC / BetterMixture-Top1-Solution
First-place (top-1) solution for the Tianchi algorithm competition "BetterMixture: Large Model Data Mixing Challenge"
☆33 · Jul 7, 2024 · Updated 2 years ago
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆268 · Sep 12, 2024 · Updated last year
yixuantt / PoolingAndAttn
"Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?"
☆39 · Nov 13, 2024 · Updated last year
ConiferLM / Conifer
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
☆91 · Apr 4, 2024 · Updated 2 years ago
thomasgauthier / LLM-self-play
Minimal implementation of the "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" paper (arXiv 2401.01335)
☆29 · Mar 1, 2024 · Updated 2 years ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12 · May 17, 2024 · Updated 2 years ago