FlagOpen/Infinity-Instruct


FlagOpen / Infinity-Instruct

☆51

Alternatives and similar repositories for Infinity-Instruct

Users who are interested in Infinity-Instruct are comparing it to the libraries listed below.

flagos-ai / DeepSeek-V4-FlagOS
☆16 · Updated this week

google-research-datasets / QuoteSum
QuoteSum is a textual QA dataset containing Semi-Extractive Multi-source Question Answering (SEMQA) examples written by humans, based on …
☆13 · Mar 25, 2024 · Updated 2 years ago

FlagOpen / TACO
☆239 · Feb 28, 2026 · Updated 4 months ago

miaoyuchun / InfoRM
The official implementation of InfoRM [NeurIPS 2024].
☆16 · Oct 25, 2025 · Updated 8 months ago

yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆24 · Aug 18, 2024 · Updated last year

thunlp / CSS-LM
CSS-LM: Contrastive Semi-supervised Fine-tuning of Pre-trained Language Models
☆11 · Jul 1, 2023 · Updated 3 years ago

csitfun / ConTRoL-dataset
Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"
☆11 · Nov 18, 2022 · Updated 3 years ago

USTC-StarTeam / ZIP
arXiv 2024 | ZIP: entropy-law data selection for efficient LLM alignment.
☆28 · Jun 10, 2026 · Updated last month

feiyang-k / AutoScale
Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…
☆14 · Aug 8, 2025 · Updated 11 months ago

UpstageAI / evalverse-IFEval
Submodule of evalverse forked from [google-research/instruction_following_eval](https://github.com/google-research/google-research/tree/m…
☆15 · May 4, 2024 · Updated 2 years ago

IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆22 · Updated this week

jukofyork / aiassistant
An AI-powered coding assistant plugin for the Eclipse IDE.
☆14 · Oct 28, 2025 · Updated 8 months ago

mdy666 / Scalable-Flash-Native-Sparse-Attention
☆48 · Dec 13, 2025 · Updated 7 months ago

zwhe99 / LLM-MT-Eval
{DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is}
☆14 · Jun 18, 2023 · Updated 3 years ago

sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆194 · Feb 17, 2025 · Updated last year

AI45Lab / DEAN
☆11 · Oct 25, 2024 · Updated last year

THUIR / T2Ranking
T2Ranking: A large-scale Chinese benchmark for passage ranking.
☆161 · Jul 3, 2023 · Updated 3 years ago

daeveraert / gradient-information-optimization
Implementation of Gradient Information Optimization (GIO) for effective and scalable training data selection
☆14 · Jun 22, 2023 · Updated 3 years ago

Olivia-fsm / DoGE
Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation"
☆21 · Feb 29, 2024 · Updated 2 years ago

openfeedback / superhf
Open-source Human Feedback Library
☆11 · Oct 25, 2023 · Updated 2 years ago

tanganke / pareto_set_learning
Code for paper "Towards Efficient Pareto Set Approximation via Weight-Ensembling Mixture of Experts"
☆11 · Sep 13, 2024 · Updated last year

FlagAI-Open / Aquila2
The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.
☆446 · Oct 11, 2024 · Updated last year

dheeraj7596 / Small2Large
☆18 · Feb 20, 2024 · Updated 2 years ago

namespace-Pt / UltraGist
☆18 · Dec 2, 2024 · Updated last year

HKUST-KnowComp / IntentionQA
Code and data for the paper: IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Large Language Models …
☆12 · Apr 27, 2024 · Updated 2 years ago

OpenBMB / Tell_Me_More
Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"
☆65 · Feb 20, 2024 · Updated 2 years ago

LLaMafia / SFT_function_learning
Explore what LLMs are really learning over SFT
☆28 · Mar 30, 2024 · Updated 2 years ago

mlwu22 / RED
Implementation code for ACL2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing
☆15 · Apr 20, 2024 · Updated 2 years ago

xhwang22 / Awesome-Reward-Hacking
A curated list of papers and resources on Reward Hacking, Emergent Misalignment, and Proxy Exploitation in Large Models
☆41 · Apr 17, 2026 · Updated 3 months ago

CASIA-LM / MoDS
☆153 · Apr 16, 2024 · Updated 2 years ago

VOID001 / simple-tg-bot
Simple Telegram bot Framework
☆10 · Apr 8, 2017 · Updated 9 years ago

zhang-wei-chao / DC-PDD
This repository presents the original implementation of Pretraining Data Detection for Large Language Models: A Divergence-based Calibrat…
☆23 · May 21, 2025 · Updated last year

qhjqhj00 / MetaAgent
MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning
☆47 · Sep 3, 2025 · Updated 10 months ago

Trustworthy-ML-Lab / ThinkEdit
[EMNLP 25] An effective and interpretable weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study un…
☆19 · Dec 17, 2025 · Updated 7 months ago

psunlpgroup / FoVer
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18 · Apr 7, 2026 · Updated 3 months ago

wac81 / wiki_word2vec
☆12 · Jan 22, 2017 · Updated 9 years ago

CryptoAILab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆16 · Oct 14, 2024 · Updated last year

thunlp / ChatEval
Codes for our paper "ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate"
☆340 · Oct 19, 2024 · Updated last year

spyysalo / s800
Tools for working with the S800 corpus
☆12 · Sep 17, 2020 · Updated 5 years ago