Jiachen-T-Wang/GREATS

☆20

Alternatives and similar repositories for GREATS

Users who are interested in GREATS are comparing it to the repositories listed below.

reds-lab / projektor
This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (…
☆14 • Oct 26, 2023 • Updated 2 years ago
simplelifetime / TIVE
Less is More: High-value Data Selection for Visual Instruction Tuning
☆20 • Jan 18, 2025 • Updated last year
hrtan / MoSo
[NeurIPS-2023] The PyTorch Implementation of MoSo. The algorithms are based on our paper: "Data Pruning via Moving-one-Sample-out". MoSo …
☆10 • May 21, 2026 • Updated 2 months ago
CodeCreator / WebOrganizer
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
☆83 • May 2, 2025 • Updated last year
VITA-Group / ProgressiveDD
[ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…
☆15 • May 18, 2024 • Updated 2 years ago
PAIR-code / pretraining-tda
☆33 • Feb 11, 2025 • Updated last year
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆48 • Oct 31, 2023 • Updated 2 years ago
ablghtianyi / ICL_Modular_Arithmetic
☆19 • Mar 25, 2025 • Updated last year
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆33 • Jan 17, 2025 • Updated last year
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆55 • Mar 4, 2025 • Updated last year
reds-lab / LAVA
This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR2023).
☆54 • Jun 5, 2024 • Updated 2 years ago
Evanwu1125 / LiteCoT
☆17 • Jun 10, 2025 • Updated last year
alon-albalak / data-selection-survey
A Survey on Data Selection for Language Models
☆261 • Apr 29, 2025 • Updated last year
TristanThrush / perplexity-correlations
Simple and scalable tools for data-driven pretraining data selection.
☆30 • Jun 9, 2025 • Updated last year
MadryLab / journey-TRAK
Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models"
☆26 • Dec 12, 2023 • Updated 2 years ago
JJchy / CG_score
Data Valuation without Training of a Model, submitted to ICLR'23
☆22 • Dec 30, 2022 • Updated 3 years ago
foreverlasting1202 / QuestA
☆22 • Jan 2, 2026 • Updated 6 months ago
hendrydong / NTK-and-MF-examples
Visualization of mean field and neural tangent kernel regime
☆23 • Jul 25, 2024 • Updated 2 years ago
microsoft / data-efficacy
Data Efficacy for Language Model Training
☆52 • May 29, 2026 • Updated 2 months ago
jhejna / remix
☆44 • Aug 26, 2024 • Updated last year
maximek3 / MIMIC-NLE
☆21 • Jul 25, 2022 • Updated 4 years ago
IDEA-XL / SubgDiff
The official implementation of NeurIPS2024 paper "SubgDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning."
☆11 • May 28, 2025 • Updated last year
ZhaoYuTJPU / MSSGCL
the source code of IJCAI 2023 paper "Multi-Scale subgraph contrastive learning"
☆11 • Apr 25, 2023 • Updated 3 years ago
reds-lab / CLIP-MIA
This is an official repository for Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study (ICCV2023…
☆26 • Sep 29, 2023 • Updated 2 years ago
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆204 • Dec 8, 2025 • Updated 7 months ago
xiaoxiaokuye / ICML_2024_ai4sci_paper
☆10 • Jun 10, 2024 • Updated 2 years ago
marshallmurphy / solar-system-threejs
☆14 • Jul 29, 2020 • Updated 5 years ago
lucy3 / whos_filtered
☆15 • Oct 4, 2024 • Updated last year
ChenglinYu / BHN
☆10 • May 28, 2023 • Updated 3 years ago
GraphMoLab / Graph2Token
☆13 • Jul 2, 2025 • Updated last year
g-benton / hessian-eff-dim
Public Codebase for Rethinking Parameter Counting: Effective Dimensionality Revisited
☆37 • Dec 27, 2022 • Updated 3 years ago
r-three / AttriBoT
Code for AttriBoT from "AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution"
☆15 • Apr 21, 2025 • Updated last year
DandanGuo1993 / reweight-imbalance-classification-with-OT
☆13 • Nov 8, 2022 • Updated 3 years ago
RobertTLange / deep-rl-tutorial
A Tutorial on Deep Reinforcement Learning in PyTorch
☆34 • Jul 6, 2023 • Updated 3 years ago
A4Bio / ADesigner
The official implementation of the AAAI'24 paper Cross-Gate MLP with Protein Complex Invariant Embedding is A One-Shot Antibody Designer.
☆12 • Dec 28, 2023 • Updated 2 years ago
Egg-Hu / LoRA-Recycle
[CVPR 2025] LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs
☆14 • Jun 20, 2025 • Updated last year
yegcjs / mixinglaws
☆113 • Jul 15, 2025 • Updated last year
clementbernardd / Count-Based-Exploration
Our version of #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning for a class project
☆17 • Apr 30, 2021 • Updated 5 years ago
Scriddie / Varsortability
Implementations of var-sortability, sortnregress, and chain-orientation as presented in the article "Beware of the Simulated DAG": https:…
☆15 • Nov 2, 2023 • Updated 2 years ago