Survey on Data-centric Large Language Models
☆89Jul 8, 2024Updated last year
Alternatives and similar repositories for Data-centric_multimodal_LLM
Users who are interested in Data-centric_multimodal_LLM are comparing it to the repositories listed below
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Apr 18, 2025Updated 10 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.☆25Oct 7, 2025Updated 4 months ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)☆19Jul 1, 2025Updated 8 months ago
- This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.☆46Aug 22, 2025Updated 6 months ago
- Dataflow-MM, multi-media operators for Dataflow. We aim to prepare data for Multimodal Large Language Models.☆31Feb 25, 2026Updated last week
- ☆16Sep 4, 2025Updated 6 months ago
- DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizin…☆113Feb 4, 2026Updated last month
- The official pytorch implementation of Exploring the Interactive Guidance for Unified and Effective Image Matting [TOMM 2025]☆24Nov 24, 2025Updated 3 months ago
- ☆14Mar 20, 2025Updated 11 months ago
- WisdoMentor - Series: An LLM for undergraduates | 博导智言 (helping undergraduates study)☆13May 9, 2024Updated last year
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆88Sep 23, 2025Updated 5 months ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval☆21Jun 23, 2025Updated 8 months ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- PGRAG☆52Jul 16, 2024Updated last year
- Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering"☆15Oct 2, 2025Updated 5 months ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Dec 27, 2023Updated 2 years ago
- ☆15Oct 4, 2024Updated last year
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆34Nov 19, 2025Updated 3 months ago
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding☆35Jan 16, 2026Updated last month
- ☆21Jul 21, 2025Updated 7 months ago
- ☆19Jun 4, 2025Updated 9 months ago
- ☆12Jan 10, 2025Updated last year
- ☆35May 16, 2025Updated 9 months ago
- Official PyTorch implementation for Distilling Dataset into Neural Field [ICLR 2025]☆15Mar 20, 2025Updated 11 months ago
- ☆15May 30, 2025Updated 9 months ago
- Diffusion-based generative drug-like molecular editing with chemical natural language☆18Dec 22, 2024Updated last year
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆18Oct 17, 2025Updated 4 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated 11 months ago
- Fine-tuning Galactica and Gemma to operate on SMILES. Integrates into a molecular optimization algorithm.☆36Feb 20, 2025Updated last year
- ☆113Sep 11, 2025Updated 5 months ago
- ☆14Mar 15, 2024Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆28Mar 1, 2025Updated last year
- Official Implementation of paper "Distilling Long-tailed Datasets" [CVPR 2025]☆19Aug 13, 2025Updated 6 months ago
- [ICCV 2025] The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”☆82Oct 17, 2025Updated 4 months ago
- A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO☆17Apr 24, 2024Updated last year
- Official Code for our paper: "Language Models Learn to Mislead Humans via RLHF"☆19Oct 11, 2024Updated last year
- [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"☆18Jun 1, 2024Updated last year
- ☆17Feb 4, 2025Updated last year
- This is the official implementation of the paper “Griffin: Towards a Graph-Centric Relational Database Foundation Model.”☆34Sep 25, 2025Updated 5 months ago