haihua0913 / awesome-dq4ml

Useful resources on data quality for machine learning and artificial intelligence.

☆19

Alternatives and similar repositories for awesome-dq4ml

Users who are interested in awesome-dq4ml are comparing it to the repositories listed below.

chenzhongwu20 / RuleRAG_ICL_FT
RuleRAG: Rule-guided Retrieval-Augmented Generation with Language Models for Question Answering
☆22 Updated 6 months ago
NEUIR / M2RAG
This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".
☆32 Updated 2 months ago
RUCAIBox / ChainLM
☆28 Updated last year
sam234990 / ArchRAG
Hierarchical RAG
☆12 Updated this week
zhaochenyang20 / Prompt2Model-Self-Guide
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper
☆32 Updated last year
MLLM-Data-Contamination / MM-Detect
This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"
☆14 Updated 2 months ago
ignorejjj / LongRefiner
Code for the paper "Hierarchical Document Refinement for Long-context Retrieval-augmented Generation"
☆19 Updated this week
tigerchen52 / awesome_role_of_small_models
A curated list on the role of small models in the LLM era
☆100 Updated 8 months ago
du-nlp-lab / MLR-Copilot
☆65 Updated 2 months ago
opendatalab / OHR-Bench
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
☆75 Updated 2 months ago
TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆76 Updated 7 months ago
NeverMoreLCH / SearchLVLMs
Repository for the NeurIPS 2024 paper "SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up…
☆24 Updated 5 months ago
EliasLumer / Graph-RAG-Tool-Fusion-ToolLinkOS
Official repository of Graph RAG-Tool Fusion and ToolLinkOS dataset.
☆12 Updated 3 months ago
wjbmattingly / qwen2-vl-finetune-huggingface
This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.
☆69 Updated 8 months ago
InternLM / Condor
[ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
☆27 Updated last week
Quehry / HelloBench
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
☆45 Updated 6 months ago
sylvain-wei / 24-Game-Reasoning
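A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 game" as an example. It uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. Clean, minimal, accessible reproduction of Dee…
☆18 Updated 2 months ago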
Ingvarstep / open-r1-text2graph
Open replication of DeepSeek R1 for text-to-graph extraction.
☆94 Updated 4 months ago
lfy79001 / TableQAKit
A Toolkit for Table-based Question Answering
☆112 Updated last year
Vision-CAIR / dochaystacks
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents, CVPR 2025
☆18 Updated 4 months ago
xverse-ai / XVERSE-MoE-A36B
XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.
☆38 Updated 8 months ago
liangyuwang / zo2
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory
☆95 Updated last month
RhapsodyAILab / MiniCPM-V-Embedding
☆29 Updated 9 months ago
XiaoduoAILab / XmodelVLM
☆68 Updated 11 months ago
cxcscmu / RAGViz
Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]
☆83 Updated 4 months ago
yale-nlp / MCTS-RAG
☆47 Updated 3 months ago
cfkgc-paper / CFKGC-paper
☆17 Updated 6 months ago
zjunlp / OneKE
[WWW 2025] A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System.
☆70 Updated last week
BaichuanSEED / BaichuanSEED.github.io
Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…
☆18 Updated 9 months ago
IDEA-FinAI / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆78 Updated 6 months ago