haihua0913 / awesome-dq4ml
Useful resources on data quality for machine learning and artificial intelligence.
☆22Updated 8 months ago
Alternatives and similar repositories for awesome-dq4ml
Users who are interested in awesome-dq4ml are comparing it to the repositories listed below.
- A curated list on the role of small models in the LLM era☆111Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆88Updated 11 months ago
- (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation☆95Updated last month
- The All-in-one Judge Models introduced by Opencompass☆115Updated 5 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32Updated last year
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".☆41Updated 3 months ago
- Search, organize, discover anything!☆48Updated last year
- PGRAG☆52Updated last year
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆63Updated 6 months ago
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆39Updated 7 months ago
- Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search"☆84Updated 2 months ago
- Open replication of DeepSeek R1 for text-to-graph extraction.☆99Updated 11 months ago
- Our 2nd-gen LMM☆34Updated last year
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.☆60Updated last year
- The official implementation of "LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented…☆47Updated 9 months ago
- Implementation and evaluation of multimodal RAG with text and image inputs for industrial applications☆65Updated last year
- [EMNLP 2024] LongRAG: A Dual-perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering☆118Updated 11 months ago
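- The simplest reproduction of R1 results on a small model, illustrating the most important essence of O1-like models and DeepSeek R1. "Think is all your need". Uses experiments to support the claim that, for strong reasoning ability, the process content of thinking is the core of AGI/ASI.☆45Updated 11 months ago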
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning☆86Updated 2 years ago
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.☆104Updated 7 months ago
- An implementation of "M3DOCRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding" by Jaemin Cho, Debanj…☆47Updated last year
- A Toolkit for Table-based Question Answering☆115Updated 2 years ago
- Automatic prompt optimization framework for multi-step agent tasks.☆36Updated last year
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning"☆29Updated 7 months ago