ai8hyf / TF-ID
TF-ID: Table/Figure IDentifier for academic papers
★231 · Updated 9 months ago
Alternatives and similar repositories for TF-ID:
Users that are interested in TF-ID are comparing it to the libraries listed below
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ★279 · Updated 2 weeks ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ★201 · Updated 2 weeks ago
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation ★409 · Updated 3 weeks ago
- Structured information extraction from documents ★315 · Updated 7 months ago
- UniTable: Towards a Unified Table Foundation Model ★465 · Updated 11 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ★377 · Updated 4 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ★453 · Updated 3 months ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ★230 · Updated 8 months ago
- Extract structured text from pdfs quickly ★471 · Updated 2 months ago
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment. ★49 · Updated 7 months ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ★299 · Updated last month
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ★235 · Updated 4 months ago
- ★180 · Updated 3 weeks ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ★777 · Updated 3 months ago
- OpenResearcher, an advanced Scientific Research Assistant ★443 · Updated 6 months ago
- ★113 · Updated 2 weeks ago
- ★265 · Updated 10 months ago
- YOLO models trained by DocLayNet - power your Document Intelligent by Layout Analysis ★103 · Updated last month
- ★144 · Updated last week
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024) ★160 · Updated 11 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample. ★165 · Updated 3 weeks ago
- [EMNLP 2024: Demo Oral] RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation ★296 · Updated 6 months ago
- The official repository of NodeRAG ★212 · Updated last month
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation ★72 · Updated last month
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ★165 · Updated 7 months ago
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j. ★179 · Updated last year
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs" ★206 · Updated 6 months ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence ★144 · Updated 7 months ago
- ★222 · Updated 5 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ★215 · Updated 11 months ago