ai8hyf / TF-ID
TF-ID: Table/Figure IDentifier for academic papers
★228 · Updated 6 months ago
Alternatives and similar repositories for TF-ID:
Users who are interested in TF-ID are comparing it to the libraries listed below.
- A Comprehensive Benchmark for Document Parsing and Evaluation ★199 · Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ★246 · Updated last month
- Code for explaining and evaluating late chunking (chunked pooling) ★307 · Updated 3 weeks ago
- Structured information extraction from documents ★297 · Updated 3 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ★164 · Updated last month
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ★426 · Updated this week
- Extract structured text from pdfs quickly ★378 · Updated this week
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ★209 · Updated 4 months ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence ★134 · Updated 4 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ★135 · Updated 7 months ago
- ★258 · Updated 6 months ago
- An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy. ★279 · Updated this week
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ★222 · Updated last week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ★139 · Updated 3 months ago
- From scratch implementation of a vision language model in pure PyTorch ★189 · Updated 8 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ★119 · Updated last year
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ★522 · Updated 3 weeks ago
- ★196 · Updated last month
- HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems ★303 · Updated this week
- This is the official repository for Auto-RAG. ★179 · Updated last week
- The latest graphrag interface is used, using the local ollama to provide the LLM interface. Support for using the pip installation ★134 · Updated 3 months ago
- ★109 · Updated this week
- Solving data for LLMs - Create quality synthetic datasets! ★143 · Updated 2 months ago
- [ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping". ★219 · Updated 4 months ago
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA ★451 · Updated 2 weeks ago
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ★331 · Updated 7 months ago
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j. ★161 · Updated 9 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ★693 · Updated 2 months ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker … ★107 · Updated 7 months ago
- ★206 · Updated 6 months ago