ai8hyf / TF-ID
TF-ID: Table/Figure IDentifier for academic papers
☆228 Updated 7 months ago
Alternatives and similar repositories for TF-ID:
Users who are interested in TF-ID are comparing it to the libraries listed below.
- A Comprehensive Benchmark for Document Parsing and Evaluation ☆230 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆253 Updated last month
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆173 Updated this week
- Code for explaining and evaluating late chunking (chunked pooling) ☆321 Updated last month
- Structured information extraction from documents ☆305 Updated 4 months ago
- Extract structured text from pdfs quickly ☆412 Updated this week
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence ☆137 Updated 5 months ago
- OpenResearcher, an advanced Scientific Research Assistant ☆423 Updated 4 months ago
- [ACL 2024] This is the code repo for our ACL'24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping". ☆224 Updated 5 months ago
- InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024) ☆157 Updated 8 months ago
- UniTable: Towards a Unified Table Foundation Model ☆428 Updated 8 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆428 Updated last month
- Framework agnostic computer vision inference. Run 1000+ models by changing only one line of code. Supports models from transformers, timm… ☆132 Updated 2 months ago
- ☆173 Updated last week
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆335 Updated 8 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆145 Updated 3 weeks ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ☆220 Updated 5 months ago
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation ☆62 Updated 2 weeks ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker … ☆109 Updated 8 months ago
- An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy. ☆279 Updated this week
- A prompting library ☆157 Updated 4 months ago
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA ☆464 Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆726 Updated 2 weeks ago
- Lightweight, performant, deep table extraction ☆404 Updated 2 months ago
- Unattended Lightweight Text Classifiers with LLM Embeddings ☆183 Updated 5 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ☆138 Updated 8 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆147 Updated 4 months ago
- This is a repository of RALM surveys containing a summary of state-of-the-art RAG and other technologies ☆194 Updated 7 months ago
- awesome synthetic (text) datasets ☆259 Updated 3 months ago
- This project showcases an LLMOps pipeline that fine-tunes a small-size LLM model to prepare for the outage of the service LLM. ☆295 Updated 2 months ago