ai8hyf / TF-ID
TF-ID: Table/Figure IDentifier for academic papers
☆245 · Jul 12, 2024 · Updated last year
Alternatives and similar repositories for TF-ID
Users that are interested in TF-ID are comparing it to the libraries listed below
- ☆20 · Jan 27, 2024 · Updated 2 years ago
- Chinese Mathematical Formula Detection (MFD) Dataset ☆34 · Dec 21, 2022 · Updated 3 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Dec 4, 2025 · Updated 2 months ago
- Small python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated last year
- Extract structured text from pdfs quickly ☆661 · Jun 11, 2025 · Updated 8 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction ☆9,358 · Jan 3, 2025 · Updated last year
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,852 · May 17, 2025 · Updated 8 months ago
- tiny_fnc_engine is a minimal python library that provides a flexible engine for calling functions extracted from a LLM. ☆38 · Sep 11, 2024 · Updated last year
- Detect and extract tables to markdown and csv ☆754 · Jan 24, 2025 · Updated last year
- ☆17 · Jan 23, 2021 · Updated 5 years ago
- A proxy for minimax-m2, enabling interleaved thinking, and tool calls. ☆38 · Nov 21, 2025 · Updated 2 months ago
- anything you want can be built with morph cloud ☆26 · Oct 14, 2025 · Updated 4 months ago
- ☆50 · Mar 14, 2024 · Updated last year
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ☆2,368 · May 30, 2025 · Updated 8 months ago
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆116 · Aug 26, 2024 · Updated last year
- Fast, High-Fidelity LLM Decoding with Regex Constraints ☆21 · Jul 26, 2024 · Updated last year
- ☆18 · Jan 3, 2024 · Updated 2 years ago
- Math OCR model that outputs LaTeX and markdown ☆1,110 · Jan 29, 2025 · Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆411 · Feb 1, 2023 · Updated 3 years ago
- ☆67 · Mar 4, 2024 · Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆59 · Jan 5, 2026 · Updated last month
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,594 · Dec 20, 2025 · Updated last month
- Explore 160+ notebook visual analytics tools in your browser! ☆67 · Mar 29, 2024 · Updated last year
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆274 · Dec 6, 2025 · Updated 2 months ago
- A Language and Live Runtime for Styling and Labeling Typeset Math Formulas ☆26 · Oct 29, 2023 · Updated 2 years ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆2,503 · Feb 3, 2026 · Updated last week
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 · Jul 8, 2024 · Updated last year
- ☆102 · Dec 23, 2024 · Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆104 · Oct 28, 2025 · Updated 3 months ago
- An automated tool for discovering insights from research paper corpora ☆137 · Jun 8, 2024 · Updated last year
- ☆91 · Jul 4, 2025 · Updated 7 months ago
- DB-based Optical Chemical Structure Recognition ☆12 · Sep 12, 2022 · Updated 3 years ago
- High-performance tokenized language data-loader for Python C++ extension ☆14 · Jul 22, 2024 · Updated last year
- Multi-person podcast audio to videocast ☆10 · Sep 28, 2024 · Updated last year
- Seamless Voice Interactions with LLMs ☆12 · Oct 28, 2023 · Updated 2 years ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,885 · Updated this week
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆23 · Mar 4, 2024 · Updated last year
- ☆50 · Jun 13, 2024 · Updated last year
- awesome synthetic (text) datasets ☆323 · Jan 8, 2026 · Updated last month