Command line tool to extract figures, tables, and captions from scholarly documents in PDF form.
★130 · Apr 9, 2018 · Updated 8 years ago
Alternatives and similar repositories for pdffigures
Users that are interested in pdffigures are comparing it to the libraries listed below.
- Companion code to the paper "Extracting Scientific Figures with Distantly Supervised Neural Networks" ★148 · Jun 14, 2022 · Updated 3 years ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ★31 · Oct 3, 2023 · Updated 2 years ago
- A place to collect and share knowledge about liberating data from PDFs ★55 · Jan 30, 2022 · Updated 4 years ago
- REV: Reverse-Engineering Visualizations ★61 · Jun 3, 2019 · Updated 7 years ago
- PDF Extraction Toolkit ★43 · Nov 23, 2020 · Updated 5 years ago
- table understanding dataset for comparative evaluation of different table understanding algorithms ★13 · Jun 15, 2018 · Updated 7 years ago
- Examples of utilizing Microsoft Academic for conducting COVID-19 research ★23 · Dec 26, 2022 · Updated 3 years ago
- A slim, non-SWIG Python adapter to CTesseract (Tesseract OCR for C). ★24 · Apr 25, 2014 · Updated 12 years ago
- Content ExtRactor and MINEr ★512 · Jun 30, 2022 · Updated 3 years ago
- Tools for Natural Language Text aware PDF structure analysis ★15 · Mar 11, 2022 · Updated 4 years ago
- Supervised learning of morphology ★28 · Jan 17, 2017 · Updated 9 years ago
- Re-usable low-level ML components ★10 · Oct 31, 2018 · Updated 7 years ago
- Various attempts at scanning aerial imagery to detect baseball diamonds. ★17 · Jul 13, 2014 · Updated 11 years ago
- Genomic Visualization Catalog ★13 · Oct 6, 2022 · Updated 3 years ago
- Science-parse version 2 ★257 · Nov 20, 2019 · Updated 6 years ago
- A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. ★2,257 · Jun 24, 2022 · Updated 3 years ago
- Some ideas on making Bags into Git repositories ★16 · Dec 23, 2014 · Updated 11 years ago
- perspective correction for document image ★20 · Feb 7, 2013 · Updated 13 years ago
- Code for the paper: "Mining Algorithm Roadmap in Scientific Publications" - KDD 2019 ★23 · Jul 22, 2023 · Updated 2 years ago
- ★19 · Dec 19, 2018 · Updated 7 years ago
- DFKI Layout Detection for OCR-D ★47 · May 1, 2025 · Updated last year
- ★12 · Apr 24, 2017 · Updated 9 years ago
- ★24 · Mar 3, 2024 · Updated 2 years ago
- Toolbox for deep, resilient, markup-invariant linking into HTML documents without their cooperation ★26 · Dec 8, 2022 · Updated 3 years ago
- Prototype your Jupyter Widget in the browser with anywidget and JupyterLite ★17 · Apr 7, 2025 · Updated last year
- Quick start for MicroFlo on Arduino - clone and go! ★15 · Dec 31, 2017 · Updated 8 years ago
- This sample .Net application shows you how to use the .Net SDK to read and write files to Azure Data Lake Store, and do other filesystem … ★10 · Oct 18, 2023 · Updated 2 years ago
- Deep learning based page layout analysis ★197 · Apr 24, 2019 · Updated 7 years ago
- Core UI Module for After the Deadline ★20 · Mar 5, 2022 · Updated 4 years ago
- An Apache Arrow-backed file format for pre-projected, pre-triangulated maps, including dot density algorithms and regl visualization. ★18 · Feb 10, 2023 · Updated 3 years ago
- An implementation of DTW for spoken term detection. Including non-constrained, segmental DTW, slope-constrained versions. For more detail… ★15 · Jun 2, 2019 · Updated 7 years ago
- A collection of Jupyter notebooks. ★11 · Oct 31, 2017 · Updated 8 years ago
- Phylogenetic Application written in OCaml and C ★20 · Jan 29, 2020 · Updated 6 years ago
- A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). ★35 · Feb 4, 2022 · Updated 4 years ago
- Digital Library of the Middle East web application, based on Spotlight ★21 · Updated this week
- ★16 · Dec 6, 2014 · Updated 11 years ago
- Full dataset of Reuters composed of 8,551,441 news titles, links and timestamps (Jan 2007 - Aug 2016). ★22 · Aug 17, 2016 · Updated 9 years ago
- Sample CloudFormation template to create spot fleet request ★11 · Mar 23, 2016 · Updated 10 years ago
- Allow anyone with a modern browser to stream a 1GB, 10GB, 100GB, or 1TB file over the Internet and into a happy home. ★32 · Oct 7, 2018 · Updated 7 years ago