Pattern-based table discovery in Open Data CSV files
☆25 Dec 8, 2022 Updated 3 years ago
Alternatives and similar repositories for Pytheas
Users that are interested in Pytheas are comparing it to the libraries listed below.
- Rule-based spreadsheet data extraction and transformation ☆15 Feb 20, 2023 Updated 3 years ago
- ☆14 May 6, 2018 Updated 8 years ago
- Code and Benchmarks for JOSIE (SIGMOD 2019) ☆19 Apr 13, 2023 Updated 3 years ago
- Named Entity Disambiguation and Linking ☆16 May 24, 2024 Updated 2 years ago
- ☆27 Jan 31, 2019 Updated 7 years ago
- ☆16 Feb 21, 2024 Updated 2 years ago
- Entity linking, entity typing and relation extraction: Matching CSV to a Wikibase instance (e.g., Wikidata) via Meta-lookup ☆71 Jun 9, 2025 Updated 11 months ago
- Implementation of algorithms for semantic table implementation, including the TableMiner+ method ☆19 Sep 1, 2022 Updated 3 years ago
- Repository for "Towards Robust Named Entity Recognition for Historic German" ☆18 Dec 11, 2020 Updated 5 years ago
- A Jupyter notebook extension to centralize and manage data ☆15 Dec 22, 2022 Updated 3 years ago
- Cloud and Kubernetes configuration for deployment for wbstack.com. You'll want to look at the wikibase.cloud deploy repository soon! ☆12 Feb 9, 2024 Updated 2 years ago
- T2K Match is a matching algorithm optimised to match millions of web tables to a central knowledge base. ☆21 May 5, 2018 Updated 8 years ago
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification. ☆24 Feb 2, 2024 Updated 2 years ago
- This repository is for ExcelTableCNN project - open source automatic table detection on Excel sheets with computer vision ☆15 Jan 31, 2025 Updated last year
- NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911) ☆18 Oct 31, 2025 Updated 6 months ago
- generate shape expressions from CSV ☆11 Apr 21, 2026 Updated last month
- ☆22 Jan 3, 2023 Updated 3 years ago
- Code repository for Mondrian, a project for multiregion template recognition in spreadsheets. ☆14 May 25, 2022 Updated 4 years ago
- This repository contains the Wikibase configuration of the EU Knowledge Graph ☆14 May 4, 2026 Updated 3 weeks ago
- Wikidata tool to create lexemes with pre-populated forms (e. g. declensions or conjugations) ☆12 May 21, 2026 Updated last week
- https://dl.acm.org/doi/10.1145/3657281 ☆97 Apr 25, 2024 Updated 2 years ago
- Adult IPTV offers high-quality streaming of explicit content, including live channels and on-demand videos, tailored for adult entertainm… ☆17 Aug 26, 2024 Updated last year
- Wikibase extension that allows defining RDF mappings for Wikibase Entities ☆16 May 21, 2026 Updated last week
- ☆47 May 13, 2026 Updated 2 weeks ago
- Python package to reconcile DataFrames ☆24 Feb 15, 2023 Updated 3 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆95 Feb 5, 2026 Updated 3 months ago
- PhD thesis: "Knowledge Graph Construction from Heterogeneous Data Sources exploiting Declarative Mapping Rules" ☆14 Mar 24, 2022 Updated 4 years ago
- ☆27 May 24, 2018 Updated 8 years ago
- Example SPARQL queries, mostly for working with ZBW data sets ☆16 Oct 8, 2025 Updated 7 months ago
- The second version of Chronas in beta stage ☆27 Apr 16, 2026 Updated last month
- Twitter stream and social network crawling tools ☆17 Nov 17, 2016 Updated 9 years ago
- tesseractXplore a tesseract ease of use gui with full control ☆28 Nov 10, 2021 Updated 4 years ago
- CodeQL and Binary Ninja scripts to accompany the blog post ☆11 Feb 3, 2023 Updated 3 years ago
- The code base for paper: "ReAcTable: Enhancing ReAct for Table Question Answering" ☆37 Apr 28, 2024 Updated 2 years ago
- Extracting Entities with Limited Evidence ☆16 Dec 26, 2022 Updated 3 years ago
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆21 Aug 1, 2024 Updated last year
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆20 Mar 27, 2023 Updated 3 years ago
- European Parliament website Python scraper ☆12 Oct 19, 2016 Updated 9 years ago
- Document of the work done by NFDI Section (Meta)Data Working Group on Ontology Harmonization and Mapping ☆18 May 13, 2026 Updated 2 weeks ago