zanibbi / SymbolScraper
Apache PDFBox extension for precisely extracting character/symbol locations and identities from born-digital PDF files.
★19 · Updated 3 years ago
Alternatives and similar repositories for SymbolScraper
Users who are interested in SymbolScraper are comparing it to the libraries listed below.
- Companion code to the paper "Extracting Scientific Figures with Distantly Supervised Neural Networks" ★141 · Updated 3 years ago
- ★91 · Updated 3 years ago
- Direct Attentive Dependency Parser ★54 · Updated last year
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ★135 · Updated 11 months ago
- ★14 · Updated 3 years ago
- LongSumm - Scientific Document Summarization Task ★74 · Updated 3 years ago
- ★17 · Updated 2 years ago
- Workshop Home Page for Benchmarking: Past, Present and Future ★35 · Updated 3 years ago
- Science-parse version 2 ★245 · Updated 5 years ago
- Extracting scientific claims from biomedical abstracts (powered by AllenNLP) ★144 · Updated 4 years ago
- Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer ★39 · Updated 4 years ago
- Source code accompanying the KONVENS 2019 paper "Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Em… ★65 · Updated 5 years ago
- Neuralized version of the Reference String Parser component of the ParsCit package. ★81 · Updated 3 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ★108 · Updated 3 years ago
- Code and material for the AllenNLP Guide ★87 · Updated last year
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ★68 · Updated 4 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ★38 · Updated 3 years ago
- ★40 · Updated 3 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ★87 · Updated 2 months ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch ★76 · Updated 4 years ago
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ★123 · Updated 6 years ago
- SciWING is a modern toolkit for scientific document processing from WING-NUS ★63 · Updated 2 years ago
- A web application that interfaces two GEC systems. [web instance is down] ★31 · Updated 11 months ago
- Language Modelling Makes Sense - WSD (and more) with Contextual Embeddings ★95 · Updated 2 years ago
- Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF… ★29 · Updated 3 years ago
- LM Pretraining with PyTorch/TPU ★134 · Updated 5 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ★54 · Updated last year
- ★58 · Updated 3 years ago
- A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contai… ★106 · Updated 6 years ago
- ★18 · Updated 2 years ago