zanibbi / SymbolScraper
Apache PDFBox extension for precisely extracting character/symbol locations and identities from born-digital PDF files.
☆19 Updated 3 months ago
Alternatives and similar repositories for SymbolScraper
Users who are interested in SymbolScraper are comparing it to the libraries listed below.
- Workshop Home Page for Benchmarking: Past, Present and Future ☆35 Updated 4 years ago
- Direct Attentive Dependency Parser ☆54 Updated last year
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆137 Updated last year
- ☆95 Updated 3 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 Updated 2 years ago
- Companion code to the paper "Extracting Scientific Figures with Distantly Supervised Neural Networks" 🤖 ☆143 Updated 3 years ago
- SciWING is a modern toolkit for scientific document processing from WING-NUS ☆63 Updated 2 years ago
- NaturalProofs: Mathematical Theorem Proving in Natural Language (NeurIPS 2021 Datasets & Benchmarks) ☆134 Updated 3 years ago
- Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA) ☆45 Updated 4 years ago
- Science-parse version 2 ☆251 Updated 6 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆90 Updated 7 months ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ☆107 Updated 3 years ago
- Converter from UD-trees to BART representation ☆36 Updated last year
- Datasets I have created for scientific summarization, and a trained BertSum model ☆116 Updated 6 years ago
- Code to reproduce the experiments from the paper. ☆103 Updated 2 years ago
- Extracting scientific claims from biomedical abstracts (powered by AllenNLP) ☆143 Updated 4 years ago
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆69 Updated 5 years ago
- Code and material for the AllenNLP Guide ☆86 Updated 2 years ago
- Dataset accompanying the SPECTER model ☆142 Updated 3 years ago
- A web application that interfaces two GEC systems. [web instance is down] ☆32 Updated last year
- ☆58 Updated 4 years ago
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ☆123 Updated 6 years ago
- LongSumm - Scientific Document Summarization Task ☆74 Updated 3 years ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch ☆76 Updated 4 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆38 Updated 3 years ago
- Framework for information extraction from tables ☆40 Updated 6 years ago
- Hyperparameter Search for AllenNLP ☆140 Updated 10 months ago
- CharBERT: Character-aware Pre-trained Language Model (COLING2020) ☆121 Updated 4 years ago
- Multitask Learning with Pretrained Transformers ☆39 Updated 4 years ago
- ☆17 Updated 2 years ago