brienna / arxiv
Fetches, extracts, and parses data from the arxiv bucket on Amazon S3
☆15 · Updated 5 years ago
Alternatives and similar repositories for arxiv:
Users who are interested in arxiv are comparing it to the libraries listed below.
- Code for SaGe subword tokenizer (EACL 2023) ☆24 · Updated 3 months ago
- A simple semantic search engine for scientific papers. ☆28 · Updated last year
- ☆10 · Updated 4 years ago
- Bayesian Assessment of Hypotheses ☆24 · Updated last year
- ☆19 · Updated 2 years ago
- S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering) ☆16 · Updated last year
- Finds linguistic patterns effortlessly ☆35 · Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆32 · Updated 10 months ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 · Updated 2 years ago
- Implementation for EACL 2021 paper "Scientific Discourse Tagging for Evidence Extraction". ☆20 · Updated 3 years ago
- ☆12 · Updated 2 years ago
- MinHash implementation in Python ☆11 · Updated 7 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆47 · Updated last year
- Adaptation of TextWorld for materials synthesis procedures analysis using Text To Quest System ☆9 · Updated last year
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition ☆22 · Updated last year
- A tidy and complete archive of metadata for papers on arxiv.org, 1993-2019 ☆28 · Updated 5 years ago
- Large-scale query-focused multi-document Summarization dataset ☆10 · Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- MultiCite code and data. Models are available on Huggingface. ☆31 · Updated 2 years ago
- Code for Stage-wise Fine-tuning for Graph-to-Text Generation ☆26 · Updated 2 years ago
- code for generating a high-quality knowledge graph with metadata about datasets and links to publications ☆23 · Updated 2 years ago
- Data programming by demonstration for information extraction and span annotation ☆35 · Updated 3 years ago
- Given a pair of sentences (premise, hypothesis), the decomposed graph entailment model (DGEM) predicts whether the premise can be used to… ☆52 · Updated 4 years ago
- ☆34 · Updated 2 years ago
- A multi-threaded C++ implementation of Nickel & Kiela's "Poincare Embeddings" paper from NIPS 2017, following the implementation of the a… ☆17 · Updated 6 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code. ☆19 · Updated 2 years ago
- Official details for: [1803.08493] Context is Everything: Finding Meaning Statistically in Semantic Spaces ☆39 · Updated 5 years ago
- Converter from UD-trees to BART representation ☆36 · Updated last year
- SciWING is a modern toolkit for scientific document processing from WING-NUS ☆63 · Updated last year