A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network
☆297 · Sep 28, 2024 · Updated last year
Alternatives and similar repositories for unarXive
Users who are interested in unarXive are comparing it to the repositories listed below.
- Repository for NAACL 2019 paper on Citation Intent prediction ☆129 · Dec 1, 2019 · Updated 6 years ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,020 · Apr 26, 2024 · Updated last year
- Official dataset repository for "SciReviewGen: A Large-scale Dataset for Automatic Literature Review Generation." ☆19 · Jun 4, 2023 · Updated 2 years ago
- Dataset and model in the paper "SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation" ☆13 · Feb 14, 2022 · Updated 4 years ago
- Measuring the Evolution of a Scientific Field through Citation Frames ☆63 · Oct 5, 2018 · Updated 7 years ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 · Apr 18, 2023 · Updated 2 years ago
- Pretraining Efficiently on S2ORC! ☆180 · Oct 23, 2024 · Updated last year
- This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs ☆190 · Oct 12, 2023 · Updated 2 years ago
- ☆20 · Feb 17, 2024 · Updated 2 years ago
- MultiCite code and data. Models are available on Huggingface. ☆33 · May 10, 2022 · Updated 3 years ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers ☆572 · Jun 12, 2023 · Updated 2 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆64 · Jul 8, 2024 · Updated last year
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆15 · Mar 11, 2024 · Updated last year
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models ☆17 · Jun 28, 2025 · Updated 8 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆457 · Apr 11, 2024 · Updated last year
- Code for the Master Thesis "Enhancing the Microsoft Academic Knowledge Graph" ☆14 · Sep 28, 2020 · Updated 5 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 · Mar 18, 2023 · Updated 2 years ago
- AASC: ACL Anthology Sentence Corpus ☆20 · Oct 28, 2020 · Updated 5 years ago
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ☆696 · May 26, 2024 · Updated last year
- ☆15 · Jul 9, 2025 · Updated 8 months ago
- Dataset accompanying the SPECTER model ☆143 · Dec 19, 2022 · Updated 3 years ago
- ☆760 · May 22, 2023 · Updated 2 years ago
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆139 · Jul 25, 2024 · Updated last year
- ☆17 · Feb 16, 2024 · Updated 2 years ago
- ☆18 · Oct 22, 2022 · Updated 3 years ago
- Data and Code for EMNLP 2022 paper "ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples" ☆15 · Jun 4, 2023 · Updated 2 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Oct 20, 2022 · Updated 3 years ago
- ☆19 · Jan 18, 2022 · Updated 4 years ago
- Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals. ☆18 · Apr 25, 2021 · Updated 4 years ago
- A BERT model for scientific text. ☆1,671 · Feb 22, 2022 · Updated 4 years ago
- A machine learning software for extracting information from scholarly documents ☆4,686 · Updated this week
- RWKV model implementation ☆37 · Jul 15, 2023 · Updated 2 years ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆93 · Apr 13, 2024 · Updated last year
- ☆25 · Apr 21, 2021 · Updated 4 years ago
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables" ☆23 · Dec 21, 2023 · Updated 2 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆19 · May 25, 2023 · Updated 2 years ago
- library supporting NLP and CV research on scientific papers ☆789 · Nov 8, 2024 · Updated last year
- Neuralized version of the Reference String Parser component of the ParsCit package. ☆81 · May 27, 2022 · Updated 3 years ago
- OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning. ☆584 · Oct 3, 2023 · Updated 2 years ago