IllDepence / unarXive
A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network
☆297 · Sep 28, 2024 · Updated last year
Alternatives and similar repositories for unarXive
Users who are interested in unarXive compare it to the repositories listed below.
- Repository for NAACL 2019 paper on Citation Intent prediction ☆129 · Dec 1, 2019 · Updated 6 years ago
- S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/ ☆1,009 · Apr 26, 2024 · Updated last year
- Official dataset repository for "SciReviewGen: A Large-scale Dataset for Automatic Literature Review Generation." ☆18 · Jun 4, 2023 · Updated 2 years ago
- Dataset and model in the paper "SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation" ☆13 · Feb 14, 2022 · Updated 4 years ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 · Apr 18, 2023 · Updated 2 years ago
- Measuring the Evolution of a Scientific Field through Citation Frames ☆63 · Oct 5, 2018 · Updated 7 years ago
- This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs ☆189 · Oct 12, 2023 · Updated 2 years ago
- ☆20 · Feb 17, 2024 · Updated 2 years ago
- MultiCite code and data. Models are available on Huggingface. ☆33 · May 10, 2022 · Updated 3 years ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers ☆571 · Jun 12, 2023 · Updated 2 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆64 · Jul 8, 2024 · Updated last year
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆15 · Mar 11, 2024 · Updated last year
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆456 · Apr 11, 2024 · Updated last year
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 · Mar 18, 2023 · Updated 2 years ago
- AASC: ACL Anthology Sentence Corpus ☆20 · Oct 28, 2020 · Updated 5 years ago
- Science Parse parses scientific papers (in PDF form) and returns them in structured form. ☆694 · May 26, 2024 · Updated last year
- ☆15 · Jul 9, 2025 · Updated 7 months ago
- Dataset accompanying the SPECTER model ☆143 · Dec 19, 2022 · Updated 3 years ago
- Aligned, Review-Informed Edits of Scientific Papers ☆55 · Jul 5, 2023 · Updated 2 years ago
- ☆760 · May 22, 2023 · Updated 2 years ago
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆138 · Jul 25, 2024 · Updated last year
- ☆18 · Oct 22, 2022 · Updated 3 years ago
- Data and Code for EMNLP 2022 paper "ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples" ☆15 · Jun 4, 2023 · Updated 2 years ago
- ☆17 · Feb 16, 2024 · Updated 2 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Oct 20, 2022 · Updated 3 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 · Aug 20, 2023 · Updated 2 years ago
- Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals. ☆18 · Apr 25, 2021 · Updated 4 years ago
- ☆19 · Jan 18, 2022 · Updated 4 years ago
- A BERT model for scientific text. ☆1,669 · Feb 22, 2022 · Updated 3 years ago
- A JAX implementation of the continuous time formulation of Consistency Models ☆85 · Apr 7, 2023 · Updated 2 years ago
- library supporting NLP and CV research on scientific papers ☆788 · Nov 8, 2024 · Updated last year
- A machine learning software for extracting information from scholarly documents ☆4,645 · Updated this week
- A set of scripts to grab public datasets from resources related to arXiv ☆475 · May 20, 2024 · Updated last year
- ☆20 · Jan 15, 2024 · Updated 2 years ago
- The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables" ☆23 · Dec 21, 2023 · Updated 2 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving" ☆19 · May 25, 2023 · Updated 2 years ago
- ☆25 · Apr 21, 2021 · Updated 4 years ago
- Neuralized version of the Reference String Parser component of the ParsCit package. ☆81 · May 27, 2022 · Updated 3 years ago
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,404 · Nov 5, 2025 · Updated 3 months ago