allenai / S2APLER
S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering)
☆21 Updated 3 months ago
Alternatives and similar repositories for S2APLER
Users who are interested in S2APLER are comparing it to the libraries listed below:
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆74 Updated last month
- ☆116 Updated 3 months ago
- MultiCite code and data. Models are available on Huggingface. ☆32 Updated 3 years ago
- Dataset accompanying the SPECTER model ☆143 Updated 3 years ago
- ☆59 Updated 4 years ago
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆297 Updated last year
- SciGen ☆24 Updated 4 years ago
- SciRepEval benchmark training and evaluation scripts ☆80 Updated last week
- Pretraining Efficiently on S2ORC! ☆179 Updated last year
- ☆38 Updated 3 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 Updated 2 years ago
- multimodal document analysis ☆166 Updated 2 months ago
- Simple Questions Generate Named Entity Recognition Datasets (EMNLP 2022) ☆76 Updated 2 years ago
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 Updated 3 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 Updated 2 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example ☆22 Updated 4 years ago
- Data/Code Repository for https://api.semanticscholar.org/CorpusID:218470122 ☆138 Updated last year
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆105 Updated 2 years ago
- Dense hybrid representations for text retrieval ☆64 Updated 2 years ago
- Multidocument Summarization for Literature Review Shared Task 2022 ☆30 Updated 3 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆44 Updated last year
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆64 Updated last year
- The official repository for Efficient Long-Text Understanding Using Short-Text Models (Ivgi et al., 2022) paper ☆70 Updated 2 years ago
- Submission archive for the MS MARCO passage ranking leaderboard ☆13 Updated 2 years ago
- Long-context pretrained encoder-decoder models ☆96 Updated 3 years ago
- Aligned, Review-Informed Edits of Scientific Papers ☆55 Updated 2 years ago
- Data and models for the SciFact verification task. ☆248 Updated 2 years ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21) ☆47 Updated 4 years ago
- This repository contains ScholarQABench data and evaluation pipeline. ☆94 Updated 5 months ago
- ☆47 Updated 3 years ago