webis-de / scidata22-stereo-scientific-text-reuse
☆11 Updated 6 months ago
Alternatives and similar repositories for scidata22-stereo-scientific-text-reuse
Users who are interested in scidata22-stereo-scientific-text-reuse are comparing it to the libraries listed below
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 Updated 2 years ago
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement… ☆16 Updated 3 years ago
- The official implementation of the iConference 2022 paper "Identifying Machine-Paraphrased Plagiarism". ☆17 Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆53 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆29 Updated 5 months ago
- A BERT-based application for reusable text classification at scale ☆38 Updated last year
- Analysis of gutenberg dataset ☆44 Updated 6 years ago
- StAtutory Reasoning Assessment ☆13 Updated 2 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 Updated 4 years ago
- The official repository for the LREC 2022 paper "D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science … ☆27 Updated 2 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 Updated 3 years ago
- PyTAIL - Interactive and Incremental Learning of NLP Models with Human in the Loop for Online Data ☆13 Updated 2 years ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆45 Updated last year
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆69 Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations… ☆19 Updated 2 years ago
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset" ☆54 Updated 3 years ago
- Ranking of fine-tuned HF models as base models. ☆35 Updated last month
- Data and code for the SciFact-Open task ☆26 Updated last year
- Wikipedia based dataset to train relationship classifiers and fact extraction models ☆25 Updated 4 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 3 years ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆12 Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago