dell-research-harvard / NEWS-COPY
Noise-robust de-duplication at scale
☆19 · Updated 2 years ago
Alternatives and similar repositories for NEWS-COPY:
Users who are interested in NEWS-COPY are comparing it to the repositories listed below.
- Package to extract connotation frames ☆85 · Updated last year
- Tools to train and explore diachronic word embeddings from Big Historical Data ☆23 · Updated 3 months ago
- Mining Legal Arguments in Court Decisions - Data and software ☆68 · Updated last year
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆12 · Updated last year
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022 ☆13 · Updated 2 years ago
- Data for the HIPE 2022 shared task. ☆18 · Updated last year
- Study of semantic evolution of words over time ☆20 · Updated 2 years ago
- Neural Language Models for Historical Research ☆26 · Updated 6 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 · Updated last year
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆52 · Updated last year
- Learning from Neighbors: Unsupervised Text Classification ☆17 · Updated 2 years ago
- ☆22 · Updated 4 years ago
- ☆15 · Updated 7 years ago
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆88 · Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 · Updated 3 years ago
- MultiCite code and data. Models are available on Huggingface. ☆31 · Updated 2 years ago
- This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs ☆179 · Updated last year
- An easy-to-use API for analyzing INCEpTION annotation projects. ☆17 · Updated last year
- A model(ing framework) for sample efficient OCR ☆57 · Updated 2 years ago
- ☆26 · Updated 3 years ago
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 · Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆98 · Updated last year
- ☆52 · Updated last year
- ☆16 · Updated 3 months ago
- Code for measuring novelty in science using publication text ☆26 · Updated 2 months ago
- Information and data related to the ProtestNews shared task at CASE @ ACL-IJCNLP 2021 workshop ☆43 · Updated 2 years ago
- Python Multilingual Ucrel Semantic Analysis System ☆32 · Updated 8 months ago
- Evaluate language models using multiple choice items ☆13 · Updated 3 weeks ago
- The official repository for the LREC 2022 paper "D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science … ☆27 · Updated 2 years ago
- ☆13 · Updated 3 years ago