allenai / SPECTER2
☆84Updated 8 months ago
Alternatives and similar repositories for SPECTER2:
Users who are interested in SPECTER2 are comparing it to the libraries listed below
- SciRepEval benchmark training and evaluation scripts☆71Updated 8 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)☆66Updated 2 years ago
- Dataset accompanying the SPECTER model☆129Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers.☆50Updated last year
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network☆282Updated 3 months ago
- MultiCite code and data. Models are available on Huggingface.☆29Updated 2 years ago
- Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite☆91Updated 11 months ago
- multimodal document analysis☆161Updated 7 months ago
- Pretraining Efficiently on S2ORC!☆147Updated 2 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)☆351Updated 9 months ago
- Robust and fast topic models with sentence-transformers.☆42Updated last week
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i…☆46Updated 9 months ago
- 💫 SpaCy wrapper for ConceptNet 💫☆89Updated last year
- A Python library aimed at dissecting and augmenting NER training data.☆57Updated last year
- S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering)☆16Updated last year
- Aligned, Review-Informed Edits of Scientific Papers☆48Updated last year
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding"☆14Updated 2 years ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.☆104Updated 9 months ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction☆63Updated 5 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆56Updated 5 months ago
- A BERT-based application for reusable text classification at scale☆37Updated last year
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP☆21Updated last year
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆170Updated last week
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆181Updated 3 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang…☆119Updated 8 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆104Updated 8 months ago
- Multidocument Summarization for Literature Review Shared Task 2022☆28Updated 2 years ago