epfl-dlab / homepage2vec
Language-Agnostic Website Embedding and Classification
☆43 Updated last year
Alternatives and similar repositories for homepage2vec
Users who are interested in homepage2vec are comparing it to the repositories listed below.
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆52 Updated last year
- MultiCite code and data. Models are available on Huggingface. ☆31 Updated 3 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 Updated 4 years ago
- ☆87 Updated 3 years ago
- Corpus of Attribution-Annotated news articles covering the campaigns during the year leading up to the 2016 US Presidential election. ☆20 Updated 6 years ago
- Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between… ☆31 Updated last year
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- SciWING is a modern toolkit for scientific document processing from WING-NUS ☆63 Updated 2 years ago
- A multilingual lexicon of words to hurt. ☆89 Updated 6 months ago
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 Updated last year
- A list of ethics related resources for researchers and practitioners of Natural Language Processing and Computational Linguistics ☆33 Updated last year
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 Updated last year
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆96 Updated last year
- Reimplementation of a BERT based model (Shi et al, 2019), currently the state-of-the-art for English SRL. This model implements also pred… ☆70 Updated 3 years ago
- Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in … ☆29 Updated 3 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated 2 years ago
- A spaCy wrapper for DBpedia Spotlight ☆109 Updated 2 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 Updated 3 years ago
- ☆38 Updated 5 months ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆75 Updated 3 years ago
- Sentence embeddings for unsupervised event detection in the Twitter stream: study on English and French corpora ☆31 Updated 2 months ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 4 months ago
- Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al; Findings of ACL 2022) ☆21 Updated 2 months ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- Contrastive Fact Verification ☆71 Updated 2 years ago
- MTab: Entity Search and Table Annotation with Wikidata, Wikipedia, and DBpedia ☆31 Updated 2 years ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban… ☆104 Updated last year
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 Updated 2 years ago
- Repository for the paper "Named Entity Recognition for Entity Linking: What Works and What's Next" (EMNLP 2021). ☆75 Updated 3 years ago
- Measure the readability of a given text using surface characteristics ☆79 Updated 3 months ago