epfl-dlab / homepage2vec
Language-Agnostic Website Embedding and Classification
☆41 · Updated last year
Alternatives and similar repositories for homepage2vec:
Users who are interested in homepage2vec are comparing it to the repositories listed below.
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 · Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ☆156 · Updated 2 years ago
- ☆57 · Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆74 · Updated 3 years ago
- FactSumm: Factual Consistency Scorer for Abstractive Summarization ☆110 · Updated last year
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 · Updated 3 years ago
- ☆85 · Updated 3 years ago
- Code used to create the Linked WikiText-2 dataset ☆17 · Updated last year
- Repository for the paper "Named Entity Recognition for Entity Linking: What Works and What's Next" (EMNLP 2021). ☆75 · Updated 2 years ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 · Updated 9 months ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ☆37 · Updated 3 years ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆23 · Updated 7 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆67 · Updated 2 years ago
- Shared task hosted by IBM in the ArgMining workshop in EMNLP ☆30 · Updated 3 years ago
- Code for A Hierarchical Model for Data-to-Text Generation (Rebuffel, Soulier, Scoutheeten, Gallinari; ECIR 2020) ☆82 · Updated last year
- PyTorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks ☆63 · Updated 3 years ago
- Code, data, and pretrained models for the paper "Generating Wikipedia Article Sections from Diverse Data Sources" ☆20 · Updated 4 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen… ☆60 · Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆28 · Updated 2 years ago
- ☆74 · Updated 3 years ago
- This is the code for loading the SenseBERT model, described in our paper from ACL 2020. ☆44 · Updated last year
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022 ☆13 · Updated 2 years ago
- Code and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking". ☆50 · Updated 3 years ago
- Codebase, data and models for the SummaC paper in TACL ☆87 · Updated 3 weeks ago
- An implementation of GrASP (Shnarch et al., 2017) ☆21 · Updated 2 years ago
- SummScreen: A Dataset for Abstractive Screenplay Summarization (ACL 2022) ☆35 · Updated 2 years ago
- A multilingual version of MS MARCO passage ranking dataset ☆143 · Updated last year
- 🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie… ☆49 · Updated 2 years ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆43 · Updated last year
- Collection of NLP model explanations and accompanying analysis tools ☆145 · Updated last year