Heidelberg-NLP / LMs4Implicit-Knowledge-Generation
Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge so that they can generate implicit knowledge statements between two sentences, by (i) fine-tuning the models on corpora enriched with implicit information and (ii) constraining the models with key concepts and the commonsense knowledge paths connecting them.
☆16Updated 3 years ago
Related projects
Alternatives and complementary repositories for LMs4Implicit-Knowledge-Generation
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks☆20Updated last month
- Corpus exploration platform using advanced tools such as interactive summarization and multi-document coreference resolution☆11Updated last year
- Training T5 to perform numerical reasoning.☆23Updated 3 years ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an …☆31Updated last year
- The official implementation of "Distilling Relation Embeddings from Pre-trained Language Models, EMNLP 2021 main conference", a high-qual…☆46Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval☆27Updated 2 years ago
- The dataset and code for ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here.☆25Updated last year
- Embedding Recycling for Language models☆38Updated last year
- Efficient-Sentence-Embedding-using-Discrete-Cosine-Transform☆17Updated 4 years ago
- Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF…☆25Updated 3 years ago
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020☆62Updated 6 months ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆26Updated 3 years ago
- Code for the paper: Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries☆19Updated 2 years ago
- PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations…☆18Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu…☆40Updated 2 years ago
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset"☆53Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆17Updated last month
- Code for Paper "Target-oriented Fine-tuning for Zero-Resource Named Entity Recognition"☆21Updated 2 years ago
- An embeddable annotation tool for end-to-end cross-document coreference☆41Updated last year
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗Transformers☆47Updated last year
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.☆22Updated last year
- Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between…☆32Updated last year
- Wikipedia-based dataset to train relationship classifiers and fact extraction models☆25Updated 3 years ago