ltgoslo / NorQuAD
Norwegian question answering dataset
☆13 Updated 9 months ago
Related projects
Alternatives and complementary repositories for NorQuAD
- Natural language understanding benchmarks for Norwegian ☆14 Updated 10 months ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆19 Updated 4 months ago
- LTG-Bert ☆29 Updated 10 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆11 Updated last year
- Simple-to-use scoring function for arbitrarily tokenized texts. ☆32 Updated 3 weeks ago
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data. ☆9 Updated last week
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 Updated 5 months ago
- A Python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆25 Updated 2 months ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 Updated last month
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆21 Updated last month
- Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers. ☆14 Updated last month
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 9 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆22 Updated this week
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆23 Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆46 Updated 2 years ago
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 Updated 2 years ago
- Noise-robust de-duplication at scale ☆15 Updated last year
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 Updated last year
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 Updated 2 years ago
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆30 Updated last year
- This repository implements the interaction with DBLP, information extraction and pre-processing of papers, and a client to store data to … ☆10 Updated last year
- The official repository for "Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP" published in ACL-IJNLP 2… ☆18 Updated 2 years ago
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆14 Updated last year