crux82 / squad-it
A large-scale dataset for Question Answering in Italian
☆26 Updated 6 years ago
Alternatives and similar repositories for squad-it:
Users that are interested in squad-it are comparing it to the libraries listed below
- AlBERTo, the first Italian BERT model for Twitter language understanding ☆72 Updated 4 years ago
- GilBERTo: A pretrained language model based on RoBERTa for Italian ☆73 Updated 5 years ago
- Materials for "IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation" 🇮🇹 ☆30 Updated 8 months ago
- 🇮🇹 Italian BERT and ELECTRA models (incl. evaluation) ☆18 Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆79 Updated 7 months ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆65 Updated 2 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 Updated 3 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 3 years ago
- UmBERTo: an Italian Language Model trained with Whole Word Masking. ☆104 Updated 2 years ago
- ☆15 Updated 7 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 Updated 3 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Use BERT to Fill in the Blanks ☆82 Updated 3 years ago
- A collection of Italian benchmarks for LLM evaluation ☆26 Updated 2 months ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 Updated 9 months ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- Fine-tune transformers with pytorch-lightning ☆44 Updated 2 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated 8 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 8 months ago
- An open information extraction system that provides compact extractions ☆91 Updated 2 years ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆43 Updated last year
- Language Modelling Makes Sense - WSD (and more) with Contextual Embeddings ☆95 Updated last year
- Sentence transformers models for SpaCy ☆107 Updated last year
- ☆64 Updated 2 years ago
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆62 Updated 9 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆67 Updated 2 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆88 Updated 4 years ago
- ☆33 Updated 3 years ago
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 Updated 6 months ago