ikergarcia1996 / T-Projection
T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.
☆12 Updated last year
Alternatives and similar repositories for T-Projection:
Users who are interested in T-Projection are comparing it to the libraries listed below.
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding" ☆14 Updated 2 years ago
- ☆22 Updated 2 months ago
- GC4LM: A Colossal (Biased) language model for German ☆13 Updated 3 years ago
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆23 Updated last month
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆86 Updated last week
- The dataset and code for ACL 2022 paper "SciNLI: A Corpus for Natural Language Inference on Scientific Text" are released here. ☆27 Updated last year
- Semantically Structured Sentence Embeddings ☆65 Updated 5 months ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 Updated 2 years ago
- ☆26 Updated last month
- Code for SaGe subword tokenizer (EACL 2023) ☆24 Updated 4 months ago
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models" ☆14 Updated 2 years ago
- ☆19 Updated 2 years ago
- PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations… ☆19 Updated 2 years ago
- ParaNames: A multilingual resource for parallel names ☆31 Updated 10 months ago
- MultiCite code and data. Models are available on Huggingface. ☆31 Updated 2 years ago
- LTG-Bert ☆32 Updated last year
- GSRL is a seq2seq model for end-to-end dependency- and span-based SRL (IJCAI2021). ☆18 Updated 3 years ago
- ☆17 Updated last year
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆19 Updated 2 months ago
- A survey of corpora for Germanic low-resource languages and dialects ☆25 Updated 4 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆57 Updated 8 months ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆32 Updated last year
- Corpus exploration platform using advanced tools such as interactive summarization and multi document coreference resolution ☆12 Updated last year
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23) ☆16 Updated 9 months ago
- Automatically detect errors in annotated corpora. ☆47 Updated last year
- Code and data for "Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words" ☆16 Updated 3 years ago
- ☆13 Updated 3 years ago
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 Updated 3 years ago