FUZHIYI / TACO
Code for the ACL 2022 paper "Contextual Representation Learning beyond Masked Language Modeling"
☆33 · Oct 23, 2022 · Updated 3 years ago
Alternatives and similar repositories for TACO
Users who are interested in TACO are comparing it to the libraries listed below
- [NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe… ☆15 · Oct 18, 2022 · Updated 3 years ago
- Official Code for 'EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification' - NAACL 2022 ☆23 · May 9, 2022 · Updated 3 years ago
- Dataset and baseline for Coling 2022 long paper (oral): "ConFiguRe: Exploring Discourse-level Chinese Figures of Speech" ☆13 · Jul 27, 2023 · Updated 2 years ago
- [NAACL 2022] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding ☆10 · Jul 15, 2023 · Updated 2 years ago
- GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/ ☆13 · Jul 1, 2024 · Updated last year
- ☆14 · Feb 3, 2021 · Updated 5 years ago
- German Language Understanding Evaluation Benchmark @NAACL24 ☆22 · Dec 11, 2025 · Updated 2 months ago
- PathPiece tokenizer ☆13 · Nov 10, 2024 · Updated last year
- A self-updating GitHub profile 🐯 ☆15 · Feb 6, 2026 · Updated last week
- Tianchi "Public Welfare AI Star" Challenge: COVID-19 Similar Sentence Pair Judgment Competition ☆16 · Apr 12, 2020 · Updated 5 years ago
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models ☆16 · Sep 13, 2021 · Updated 4 years ago
- [EMNLP 2021] Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning ☆17 · Jun 28, 2025 · Updated 7 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆48 · Oct 20, 2025 · Updated 3 months ago
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data. ☆22 · Nov 10, 2024 · Updated last year
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models ☆35 · Oct 1, 2025 · Updated 4 months ago
- ☆80 · Jul 11, 2022 · Updated 3 years ago
- Tool for the Automatic Analysis of Syntactic Sophistication and Complexity ☆31 · Nov 4, 2023 · Updated 2 years ago
- Convert PyTorch BERT weights to TensorFlow ☆22 · May 19, 2020 · Updated 5 years ago
- Natural Universal Trigger Search (NUTS) ☆21 · Apr 17, 2021 · Updated 4 years ago
- ACL 2022 (Findings): A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings ☆18 · Mar 23, 2022 · Updated 3 years ago
- My NER Experiments with ModernBERT and Ettin ☆26 · Jul 17, 2025 · Updated 6 months ago
- quica is a tool to run inter coder agreement pipelines in an easy and effective ways. Multiple measures are run and results are collected… ☆23 · Nov 9, 2020 · Updated 5 years ago
- Multilingual Open Text ☆25 · May 8, 2025 · Updated 9 months ago
- Supervised Contrastive Learning for Downstream Optimized Sequence Representations ☆26 · Nov 9, 2021 · Updated 4 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 · Nov 30, 2024 · Updated last year
- Small python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated last year
- German Parliamentary Corpus (GerParCor) ☆27 · Jan 14, 2026 · Updated last month
- ☆30 · Feb 3, 2026 · Updated last week
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆28 · Oct 3, 2021 · Updated 4 years ago
- Multi-sense word embeddings from visual co-occurrences ☆25 · Sep 5, 2019 · Updated 6 years ago
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition" ☆32 · Jun 20, 2023 · Updated 2 years ago
- Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings". ☆26 · Mar 10, 2025 · Updated 11 months ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 · Apr 2, 2022 · Updated 3 years ago
- Code for paper Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval, Accepted by ACL2022 Main Conference, Long Paper ☆30 · Mar 12, 2022 · Updated 3 years ago
- A sample Java gRPC client for the Salesforce Pub/Sub API ☆12 · Oct 9, 2024 · Updated last year
- Source code for "UniRE: A Unified Label Space for Entity Relation Extraction.", ACL2021. It is based on our NERE toolkit (https://github.… ☆122 · Apr 13, 2022 · Updated 3 years ago
- ☆37 · Sep 22, 2021 · Updated 4 years ago
- Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings" ☆296 · Oct 27, 2022 · Updated 3 years ago
- CCQA A New Web-Scale Question Answering Dataset for Model Pre-Training ☆32 · Jul 20, 2022 · Updated 3 years ago