w11wo / nlp-datasets
A collection of various NLP datasets, mainly Indonesia-related languages.
☆15 · Apr 23, 2022 · Updated 3 years ago
Alternatives and similar repositories for nlp-datasets
Users who are interested in nlp-datasets are comparing it to the libraries listed below
- ☆11 · Aug 26, 2021 · Updated 4 years ago
- ☯️ AllenNLP training configurations for promising models on Named Entity Recognition. (BiLSTM-CRF, BiLSTM-CNN-CRF, BERT, BERT-CRF) ☆15 · Nov 26, 2020 · Updated 5 years ago
- DefSent: Sentence Embeddings using Definition Sentences ☆22 · Aug 5, 2021 · Updated 4 years ago
- DMV/CCM implementation ☆17 · Jul 14, 2016 · Updated 9 years ago
- A Japanese dependency parser based on BERT ☆23 · Oct 26, 2022 · Updated 3 years ago
- ☆19 · May 23, 2024 · Updated last year
- An annotation tool for grounding of formulae ☆24 · May 28, 2024 · Updated last year
- benchmarks for LLM tokenizers ☆16 · Jan 15, 2026 · Updated 3 weeks ago
- NLP Datasets for Indonesian ☆126 · Feb 11, 2023 · Updated 3 years ago
- Collection of links to blogs/ resources on various ML topics ☆13 · Jun 15, 2022 · Updated 3 years ago
- This repository is about how to build an SQLite version of the Arabic WordNet database. ☆10 · Mar 19, 2019 · Updated 6 years ago
- A curated list of research papers and resources on Indonesian languages ☆40 · Mar 21, 2024 · Updated last year
- OpenNMT Colab Tutorial Pytorch && Tensorflow ☆31 · Nov 18, 2019 · Updated 6 years ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 · Dec 19, 2022 · Updated 3 years ago
- 🍺 a Homebrew keg that specialized in Natural Language Processing. ☆22 · May 23, 2018 · Updated 7 years ago
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 · Nov 3, 2023 · Updated 2 years ago
- Node.js wrapper for the GLTF2Loader library from Three.js ☆10 · Nov 8, 2017 · Updated 8 years ago
- Vector Symbolic Architecture library ☆11 · Mar 27, 2023 · Updated 2 years ago
- A set of base classes in order to perform training scripts for Neural Networks (by means of SNNS) and SVM (by means of SVM Light and SVM … ☆14 · Jun 24, 2011 · Updated 14 years ago
- [AAAI'23] FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction https://arxiv.org/abs/2304.00902 ☆10 · Apr 9, 2023 · Updated 2 years ago
- MG top-down beam parsing ☆13 · Jul 2, 2018 · Updated 7 years ago
- ☆10 · Jun 4, 2020 · Updated 5 years ago
- No Gabut Challenge Submission ☆10 · Mar 2, 2021 · Updated 4 years ago
- Named Entity (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including: morpheme and token level NER labels, nested … ☆10 · Dec 27, 2021 · Updated 4 years ago
- Implementation examples using Chainer. ☆11 · Apr 7, 2018 · Updated 7 years ago
- Performs tasks together with GPT. ☆13 · Apr 4, 2023 · Updated 2 years ago
- A collection of English tweets annotated in Universal Dependencies. ☆39 · Oct 20, 2021 · Updated 4 years ago
- A simple script to convert Latin script into Javanese script ☆35 · May 2, 2023 · Updated 2 years ago
- Japanese BERT trained on Aozora Bunko and Wikipedia, pre-tokenized by MeCab with UniDic & SudachiPy ☆40 · Aug 8, 2020 · Updated 5 years ago
- Deep learning for named entity recognition on CoNLL-2003 ☆10 · Dec 23, 2016 · Updated 9 years ago
- Extra badges for App Store, Product Hunt and Hatena bookmarks ☆11 · Sep 21, 2023 · Updated 2 years ago
- ☆11 · Feb 11, 2020 · Updated 6 years ago
- ☆10 · Mar 20, 2021 · Updated 4 years ago
- An experiment with movie scenes and contrastive learning ☆11 · Feb 1, 2025 · Updated last year
- 🎯 Speech Recognition Challenge by Speech Lab - IIT Madras ☆11 · Nov 5, 2020 · Updated 5 years ago
- Behavioral probing of language acquisition models at the lexical and syntactic level ☆17 · Jul 17, 2023 · Updated 2 years ago
- A home for your agents' skills. Create, manage, share skills between agents easily. ☆17 · Updated this week
- Transform audio files into mel spectrograms for text-to-speech model training ☆11 · Aug 25, 2021 · Updated 4 years ago
- ☆12 · Jul 5, 2023 · Updated 2 years ago