w11wo/nlp-datasets

A collection of various NLP datasets, mainly for Indonesia-related languages.

☆15

Alternatives and similar repositories for nlp-datasets

Users who are interested in nlp-datasets are comparing it to the repositories listed below.

falaktheoptimist / awesome-ml-resources
Collection of links to blogs/ resources on various ML topics
☆14 · Jun 15, 2022 · Updated 4 years ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
rifkybujana / IndoBERT-QA
indoBERT Base-Uncased fine-tuned on Translated Squad v2.0
☆19 · Dec 24, 2024 · Updated last year
Parkchanjun / OpenNMT-Colab-Tutorial
OpenNMT Colab Tutorial Pytorch && Tensorflow
☆32 · Nov 18, 2019 · Updated 6 years ago
omotolani12 / Building-an-Advanced-RAG-Chatbot-with-Knowledge-Graphs
☆12 · Jun 12, 2024 · Updated 2 years ago
DAMO-NLP-SG / AdamergeX
☆11 · Apr 2, 2024 · Updated 2 years ago
IndoNLP / nusax
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper…
☆116 · May 8, 2023 · Updated 3 years ago
himkt / allennlp-NER
☯️ AllenNLP training configurations for promising models on Named Entity Recognition. (BiLSTM-CRF, BiLSTM-CNN-CRF, BERT, BERT-CRF)
☆15 · Nov 26, 2020 · Updated 5 years ago
khavitidala / WhatsApp-ChatBot-of-Al-Qur-an
The Qur'an packaged in the form of a chatbot
☆15 · Dec 1, 2020 · Updated 5 years ago
aikindergarten / fasthugs
Training HuggingFace models using fastai
☆11 · Jul 22, 2021 · Updated 5 years ago
obedtandadjaja / Coffee-Shop-Cashier
A simple Java cashier program (with touch-screen GUI)
☆11 · Jan 22, 2023 · Updated 3 years ago
li-plus / nanoRLHF
Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)
☆18 · May 23, 2024 · Updated 2 years ago
Muhammad-Yunus / Belajar-OpenCV-ObjectDetection
Learning OpenCV Object Detection
☆18 · Nov 14, 2025 · Updated 8 months ago
MagicDash91 / All-of-Data-Science-Project
This repository is for all of my Data Science Project and Portfolio
☆16 · Apr 24, 2025 · Updated last year
louisowen6 / quora_paraphrasing_id
Quora Paraphrasing Dataset Bahasa Indonesia Version
☆11 · Apr 18, 2021 · Updated 5 years ago
Harry-Chan / seq2seqlm-on-qg
☆13 · Feb 9, 2022 · Updated 4 years ago
purvanshi / operation-prediction
Takes a question predicts the operation between them
☆12 · Jan 2, 2018 · Updated 8 years ago
hppRC / defsent
DefSent: Sentence Embeddings using Definition Sentences
☆23 · Aug 5, 2021 · Updated 4 years ago
davidswelt / dmvccm
DMV/CCM implementation
☆17 · Jul 14, 2016 · Updated 10 years ago
ku-nlp / bertknp
A Japanese dependency parser based on BERT
☆23 · Oct 26, 2022 · Updated 3 years ago
cythnn / cythnn
☆16 · Jun 5, 2016 · Updated 10 years ago
gentaiscool / indonesian-nlp
A curated list of research papers and resources on Indonesian languages
☆41 · Mar 21, 2024 · Updated 2 years ago
Liuxg16 / UPSA
☆12 · Feb 21, 2021 · Updated 5 years ago
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
LuminosityX / MM-Forecast
Implementation of our paper, "MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models".
☆18 · Apr 16, 2025 · Updated last year
pv / pythoncall
Python embedded inside Matlab
☆15 · Nov 13, 2009 · Updated 16 years ago
wtsnjp / MioGatto
An annotation tool for grounding of formulae
☆24 · May 28, 2024 · Updated 2 years ago
AhlemGit / Arabic-WordNet-To-SQLite
This repository is about how to build an SQLite version of the Arabic WordNet database.
☆11 · Mar 19, 2019 · Updated 7 years ago
kangfend / bahasa
Natural language toolkit for Indonesian Language (Bahasa)
☆20 · May 16, 2024 · Updated 2 years ago
ionelmc / django-admin-utils
Utility code for easier django admin development
☆24 · Jan 26, 2026 · Updated 6 months ago
ischlag / TP-Transformer
☆49 · Mar 8, 2021 · Updated 5 years ago
uetchy / badge-api
Extra badges for App Store, Product Hunt and Hatena bookmarks
☆11 · Sep 21, 2023 · Updated 2 years ago
PicoCreator / RWKV-LM-LoRA
RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …
☆10 · Nov 3, 2023 · Updated 2 years ago
infocusp / scaLR
Single cell analysis using Low Resource
☆20 · Apr 3, 2026 · Updated 3 months ago
primus852 / stock-news
Scrape financial News from Yahoo and analyse the sentiment (PoC)
☆20 · Jul 16, 2019 · Updated 7 years ago
abhikjha / Fastai-integration-with-BERT
Step wise instructions to integrate the power of BERT with Fastai
☆18 · Jul 17, 2019 · Updated 7 years ago
yoichi1484 / subspace
An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)
☆10 · May 31, 2024 · Updated 2 years ago
mrrizal / POS_Tag_Indonesian
POS Tag for Indonesian language
☆18 · Dec 24, 2016 · Updated 9 years ago
janzheng / sidenote
Chrome extension that helps you analyze and summarize web content
☆27 · Mar 27, 2026 · Updated 3 months ago