Vietnamese text data crawler scripts for various sites (including YouTube, Facebook, forums, news, ...)
☆75 Oct 25, 2022 Updated 3 years ago
Alternatives and similar repositories for social-scraper
Users that are interested in social-scraper are comparing it to the libraries listed below
- Building a 500GB dataset (20% done) of Vietnamese text for training large language models ☆29 Apr 7, 2023 Updated 2 years ago
- Pre-trained Word2Vec models for Vietnamese ☆160 Dec 30, 2020 Updated 5 years ago
- Web scraping framework with support for JavaScript rendering and multitasking crawls ☆48 Jul 6, 2022 Updated 3 years ago
- Fine-tune multiple pre-trained Transformer-based models to solve the Vietnamese Fake News Detection problem (ReINTEL) in the VLSP2020 shared task ☆18 Dec 16, 2020 Updated 5 years ago
- Sentiment classification for Vietnamese text using PhoBERT ☆99 Nov 16, 2020 Updated 5 years ago
- ntc-scv is a dataset of blogs from the website https://streetcodevn.com ☆26 Oct 21, 2021 Updated 4 years ago
- Tutorial on text classification using several machine learning algorithms ☆10 Aug 8, 2020 Updated 5 years ago
- Vietnamese text normalization library ☆180 May 26, 2025 Updated 9 months ago
- Vietnamese sensitive words (including teencode), created by an ML algorithm ☆67 Jan 13, 2021 Updated 5 years ago
- ☆11 May 11, 2021 Updated 4 years ago
- ☆10 Dec 10, 2018 Updated 7 years ago
- ☆15 Jun 12, 2023 Updated 2 years ago
- Machine Learning Project Template - Ready for production ☆101 Dec 13, 2022 Updated 3 years ago
- Number conversion library for Vietnamese: converts numbers to text (words) and text (words) to numbers ☆77 Jan 2, 2026 Updated 2 months ago
- Vietnamese Punctuation Prediction using Pretrained Language Models ☆14 May 8, 2022 Updated 3 years ago
- ☆12 Oct 6, 2024 Updated last year
- Vietnamese Human-based Text-to-Speech ☆13 Sep 9, 2012 Updated 13 years ago
- Tool for crawling and analyzing keywords from Vietnamese online newspapers ☆266 May 22, 2023 Updated 2 years ago
- A Large-scale Vietnamese News Text Classification Corpus ☆109 Sep 24, 2019 Updated 6 years ago
- Sentence Embeddings with BERT & XLNet ☆27 Aug 23, 2020 Updated 5 years ago
- To collect and promote FOSS projects started by and contributed to by Vietnamese ☆12 Sep 24, 2018 Updated 7 years ago
- Tool for looking up the Sino-Vietnamese (Hán Việt) dictionary from the terminal ☆12 Jan 27, 2018 Updated 8 years ago
- ☆27 Jan 17, 2022 Updated 4 years ago
- Vietnamese self-supervised Wav2vec2 model ☆61 Nov 5, 2022 Updated 3 years ago
- My dojo for learning and training machine learning skills ☆13 Mar 9, 2016 Updated 9 years ago
- Vietnamese corpus ☆385 Oct 3, 2025 Updated 5 months ago
- Repository to track the progress in Vietnamese Natural Language Processing, including the datasets and the current state-of-the-art for t… ☆370 Sep 5, 2022 Updated 3 years ago
- Source code for Zalo AI 2021 submission ☆142 Dec 20, 2021 Updated 4 years ago
- Code that searches for code 😗 ☆32 May 6, 2020 Updated 5 years ago
- PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021) ☆150 Dec 31, 2024 Updated last year
- Project to share NLP algorithms ☆65 Oct 27, 2018 Updated 7 years ago
- Solution for MC_OCR competition ☆95 Mar 7, 2023 Updated 2 years ago
- Repository for Deep Learning study group at FTI (FPT) ☆17 Jan 30, 2018 Updated 8 years ago
- VietConizer: Vietnamese OCR with NVIDIA DALI ☆16 Jul 5, 2025 Updated 8 months ago
- ☆16 Jun 17, 2021 Updated 4 years ago
- Custom ML tracking experiment and debugging tools. ☆15 Aug 2, 2022 Updated 3 years ago
- A book focused on how to structure machine learning projects and on analyzing how to make machine learning algorithms work. ☆1,085 Oct 13, 2021 Updated 4 years ago
- A toolkit for processing Vietnamese texts ☆16 Oct 20, 2022 Updated 3 years ago
- Zalo AI Challenge 2020: News Summarization - Runner-up solution ☆20 Dec 4, 2020 Updated 5 years ago