Vietnamese text data crawler scripts for various sites (including YouTube, Facebook, forums, news, ...)
☆75Oct 25, 2022Updated 3 years ago
Alternatives and similar repositories for social-scraper
Users that are interested in social-scraper are comparing it to the libraries listed below.
- Building a 500GB dataset (20% done) of Vietnamese text for training large language models☆29Apr 7, 2023Updated 3 years ago
- Sentiment classification for Vietnamese text using PhoBert☆98Nov 16, 2020Updated 5 years ago
- ntc-scv is a dataset of blogs from the website https://streetcodevn.com☆27Oct 21, 2021Updated 4 years ago
- Pre-trained Word2Vec models for Vietnamese☆161Dec 30, 2020Updated 5 years ago
- Vietnamese sensitive words (including teencode), generated by an ML algorithm☆67Jan 13, 2021Updated 5 years ago
- A Large-scale Vietnamese News Text Classification Corpus☆107Sep 24, 2019Updated 6 years ago
- Improving Elasticsearch for semantic search using sentence embeddings☆25May 27, 2021Updated 4 years ago
- ☆10Dec 10, 2018Updated 7 years ago
- Vietnamese self-supervised Wav2vec2 model☆61Nov 5, 2022Updated 3 years ago
- Tool for crawling Vietnamese online news sites and analyzing their keywords☆265May 22, 2023Updated 2 years ago
- Web data scraping framework with support for JavaScript rendering and concurrent crawling☆48Jul 6, 2022Updated 3 years ago
- ☆35Aug 1, 2024Updated last year
- Vietnamese text normalization library☆181May 26, 2025Updated 10 months ago
- Machine Learning Project Template - Ready to production☆101Dec 13, 2022Updated 3 years ago
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 3 years ago
- Vietnamese Human-based Text-to-Speech☆13Sep 9, 2012Updated 13 years ago
- PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)☆150Dec 31, 2024Updated last year
- Repository to track the progress in Vietnamese Natural Language Processing, including the datasets and the current state-of-the-art for t…☆372Sep 5, 2022Updated 3 years ago
- Sentence Embeddings with BERT & XLNet☆27Aug 23, 2020Updated 5 years ago
- Source code for Zalo AI 2021 submission☆142Dec 20, 2021Updated 4 years ago
- ☆63Oct 19, 2021Updated 4 years ago
- ☆12Oct 6, 2024Updated last year
- ☆48Dec 13, 2019Updated 6 years ago
- Vietnamese corpus☆383Oct 3, 2025Updated 6 months ago
- Anyone can build their own chatbot through instruction tuning, with a 12GB GPU (RTX 3060) and a few dozen MB of data☆114Jun 10, 2023Updated 2 years ago
- Finetune multiple pre-trained Transformer-based models to solve Vietnamese Fake News Detection problem (ReINTEL) in VLSP2020 shared task☆18Dec 16, 2020Updated 5 years ago
- Includes Vietnamese stop words, Vietnamese person names, Vietnam GIS (Geographic Information System) data, a Vietnamese dictionary, ...☆14Oct 18, 2017Updated 8 years ago
- VietConizer: Vietnamese OCR with NVIDIA DALI☆16Jul 5, 2025Updated 9 months ago
- Advanced PDF parsing for Python☆12Jan 21, 2025Updated last year
- Project to share nlp algorithms☆65Oct 27, 2018Updated 7 years ago
- ☆16Jun 17, 2021Updated 4 years ago
- Library for converting numbers to Vietnamese number words.☆21Oct 16, 2021Updated 4 years ago
- PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)☆776Jul 23, 2024Updated last year
- Code that finds code 😗☆32May 6, 2020Updated 5 years ago
- ViText2SQL: A dataset for Vietnamese Text-to-SQL semantic parsing (EMNLP-2020 Findings)☆36Jul 22, 2024Updated last year
- Vietnamese OCR☆140Apr 28, 2019Updated 6 years ago
- Tool for looking up the Sino-Vietnamese (Hán Việt) dictionary from the terminal☆12Jan 27, 2018Updated 8 years ago
- This repository provides some useful snippets that you may need in some situations.☆15Mar 10, 2026Updated last month
- Vietnamese speech recognition using Wavenet☆73Feb 2, 2023Updated 3 years ago