We provide benchmark datasets for evaluating Vietnamese language processing models: UIT-ViQuAD, ViNewsQA, UIT-VSFC, UIT-ViIC, UIT-ViNames, UIT-VSMEC and ViMMRC.
☆23Jun 19, 2021Updated 5 years ago
Alternatives and similar repositories for VietnameseDatasets
Users that are interested in VietnameseDatasets are comparing it to the libraries listed below.
- Machine Reading Comprehension has attracted significant interest in research on natural language understanding, and large-scale datasets …☆10Aug 14, 2021Updated 4 years ago
- ☆18Oct 15, 2021Updated 4 years ago
- VIMQA dataset☆14Jul 6, 2022Updated 3 years ago
- Anyone can build their own chatbot with instruction tuning, using a 12 GB GPU (RTX 3060) and a few dozen MB of data☆113Jun 10, 2023Updated 3 years ago
- A collection of Vietnamese Natural Language Processing resources.☆314Oct 28, 2025Updated 8 months ago
- DS310.M11 - Natural Language Processing for Data Science☆18Mar 4, 2022Updated 4 years ago
- Building a 500GB dataset (20% done) of Vietnamese text for training large language models☆29Apr 7, 2023Updated 3 years ago
- Implementation of the DocLLM paper for Llama models.☆13Apr 6, 2025Updated last year
- The project includes: 1. Building a Vietnamese instructions dataset (high-quality, large, and diverse). 2. LLM Training, Finetuning, Evaluating & Testin…☆283Sep 1, 2025Updated 10 months ago
- Speaker overlap-aware Neural Diarization☆12Feb 13, 2023Updated 3 years ago
- Sherpa-onnx-tts-stt source for a Home Assistant add-on with Kroko ONNX streaming STT integration.☆30Dec 18, 2025Updated 6 months ago
- Vietnamese text classification using a pretrained model - PhoBERT☆12Feb 1, 2021Updated 5 years ago
- Adaptation datasets and scripts for the paper "Reducing gender bias in Neural Machine Translation as a domain adaptation problem" (ACL 20…☆13Mar 18, 2021Updated 5 years ago
- ☆21Jun 13, 2019Updated 7 years ago
- Ensemble PhoBERT with FastText Embedding to improve performance on Vietnamese Sentiment Analysis tasks.☆17Jun 29, 2023Updated 3 years ago
- A chrome extension to toggle subtitles using keyboard shortcut (C)☆10Jul 4, 2025Updated last year
- ☆21Nov 19, 2023Updated 2 years ago
- Machine Reading Comprehension specifically for the Vietnamese language☆41Mar 13, 2022Updated 4 years ago
- Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies regard classification and bias mitigation triggers.☆16Sep 25, 2024Updated last year
- ☆17Jul 10, 2022Updated 3 years ago
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)☆25Jun 5, 2025Updated last year
- ☆19Jun 28, 2022Updated 4 years ago
- ☆52Sep 3, 2025Updated 10 months ago
- This is sample source code for Reinforcement Learning Competition, hosted by FPT-Software (Hanoi, Vietnam). The game is Gold Miner.☆27Sep 25, 2020Updated 5 years ago
- ☆27Feb 18, 2025Updated last year
- TP - AI Project S2T1, DSAI HUST☆19Jun 7, 2022Updated 4 years ago
- This repo provides Geometric LayoutLM for Vietnamese document and code for export to ONNX☆14Mar 3, 2024Updated 2 years ago
- Open-source software that lets anyone use ChatGPT, and more, directly on their own computer.☆34Apr 5, 2023Updated 3 years ago
- BFloat16 Fused Adam Operator for PyTorch☆20Nov 16, 2024Updated last year
- ☆23Mar 20, 2024Updated 2 years ago
- Vietnamese corpus☆385Oct 3, 2025Updated 9 months ago
- This repository holds the code for my master's thesis entitled "The Association of Gender Bias with BERT - Measuring, Mitigating and Cross-…☆18Sep 19, 2022Updated 3 years ago
- QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]☆27Jul 16, 2021Updated 4 years ago
- PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021)☆52Jun 3, 2025Updated last year
- Custom ML tracking experiment and debugging tools.☆15Aug 2, 2022Updated 3 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project☆24Jul 14, 2022Updated 3 years ago
- ChatGPT solutions for the MLE interview☆14Dec 9, 2022Updated 3 years ago
- alm0n for UET's viewgrade☆16Feb 7, 2023Updated 3 years ago
- ☆10Jul 12, 2019Updated 6 years ago