Building a 500GB dataset (20% done) of Vietnamese text for training large language models
☆29Apr 7, 2023Updated 2 years ago
Alternatives and similar repositories for vi
Users that are interested in vi are comparing it to the libraries listed below
- ☆18Dec 20, 2023Updated 2 years ago
- Anyone can build their own chatbot with instruction tuning, using a 12GB GPU (RTX 3060) and a few tens of MB of data☆114Jun 10, 2023Updated 2 years ago
- A dataset for Vietnamese Spelling Correction☆15Sep 27, 2021Updated 4 years ago
- The project includes: 1. Building a Vietnamese instruction dataset (high-quality, large, and diverse). 2. LLM Training, Finetuning, Evaluating & Testin…☆277Sep 1, 2025Updated 6 months ago
- Custom ML tracking experiment and debugging tools.☆15Aug 2, 2022Updated 3 years ago
- We provide benchmark datasets for evaluating Vietnamese processing models: UIT-ViQuAD, ViNewsQA, UIT-VSFC, UIT-ViIC, UIT-ViNames, UIT-VSM…☆20Jun 19, 2021Updated 4 years ago
- ☆33May 15, 2024Updated last year
- ☆25Aug 28, 2024Updated last year
- Solution for Zalo AI Challenge 2022 - E2E Question Answering☆110Dec 25, 2022Updated 3 years ago
- We finetune Bloomz-7b1-mt using LoRA with the chatdoctor-200k dataset here: https://huggingface.co/LinhDuong/doctorwithbloomz-7b1-mt an…☆30Apr 4, 2023Updated 2 years ago
- This is a simple App that connects the server (FastAPI - FastAPI - Flask) using micro-services architecture and domain-driven design.☆13May 3, 2023Updated 2 years ago
- Sentence Embeddings with BERT & XLNet☆27Aug 23, 2020Updated 5 years ago
- PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)☆773Jul 23, 2024Updated last year
- Open-source software that lets anyone use ChatGPT and more directly on their own computer.☆34Apr 5, 2023Updated 2 years ago
- ☆75Feb 6, 2023Updated 3 years ago
- Connects your code into an intelligent network for perfect context.☆36Updated this week
- Paraphrase sentence☆11Aug 22, 2025Updated 6 months ago
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation☆35Feb 27, 2024Updated 2 years ago
- PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)☆150Dec 31, 2024Updated last year
- Learn how to combine Nginx + wigs + load balancing + flask + unit testing + Docker☆12Jun 2, 2021Updated 4 years ago
- Machine Reading Comprehension has attracted significant interest in research on natural language understanding, and large-scale datasets …☆10Aug 14, 2021Updated 4 years ago
- Agentiqs.ai mcp-kit☆23Aug 2, 2025Updated 7 months ago
- A roadmap to becoming a Python developer☆10Aug 3, 2023Updated 2 years ago
- Initial notes on Telegram bots☆11Jun 11, 2018Updated 7 years ago
- ☆14Jun 6, 2025Updated 8 months ago
- Pre-trained Word2Vec models for Vietnamese☆160Dec 30, 2020Updated 5 years ago
- This repo has scripts to compare various powerful RL methods☆33Feb 23, 2026Updated last week
- ☆10Mar 31, 2022Updated 3 years ago
- ☆10Jul 12, 2019Updated 6 years ago
- Seq2seq using LSTM with attention from Luong et al☆10Oct 2, 2018Updated 7 years ago
- MATS: A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback☆14Dec 27, 2025Updated 2 months ago
- Food Order Android App using Firebase☆12Oct 1, 2020Updated 5 years ago
- ☆28Sep 10, 2025Updated 5 months ago
- Copy My Writing is a command-line tool for generating content based on your personal writing style.☆11Oct 12, 2025Updated 4 months ago
- Multiclass and multilabel classification of ECG signals using various deep learning models.☆11Nov 22, 2020Updated 5 years ago
- ChatGPT solutions for the MLE interview☆14Dec 9, 2022Updated 3 years ago
- I am a computer science graduate from Abdul Wali Khan University Mardan. Here I have learned more about web development and software deve…☆10Jul 6, 2023Updated 2 years ago
- Top 9 private leaderboard & Top 17 public leaderboard☆10Dec 1, 2022Updated 3 years ago
- Transformer OCR☆752Jan 19, 2025Updated last year