symato / physics_of_llms
Experiments related to LLMs for Vietnamese (inspired by the Physics of LLMs series)
☆11Updated last year
Alternatives and similar repositories for physics_of_llms
Users who are interested in physics_of_llms are comparing it to the libraries listed below
- ☆73Updated last year
- Pioneering in Vietnamese Multimodal Large Language Model☆53Updated 9 months ago
- The largest VQA dataset for Vietnamese. Related to the text content in the image.☆21Updated 7 months ago
- VNHSGE: Vietnamese High School Graduation Examination Dataset for Large Language Models☆28Updated 2 years ago
- ViDeBERTa: A powerful pre-trained language model for Vietnamese, EACL 2023☆57Updated 2 years ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback☆97Updated 2 years ago
- Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model.☆23Updated last year
- PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021)☆46Updated 5 months ago
- ☆16Updated 3 years ago
- Anyone can build their own chatbot with instruction tuning, using a 12 GB GPU (RTX 3060) and a few dozen MB of data☆113Updated 2 years ago
- Baseline for ZaloAI Challenge 2023 Elementary Math Solving☆70Updated last year
- Machine Reading Comprehension specifically for the Vietnamese language☆42Updated 3 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆78Updated last year
- BERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021)☆87Updated last year
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)☆25Updated 2 years ago
- A dataset for Vietnamese Spelling Correction☆15Updated 4 years ago
- BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese (INTERSPEECH 2022)☆104Updated last year
- Vietnamese long form question answering system with documents retrieval.☆21Updated last year
- 👨🏻‍💻 Code release for Vietnamese chatbot from scratch [Published in IEEE IMCOM 2022]☆17Updated last year
- This is the official repository for Vista dataset - A Vietnamese multimodal dataset contains more than 700,000 samples of conversations a…☆26Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated 2 years ago
- We finetune Bloomz-7b1-mt using LoRA with the chatdoctor-200k dataset at here https://huggingface.co/LinhDuong/doctorwithbloomz-7b1-mt an…☆30Updated 2 years ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆65Updated last year
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024)☆36Updated last year
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- ☆71Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆69Updated last year
- Use the LoRA technique to improve the training of large language models☆12Updated 2 years ago
- ViText2SQL: A dataset for Vietnamese Text-to-SQL semantic parsing (EMNLP-2020 Findings)☆36Updated last year
- Building a 500GB dataset (20% done) of Vietnamese text for training large language models☆29Updated 2 years ago