aisingapore/sealion

South-East Asia Large Language Models

☆419

Alternatives and similar repositories for sealion

Users who are interested in sealion are comparing it to the repositories listed below.

DAMO-NLP-SG / DAMO-SeaLLMs
[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia
☆175 · Jul 30, 2024 · Updated last year
SEACrowd / seacrowd-datahub
A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
☆104 · Mar 16, 2026 · Updated 4 months ago
sail-sg / sailor-llm
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
☆139 · Dec 21, 2024 · Updated last year
UniversalDependencies / UD_Thai-PUD
Parallel Universal Dependencies.
☆15 · May 6, 2026 · Updated 2 months ago
ProtonX-AI-for-Devs-01 / quang-le-vietnamese-rag
☆13 · Oct 6, 2024 · Updated last year
VinAIResearch / PhoGPT
PhoGPT: Generative Pre-training for Vietnamese (2023)
☆791 · Nov 12, 2024 · Updated last year
telexyz / vi
Building a 500GB dataset (20% done) of Vietnamese text for training large language models
☆29 · Apr 7, 2023 · Updated 3 years ago
aisingapore / SEA-HELM
A comprehensive evaluation framework for the SEA region
☆32 · Updated this week
yzyouzhang / SASV_PR
Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"
☆18 · Jun 24, 2022 · Updated 4 years ago
IndoNLP / nusa-crowd
A collaborative project to collect datasets in Indonesian languages.
☆287 · Jun 2, 2024 · Updated 2 years ago
vistec-AI / WangchanX
WangchanX Fine-tuning Pipeline
☆46 · Oct 4, 2024 · Updated last year
nup-csai / Qtok
☆16 · Oct 17, 2024 · Updated last year
aisingapore / sgnlp
Machine learning models from Singapore's NLP research community
☆37 · Apr 4, 2023 · Updated 3 years ago
Telegram-Zalo / zac2022-e2e-qa
Solution for Zalo AI Challenge 2022 - E2E Question Answering
☆109 · Dec 25, 2022 · Updated 3 years ago
CaoHaiNam / Vietnamese-Address-Standardization
RIVF 2021: Deep neural network based learning to rank for address standardization
☆10 · Jul 13, 2024 · Updated 2 years ago
v-nhandt21 / ViMFA
Montreal Forced Aligner for Vietnamese
☆15 · Oct 23, 2023 · Updated 2 years ago
langmaninternet / VietnameseTextNormalizer
Vietnamese text normalization library
☆180 · May 26, 2025 · Updated last year
Oztobuzz / Vista
This is the official repository for Vista dataset - A Vietnamese multimodal dataset contains more than 700,000 samples of conversations a…
☆26 · May 14, 2024 · Updated 2 years ago
cstorm125 / esninja
Best practices for product search in English and Thai using Elasticsearch
☆14 · Mar 16, 2021 · Updated 5 years ago
mrpeerat / CL-ReLKT
The implementation of CL-ReLKT (NAACL-2022)
☆14 · Aug 31, 2022 · Updated 3 years ago
thangnch / MiAI_Langchain_RAG
Demo of building a RAG application with LangChain
☆29 · Jan 15, 2024 · Updated 2 years ago
KornWtp / ConGen
Implementation of ConGen: Unsupervised Control and Generalization Distillation For Sentence Representation (Findings of EMNLP 2022).
☆22 · Sep 13, 2023 · Updated 2 years ago
mrpeerat / SCT
SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)
☆16 · Jul 27, 2024 · Updated 2 years ago
VinAIResearch / BARTpho
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese (INTERSPEECH 2022)
☆105 · Jul 22, 2024 · Updated 2 years ago
govtech-responsibleai / KnowOrNot
☆28 · Feb 11, 2026 · Updated 5 months ago
linhduongtuan / doctorwithbloom
We finetune Bloomz-7b1-mt using LoRA with the chatdoctor-200k dataset at here https://huggingface.co/LinhDuong/doctorwithbloomz-7b1-mt an…
☆31 · Apr 4, 2023 · Updated 3 years ago
baochi0212 / LaVy
Pioneering in Vietnamese Multimodal Large Language Model
☆53 · Jan 23, 2025 · Updated last year
ndcuong91 / MC_OCR
Solution for MC_OCR competition
☆97 · Mar 7, 2023 · Updated 3 years ago
PyThaiNLP / KhanomTanLLM
☆17 · Sep 24, 2024 · Updated last year
vietai / dab
Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
☆70 · Jun 20, 2022 · Updated 4 years ago
aiverify-foundation / moonshot
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
☆339 · Jun 10, 2026 · Updated last month
FlorianWilhelm / mlstm4reco
Multiplicative LSTM for Recommendations
☆20 · Aug 7, 2018 · Updated 7 years ago
v-nhandt21 / Vinorm
Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syl…
☆67 · Jan 1, 2025 · Updated last year
hllj / Vistral-V
Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model.
☆23 · Jul 1, 2024 · Updated 2 years ago
datquocnguyen / PhoW2V
Pre-trained Word2Vec syllable- and word-level embeddings for Vietnamese
☆54 · Aug 8, 2023 · Updated 2 years ago
IndoNLP / nusa-writes
NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented an…
☆30 · Sep 27, 2024 · Updated last year
allbyai / ToRoLaMa
ToRoLaMa: The Vietnamese Instruction-Following and Chat Model
☆24 · Jan 4, 2024 · Updated 2 years ago
KenrickLance / BalitaNLP-Dataset
Filipino multi-modal NLP dataset. Consists of 350k+ Filipino news articles and associated images
☆14 · Mar 11, 2025 · Updated last year
VietnamAIHub / Vietnamese_LLMs
The project includes: 1. Building a Vietnamese instructions dataset (high-quality, large, and diverse). 2. LLM Training, Finetuning, Evaluating & Testin…
☆285 · Sep 1, 2025 · Updated 10 months ago