DAMO-NLP-SG / SeaLLMs

[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia

☆149

Related projects

Alternatives and complementary repositories for SeaLLMs

sail-sg / sailor-llm
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
☆112 Updated 2 months ago
nlp-uoregon / Okapi
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
☆91 Updated last year
pphuc25 / distil-cd
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
☆32 Updated 8 months ago
FreedomIntelligence / MultilingualSIFT
MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
☆86 Updated last year
ZaloAI-Jaist / VMLU
☆59 Updated 6 months ago
Xdao85 / VNHSGE
VNHSGE: Vietnamese High School Graduation Examination Dataset for Large Language Models
☆25 Updated last year
jshuadvd / LongRoPE
Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper
☆124 Updated 4 months ago
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆438 Updated 8 months ago
ParticleMedia / RAGTruth
GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
☆115 Updated last month
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆146 Updated 5 months ago
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆125 Updated 2 months ago
sail-sg / sailcraft
🚢 Data Toolkit for Sailor Language Models
☆82 Updated 4 months ago
MozerWang / Loong
[EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
☆93 Updated last week
nlp-uoregon / mlmm-evaluation
Multilingual Large Language Models Evaluation Benchmark
☆107 Updated 3 months ago
GAIR-NLP / ProX
Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
☆191 Updated last month
HKUNLP / ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
☆358 Updated last month
VinAIResearch / RecGPT
RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024)
☆30 Updated 2 months ago
SqueezeAILab / LLM2LLM
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
☆156 Updated 7 months ago
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆124 Updated 3 weeks ago
GAIR-NLP / ReAlign
Reformatted Alignment
☆112 Updated last month
aisingapore / sealion
South-East Asia Large Language Models
☆270 Updated 3 weeks ago
mzbac / llama2-fine-tune
Scripts for fine-tuning Llama2 via SFT and DPO.
☆183 Updated last year
deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆148 Updated 2 months ago
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆218 Updated last year
nguyenvulebinh / extractive-qa-mrc
Machine Reading Comprehension specialized for the Vietnamese language
☆38 Updated 2 years ago
princeton-nlp / CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
☆147 Updated 5 months ago
OpenBMB / RAGEval
☆83 Updated 2 weeks ago
nick7nlp / Counting-Stars
Counting-Stars (★)
☆76 Updated 2 months ago
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆181 Updated 3 weeks ago
QwenLM / AutoIF
☆219 Updated 3 months ago