UMxYTL-AI-Labs / MalayMMLU
[MalayMMLU] This is the first-ever Bahasa Melayu multitask benchmark designed to elevate the performance of Large Language Models (LLMs) and Large Vision Language Models (LVLMs).
☆31 Updated 4 months ago
Alternatives and similar repositories for MalayMMLU:
Users who are interested in MalayMMLU are comparing it to the libraries listed below.
- We gather Malaysian dataset! https://malaysian-dataset.readthedocs.io/ ☆315 Updated this week
- Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/ ☆488 Updated this week
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆81 Updated 3 months ago
- Recipes to prepare datasets! ☆13 Updated last month
- [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia ☆166 Updated 8 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆108 Updated 2 months ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation ☆77 Updated 3 weeks ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training ☆65 Updated 2 months ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆74 Updated 3 weeks ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆61 Updated last year
- Fine-tuning large language models (LLMs) is crucial for enhancing performance across domain-specific task applications. This comprehensiv… ☆12 Updated 7 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆100 Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆127 Updated 4 months ago
- Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons. ☆15 Updated 6 months ago
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM" ☆59 Updated 6 months ago
- A Multilingual Replicable Instruction-Following Model ☆93 Updated last year
- Questions in Arabic focused on Saudi culture, tested on a number of large language models (LLMs) ☆14 Updated 3 months ago
- Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/ ☆251 Updated last week
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆240 Updated last month
- [EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia ☆129 Updated 4 months ago
- A collection of NLP resources for Malay ☆25 Updated 6 years ago
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 Updated 3 years ago
- The FLORES+ Machine Translation Benchmark ☆102 Updated 5 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆139 Updated 2 months ago
- NTREX -- News Test References for MT Evaluation ☆83 Updated 10 months ago