UMxYTL-AI-Labs / MalayMMLU
[MalayMMLU] This is the first-ever Bahasa Melayu multitask benchmark designed to elevate the performance of Large Language Models (LLMs) and Large Vision Language Models (LVLMs).
☆55 Updated 4 months ago
Alternatives and similar repositories for MalayMMLU
Users who are interested in MalayMMLU are comparing it to the repositories listed below.
- Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/ ☆519 Updated this week
- We gather Malaysian dataset! https://malaysian-dataset.readthedocs.io/ ☆328 Updated this week
- Build datasets using natural language ☆558 Updated 3 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆423 Updated last week
- South-East Asia Large Language Models ☆382 Updated this week
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders. ☆855 Updated 2 months ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆132 Updated 5 months ago
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ☆273 Updated last month
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data. ☆603 Updated last week
- ☆695 Updated 8 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆494 Updated 4 months ago
- dLLM: Simple Diffusion Language Modeling ☆1,541 Updated this week
- implement RED metrics in fastapi integrate with Prometheus and Grafana ☆40 Updated 10 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆276 Updated 5 months ago
- ☆158 Updated 8 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻🍳 ☆351 Updated 7 months ago
- A little(lil) Language Model (LM). A tiny reproduction of LLaMA 3's model architecture. ☆54 Updated 8 months ago
- ☆243 Updated 3 months ago
- Tool for generating high quality Synthetic datasets ☆1,455 Updated 2 months ago
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python. ☆168 Updated this week
- ☆126 Updated 3 months ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training ☆74 Updated 2 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆119 Updated 9 months ago
- Repo for the Belebele dataset, a massively multilingual reading comprehension dataset. ☆338 Updated last year
- ☆236 Updated last month
- An open-source tool for LLM prompt optimization. ☆738 Updated last week
- Model Activity Visualiser ☆520 Updated 9 months ago
- A compact LLM pretrained in 9 days by using high quality data ☆340 Updated 9 months ago
- ☆206 Updated last year
- An open-source implementation of Whisper ☆470 Updated 2 months ago