VILA-Lab / Open-LLM-Leaderboard
Open-LLM-Leaderboard: Open-Style Question Evaluation. Paper at https://arxiv.org/abs/2406.07545
☆28 · Updated 2 months ago
Related projects:
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆43 · Updated last week
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆81 · Updated this week
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models". ☆114 · Updated 2 months ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆107 · Updated this week
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models. ☆89 · Updated 4 months ago
- [COLM 2024] SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. ☆21 · Updated 3 months ago
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed + Megatron + FlashAttention + CUDA fusion kernels + compiler]. ☆32 · Updated 8 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks. ☆123 · Updated 6 months ago
- Code and data for "Long-context LLMs Struggle with Long In-context Learning". ☆87 · Updated 2 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning. ☆101 · Updated 2 weeks ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆73 · Updated 7 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models. ☆59 · Updated 3 months ago
- 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies. https://arxiv.org/abs/2407.13623 ☆52 · Updated 3 weeks ago
- The source code of the EMNLP 2023 main conference paper "Sparse Low-rank Adaptation of Pre-trained Language Models". ☆62 · Updated 6 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models. ☆121 · Updated 3 months ago
- [NeurIPS 2023] Code for the paper "Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias". ☆133 · Updated 10 months ago
- [ICML 2024 Oral] A framework for society simulation that supports complex scenarios, e.g. multi-scene simulation. ☆39 · Updated last month
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models". ☆123 · Updated 3 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web". ☆106 · Updated last week
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. ☆32 · Updated 2 months ago
- FuseAI Project. ☆75 · Updated 3 weeks ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting. ☆60 · Updated 6 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators. ☆39 · Updated 7 months ago
- CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs. ☆66 · Updated last month
- Official GitHub repo for AutoDetect, an automated weakness-detection framework for LLMs. ☆36 · Updated 2 months ago
- Code associated with "Tuning Language Models by Proxy" (Liu et al., 2024). ☆84 · Updated 5 months ago