SAILResearch / awesome-foundation-model-leaderboards
A curated list of awesome leaderboard-oriented resources for foundation models
☆183Updated this week
Related projects:
- Grimoire is All You Need for Enhancing Large Language Models☆115Updated 6 months ago
- A deployment, monitoring and autoscaling service towards serverless LLM serving.☆152Updated last week
- TxBKG - Knowledge Graph Generation for Any PDFs☆224Updated 9 months ago
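- Train an LLM from scratch with DeepSpeed, through the pretraining and SFT stages, to verify the LLM's ability to learn knowledge, understand language, and answer questions☆145Updated 2 months ago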
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a…☆350Updated last week
- AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval (https://arxiv.org/abs/2406.11200)☆140Updated last month
- Benchmarking LLMs via Uncertainty Quantification☆206Updated 7 months ago
- [ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA☆176Updated 3 weeks ago
- ☆189Updated 2 months ago
- [ACL 2024] CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and …☆107Updated last month
- We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that …☆115Updated last year
- A Comprehensive Benchmark for Code Information Retrieval.☆61Updated last week
- An interpretable large language model (LLM) for medical diagnosis.☆68Updated last week
- A repo for my NanoGPT PyTorch 2.0 implementation, written as torch 2.0 was being released; faster and simpler, and a good tutorial for learning GPT.☆59Updated 7 months ago
- A multimodal agent framework for solving complex tasks☆505Updated last week
- PyTorch Library for Relational Table Learning with LLMs.☆270Updated last week
- WorldGPT: Empowering LLM as Multimodal World Model☆116Updated last month
- This tool (enhance_long) aims to enhance the long-context extrapolation capability of Llama 2 at the lowest possible cost, preferably without …☆47Updated 9 months ago
- This is the official code repository of MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tas…☆60Updated 3 weeks ago
- The official implementation of Self-Play Preference Optimization (SPPO)☆461Updated last month
- Chinese Llama 2, from pretraining to reinforcement learning☆93Updated 11 months ago
- This project builds on representative work by previous researchers to evaluate SFT data along multiple dimensions and filter it automatically.☆55Updated 6 months ago
- ☆30Updated last month
- EffiBench: Benchmarking the Efficiency of Automatically Generated Code☆50Updated last month
- Awesome LLMs on Device: A Comprehensive Survey☆613Updated this week
- [EMNLP 2023] FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models☆81Updated 8 months ago
- ☆55Updated 2 months ago
- Medical Multimodal LLMs☆232Updated last week
- Empower Your Model with Longer and Better Context Comprehension☆50Updated last year
- Multilingual Corpus of Web Fiction☆211Updated 2 months ago