☆150Jan 4, 2024Updated 2 years ago
Alternatives and similar repositories for gemini-benchmark
Users that are interested in gemini-benchmark are comparing it to the libraries listed below.
- An Apache 2.0 fork of HuggingFace's Large Language Model Text Generation Inference☆19Mar 10, 2024Updated 2 years ago
- ☆14Oct 28, 2023Updated 2 years ago
- Collections of RLxLM experiments using minimal code☆14Feb 17, 2025Updated last year
- ☆165Nov 23, 2024Updated last year
- A server GPU monitoring program that sends a notification via WeChat when GPU properties meet preset conditions☆34Aug 10, 2021Updated 4 years ago
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆44Apr 30, 2023Updated 2 years ago
- Evaluating LLMs with CommonGen-Lite☆95Mar 21, 2024Updated 2 years ago
- ☆102Dec 22, 2023Updated 2 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- Official implementation of "Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought" (NeurIPS 2025)☆38Oct 8, 2025Updated 5 months ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 3 years ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]☆79Nov 14, 2024Updated last year
- THOUGHTSCULPT, a general reasoning and search method for complex tasks☆13Dec 13, 2024Updated last year
- [NeurIPS D&B 2024] Generative AI for Math: MathPile☆420Apr 4, 2025Updated 11 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Oct 10, 2023Updated 2 years ago
- Web-grounded natural language instructions☆18Nov 25, 2024Updated last year
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]☆591Dec 9, 2024Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆64Mar 26, 2024Updated 2 years ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)☆44Feb 28, 2026Updated 3 weeks ago
- [CVPR 2023] Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning☆22Jun 11, 2023Updated 2 years ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆89Sep 26, 2024Updated last year
- Vision Large Language Models trained on M3IT instruction tuning dataset☆17Aug 16, 2023Updated 2 years ago
- Data and code for the paper Causal Reasoning of Entities and Events in Procedural Texts.☆12May 26, 2023Updated 2 years ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- ☆55Apr 1, 2024Updated last year
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆147Sep 20, 2024Updated last year
- MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation☆25Jul 8, 2023Updated 2 years ago
- The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".☆17May 24, 2022Updated 3 years ago
- 800,000 step-level correctness labels on LLM solutions to MATH problems☆2,106Jun 1, 2023Updated 2 years ago
- ☆134Dec 22, 2023Updated 2 years ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆124Sep 9, 2024Updated last year
- [ICLR 2024] Lemur: Open Foundation Models for Language Agents☆557Oct 28, 2023Updated 2 years ago
- Multimodal computer agent data collection program☆165Dec 5, 2025Updated 3 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…☆839Feb 3, 2025Updated last year
- [ICML 2024] Selecting High-Quality Data for Training Language Models☆201Dec 8, 2025Updated 3 months ago
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts☆355Sep 29, 2025Updated 5 months ago
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"☆105Nov 9, 2023Updated 2 years ago
- [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.☆27Feb 4, 2023Updated 3 years ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)☆14Oct 3, 2024Updated last year