A framework for few-shot evaluation of autoregressive language models.
☆26Dec 21, 2023Updated 2 years ago
Alternatives and similar repositories for lm-evaluation-harness
Users that are interested in lm-evaluation-harness are comparing it to the libraries listed below.
- ☆37Oct 29, 2024Updated last year
- ☆25Aug 2, 2022Updated 3 years ago
- ☆26Nov 1, 2021Updated 4 years ago
- Formalization of IMO shortlist problems in Lean 4☆25Updated this week
- ProofNet dataset ported into Lean 4☆29Jun 9, 2025Updated 10 months ago
- Model Selection with Large Language Models for Reasoning (EMNLP2023 Findings)☆30Dec 23, 2023Updated 2 years ago
- Code for the paper LEGO-Prover: Neural Theorem Proving with Growing Libraries☆67Feb 29, 2024Updated 2 years ago
- The TacTok automated Coq proof script synthesis tool☆17Jan 9, 2024Updated 2 years ago
- Code example for pretraining an LLM with vanilla PyTorch training loop☆10Jun 6, 2024Updated last year
- ☆23Feb 3, 2026Updated 2 months ago
- Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models☆11Jan 23, 2024Updated 2 years ago
- Maps: Python's missing mappings☆13Nov 29, 2017Updated 8 years ago
- ☆12Feb 9, 2024Updated 2 years ago
- Transfer Learning in Dialogue Benchmarking Toolkit☆14Mar 31, 2023Updated 3 years ago
- The source code for the paper: Yirong Mao, Ruiping Wang, Shiguang Shan, Xilin Chen. COSONet: Compact Second-Order Network for Video Face …☆12Dec 27, 2018Updated 7 years ago
- MMLU eval for RU/EN☆16Jul 31, 2023Updated 2 years ago
- ☆14Jul 17, 2025Updated 9 months ago
- ☆45Sep 21, 2024Updated last year
- code for the paper "Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation" (TPAMI 2021)☆10Jul 15, 2022Updated 3 years ago
- ☆12Feb 16, 2024Updated 2 years ago
- ☆49Aug 29, 2023Updated 2 years ago
- Problem Sets for MIT 6.822 Formal Reasoning About Programs, Spring 2020☆19May 4, 2020Updated 5 years ago
- [ICML'21 Oral] Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding☆14Jun 10, 2021Updated 4 years ago
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models".☆15Oct 14, 2022Updated 3 years ago
- PyTorch code for IEEE TCI2022 paper "Deep Hyperspectral Image Fusion Network with Iterative Spatio-Spectral Regularization"☆10May 10, 2022Updated 3 years ago
- Proof artifact co-training for Lean☆44Dec 29, 2022Updated 3 years ago
- Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data☆42May 29, 2024Updated last year
- This code implements a basic, Twitter-aware tokenizer.☆12Feb 8, 2024Updated 2 years ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o☆22Jun 1, 2024Updated last year
- Brando's utils☆16Feb 25, 2025Updated last year
- CFG-GAN: Composite functional gradient learning of generative adversarial models☆15Jul 9, 2020Updated 5 years ago
- A Foreign Function Interface (FFI) to cvc5 solver in Lean.☆24Apr 10, 2026Updated last week
- ☆14Jul 13, 2025Updated 9 months ago
- Experiments on automation for Lean☆166Updated this week
- ☆16Apr 11, 2022Updated 4 years ago
- Danfeng Hong, Naoto Yokoya, Jian Xu, Xiaoxiang Zhu. Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classificatio…☆11Nov 14, 2021Updated 4 years ago
- paper collection: alignment of diffusion models☆29Mar 6, 2026Updated last month
- Naver sentiment movie corpus classification☆17Oct 12, 2021Updated 4 years ago
- Code for Navigating Connected Memories with a Task-oriented Dialog System☆17Dec 12, 2022Updated 3 years ago