Repository for the code and dataset for the paper: "Have LLMs Advanced enough? Towards Harder Problem Solving Benchmarks For Large Language Models"
☆39 Dec 18, 2023 Updated 2 years ago
Alternatives and similar repositories for JEEBench
Users who are interested in JEEBench are comparing it to the repositories listed below
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers" ☆24 Oct 8, 2023 Updated 2 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆29 Feb 9, 2022 Updated 4 years ago
- prediction markets -> llm -> news ☆24 Nov 10, 2025 Updated 3 months ago
- ☆13 Oct 20, 2017 Updated 8 years ago
- DIY Python Projects ☆10 Aug 26, 2016 Updated 9 years ago
- A Data Source for Reasoning Embodied Agents ☆19 Sep 18, 2023 Updated 2 years ago
- Convert your docs to markdown format. ☆14 Jul 26, 2016 Updated 9 years ago
- Domain and problem PDDL parser in C/C++ using Flex & Bison. ☆15 Jun 18, 2019 Updated 6 years ago
- ☆17 Oct 31, 2023 Updated 2 years ago
- A basic weather prediction software powered by TensorFlow ☆15 Dec 5, 2016 Updated 9 years ago
- Nyström Normalized Cut PyTorch Implementation ☆25 Dec 12, 2025 Updated 2 months ago
- Trying the Ifood Challenge 2018 ☆17 Jun 19, 2018 Updated 7 years ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks ☆20 May 10, 2022 Updated 3 years ago
- IIT-JEE Name wise Result ☆33 Aug 9, 2021 Updated 4 years ago
- Data and code for the SciFact-Open task ☆28 Nov 24, 2023 Updated 2 years ago
- Caffe for "Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification" ☆20 Aug 28, 2017 Updated 8 years ago
- An introduction to global assessment techniques using Python ☆12 Apr 24, 2023 Updated 2 years ago
- A formal proof of the irrationality of zeta(3), the Apéry constant [maintainer=@amahboubi,@pi8027] ☆25 Feb 26, 2026 Updated last week
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆59 Jan 12, 2023 Updated 3 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 Mar 5, 2024 Updated 2 years ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆26 Dec 23, 2024 Updated last year
- Formal representation and solving for Euclidean plane geometry problems. ☆33 Dec 19, 2025 Updated 2 months ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 Feb 9, 2026 Updated 3 weeks ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings. ☆31 Dec 6, 2023 Updated 2 years ago
- This repository contains implementation of CROSSGRAD (https://openreview.net/forum?id=r1Dx7fbCW) and DAN (https://arxiv.org/abs/1505.0781… ☆24 Dec 28, 2018 Updated 7 years ago
- TensorFlow implementation [ICLR 18] "Learning Approximate Inference Networks for Structured Prediction" ☆30 Jun 10, 2018 Updated 7 years ago
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆30 Dec 12, 2024 Updated last year
- NaturalProofs: Mathematical Theorem Proving in Natural Language (NeurIPS 2021 Datasets & Benchmarks) ☆134 Sep 8, 2022 Updated 3 years ago
- ipython-notebooks on popular algorithms meant to be used at technical sessions for IITB students ☆28 Apr 9, 2017 Updated 8 years ago
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers. ☆11 Aug 30, 2024 Updated last year
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆31 Jul 9, 2024 Updated last year
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆37 Oct 11, 2023 Updated 2 years ago
- Code for the paper "Learning to Prove Theorems by Learning to Generate Theorems" ☆33 Oct 30, 2020 Updated 5 years ago
- Youtube playlist maker webapp - make lightning fast playlists and share them! ☆29 May 4, 2018 Updated 7 years ago
- Dataset of 9536 H&E-stained patches for colorectal polyps classification and adenomas grading | ICIP21 https://doi.org/10.1109/ICIP42928.… ☆34 May 18, 2023 Updated 2 years ago
- Code repository corresponding to the paper "Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation" (NAACL 2024… ☆10 May 31, 2024 Updated last year
- Neural SPH ☆41 Nov 23, 2025 Updated 3 months ago
- ☆11 Dec 23, 2024 Updated last year
- NaturalProver: Grounded Mathematical Proof Generation with Language Models ☆39 Mar 24, 2023 Updated 2 years ago