henrykmichalewski/math-evals

Math evaluations of llama models.

☆10

Alternatives and similar repositories for math-evals

Users who are interested in math-evals are comparing it to the libraries listed below.

tianlwang / eval_gsm8k
☆33 · Jul 4, 2024 · Updated 2 years ago
edukaton / codilime
☆12 · Feb 18, 2018 · Updated 8 years ago
google-deepmind / eval_hub
Gemini evaluations
☆22 · Feb 19, 2026 · Updated 5 months ago
sscit / rel
A domain specific language for requirements engineering. Besides the DSL, the REL framework contains Python integration, and a Visual Stu…
☆11 · Apr 24, 2022 · Updated 4 years ago
google-deepmind / formal-putnam-like
Lean formalizations of Putnam-like problems
☆20 · Apr 22, 2026 · Updated 3 months ago
odashi / ase15-django-dataset
Django Dataset for Code Translation Tasks
☆31 · Feb 21, 2018 · Updated 8 years ago
Jometeorie / probing_llama
☆17 · Feb 26, 2024 · Updated 2 years ago
Lucien-qiang / DBQA-KBQA
☆10 · May 25, 2017 · Updated 9 years ago
LittleYUYU / CoaCor
Code for "CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning" (WWW 2019)
☆37 · Apr 21, 2020 · Updated 6 years ago
jabberjabberjabber / Chunkify
Create text chunks which end at natural stopping points without using a tokenizer
☆26 · Nov 26, 2025 · Updated 7 months ago
JZCS2018 / SMAT
Model and datasets for schema matching
☆15 · Jul 17, 2021 · Updated 5 years ago
socialmediaie / SocialMediaIE
A toolkit for social media information extraction using multi-task learning and active learning
☆19 · Dec 27, 2022 · Updated 3 years ago
anastasiabolotnikova / adaboost
Adaboost Haar feature classifier cascade training
☆10 · Apr 25, 2019 · Updated 7 years ago
Event-AHU / Prompt_Learning_Paper_List
☆19 · Nov 7, 2024 · Updated last year
DeepSoftwareAnalytics / RACE
Replication package for EMNLP2022 paper- RACE: Retrieval-Augmented Commit Message Generation
☆19 · Oct 21, 2022 · Updated 3 years ago
LIANGQINGYUAN / Lyra
Lyra: A Benchmark for Turducken-Style Code Generation
☆15 · Apr 22, 2022 · Updated 4 years ago
nuochenpku / LLaMA_Analysis
This is official project in our paper: Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers
☆31 · Jan 13, 2024 · Updated 2 years ago
boxabirds / claudit
Uses conversation history to audit important decisions and changes.
☆18 · Jul 13, 2025 · Updated last year
allwefantasy / code-dataset
☆21 · Nov 27, 2024 · Updated last year
buhuixiezuowendelihua / Pytorch_Image_Classification
LeNet5→AlexNet→VGGNet→GoogleNet→ResNet→MobileNet→DenseNet
☆13 · May 12, 2020 · Updated 6 years ago
Gogolian / babyagi-js-html
☆19 · Apr 4, 2023 · Updated 3 years ago
hannahxchen / automatic-paraphrase-dataset-augmentation
Code and data for automatic paraphrase dataset augmentation.
☆11 · Mar 8, 2021 · Updated 5 years ago
neubig / howtocode-2017
An example of DyNet autobatching for the NIPS "how to code a paper" workshop
☆12 · Dec 9, 2017 · Updated 8 years ago
kevincure / YourPersonalAI
☆13 · Mar 11, 2023 · Updated 3 years ago
samrawal / llama2_chat_templater
Wrapper to easily generate the chat template for Llama2
☆65 · Mar 10, 2024 · Updated 2 years ago
lab-design / ICSE2020DNNBugRepair
Dataset for ICSE 2020 paper "Repairing Deep Neural Networks: Fix Patterns and Challenges"
☆10 · Feb 10, 2020 · Updated 6 years ago
BillyWangwzx / fight
fight with landlord (斗地主AI)
☆16 · Apr 4, 2018 · Updated 8 years ago
deepsense-ai / tensorflow_on_slurm
☆40 · Jul 17, 2018 · Updated 8 years ago
microsoft / iclr2019-learning-to-represent-edits
Code for the ICLR 2019 paper "Learning to Represent Edits"
☆13 · Dec 8, 2022 · Updated 3 years ago
leanprover-community / mathport
Mathport is a tool for porting Lean3 projects to Lean4
☆46 · Nov 21, 2024 · Updated last year
murtazarang / MD-MADDPG
☆14 · Sep 27, 2019 · Updated 6 years ago
tianyi-lab / MiP-Overthinking
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆39 · Jun 5, 2025 · Updated last year
leuchine / self_play_picard
Using self-play to augment multi-turn text-to-SQL datasets
☆12 · Oct 20, 2022 · Updated 3 years ago
solyarisoftware / prompter.vim
vim as a perfect large language models prompts playground
☆20 · Nov 29, 2023 · Updated 2 years ago
oskarimikkila / Empirical-Deep-Hedging
☆16 · Jul 9, 2022 · Updated 4 years ago
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆68 · Oct 16, 2024 · Updated last year
timqian / letcode.ai
☆24 · Nov 19, 2023 · Updated 2 years ago
sy2737 / Elevator-MARL
Asynchronous Elevator Control Simulator + Multi-Agent Reinforcement Learning training algorithms
☆16 · May 9, 2018 · Updated 8 years ago
lavinal712 / control-lora-v3
☆11 · Dec 15, 2025 · Updated 7 months ago