Source codes and datasets for How well do Large Language Models perform in Arithmetic tasks?
☆57Apr 17, 2023Updated 2 years ago
Alternatives and similar repositories for math401-llm
Users that are interested in math401-llm are comparing it to the libraries listed below
Sorting:
- An offical implementation of EHRDiff [TMLR]☆33Jun 25, 2024Updated last year
- Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification [AI in Medicine Journal]☆12May 20, 2022Updated 3 years ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023)☆24Oct 10, 2023Updated 2 years ago
- Codes and Pre-trained models for RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training [ACM MM 202…☆29Nov 2, 2023Updated 2 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)☆23Oct 2, 2025Updated 5 months ago
- BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model [ACL-BioNLP 2022]☆52Oct 26, 2022Updated 3 years ago
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"☆23Mar 18, 2025Updated 11 months ago
- CODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022]☆80Jun 28, 2022Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- Resources of deep learning for mathematical reasoning (DL4MATH).☆371Dec 22, 2023Updated 2 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆30Mar 5, 2024Updated 2 years ago
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024]☆99Aug 17, 2023Updated 2 years ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]☆14Jul 11, 2023Updated 2 years ago
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition [ACL 2022 Findings]☆44Feb 21, 2023Updated 3 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.☆12Oct 12, 2024Updated last year
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings.☆31Dec 6, 2023Updated 2 years ago
- Data and code for ACL 2023 paper "RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations"☆15Feb 8, 2024Updated 2 years ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆13Jun 24, 2024Updated last year
- Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding [ACL 2022]☆54Aug 6, 2022Updated 3 years ago
- Code and data for "An Accurate Unsupervised Method for Joint Entity Alignment and Dangling Entity Detection".☆15Mar 26, 2022Updated 3 years ago
- Medical ML Benchmark☆11May 16, 2023Updated 2 years ago
- code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D…☆12Sep 16, 2022Updated 3 years ago
- ☆15Aug 18, 2022Updated 3 years ago
- ☆30Dec 27, 2024Updated last year
- Improving Biomedical Pretrained Language Models with Knowledge [BioNLP 2021]☆65Feb 6, 2023Updated 3 years ago
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning.☆132Jun 18, 2023Updated 2 years ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)☆102Feb 20, 2025Updated last year
- The is the official implementation of "Lyra: Orchestrating Dual Correction in Automated Theorem Proving"☆15Jul 2, 2024Updated last year
- ML Benchmarks in Algebraic Combinatorics☆23Jan 15, 2026Updated last month
- Dataset and Evaluation Code for the K-QA Benchmark.☆18May 26, 2024Updated last year
- [ICLR'25 Spotlight] Rethinking and improving autoformalization: towards a faithful metric and a Dependency Retrieval-based approach☆27May 20, 2025Updated 9 months ago
- PMC-Patients☆102Jun 7, 2024Updated last year
- Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers"☆19Aug 17, 2022Updated 3 years ago
- K12高中数学试题数据集☆15Aug 16, 2023Updated 2 years ago
- ☆18Oct 13, 2022Updated 3 years ago
- Orientation of the protein-protein interaction network using network diffusion techniques☆15Mar 16, 2021Updated 4 years ago
- [NIPS2023] RRHF & Wombat☆809Sep 22, 2023Updated 2 years ago
- PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems…☆74Dec 20, 2023Updated 2 years ago