GanjinZero / math401-llm
Source code and datasets for "How well do Large Language Models perform in Arithmetic tasks?"
☆57 · Apr 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for math401-llm
Users that are interested in math401-llm are comparing it to the libraries listed below
- An official implementation of EHRDiff [TMLR] ☆31 · Jun 25, 2024 · Updated last year
- ConvRe: An Investigation of LLMs' Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆24 · Oct 10, 2023 · Updated 2 years ago
- Codes and Pre-trained models for RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training [ACM MM 202… ☆29 · Nov 2, 2023 · Updated 2 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003) ☆23 · Oct 2, 2025 · Updated 4 months ago
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing" ☆22 · Mar 18, 2025 · Updated 10 months ago
- CODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022] ☆80 · Jun 28, 2022 · Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration ☆19 · Jan 27, 2023 · Updated 3 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆30 · Mar 5, 2024 · Updated last year
- Resources of deep learning for mathematical reasoning (DL4MATH). ☆370 · Dec 22, 2023 · Updated 2 years ago
- Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024] ☆99 · Aug 17, 2023 · Updated 2 years ago
- Scratchpad/Chain-of-Thought Prompts ☆12 · Jun 6, 2022 · Updated 3 years ago
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment ☆16 · Aug 6, 2024 · Updated last year
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 · Oct 12, 2024 · Updated last year
- Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition [ACL 2022 Findings] ☆44 · Feb 21, 2023 · Updated 2 years ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings. ☆31 · Dec 6, 2023 · Updated 2 years ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆270 · Sep 12, 2024 · Updated last year
- EMNLP'22 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning ☆32 · Jun 8, 2023 · Updated 2 years ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase… ☆13 · Jun 24, 2024 · Updated last year
- Data and code for ACL 2023 paper "RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations" ☆15 · Feb 8, 2024 · Updated 2 years ago
- Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding [ACL 2022] ☆54 · Aug 6, 2022 · Updated 3 years ago
- ☆15 · Aug 18, 2022 · Updated 3 years ago
- Medical ML Benchmark ☆11 · May 16, 2023 · Updated 2 years ago
- ☆30 · Dec 27, 2024 · Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆101 · Feb 20, 2025 · Updated 11 months ago
- [ICLR'25 Spotlight] Rethinking and improving autoformalization: towards a faithful metric and a Dependency Retrieval-based approach ☆27 · May 20, 2025 · Updated 8 months ago
- This is the official implementation of "Lyra: Orchestrating Dual Correction in Automated Theorem Proving" ☆15 · Jul 2, 2024 · Updated last year
- Dataset and Evaluation Code for the K-QA Benchmark. ☆18 · May 26, 2024 · Updated last year
- Calculating Expected Time for training LLM. ☆38 · Apr 17, 2023 · Updated 2 years ago
- PMC-Patients ☆102 · Jun 7, 2024 · Updated last year
- Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers" ☆19 · Aug 17, 2022 · Updated 3 years ago
- Orientation of the protein-protein interaction network using network diffusion techniques ☆15 · Mar 16, 2021 · Updated 4 years ago
- ☆18 · Oct 13, 2022 · Updated 3 years ago
- [NIPS2023] RRHF & Wombat ☆808 · Sep 22, 2023 · Updated 2 years ago
- PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems… ☆74 · Dec 20, 2023 · Updated 2 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- ☆16 · Jun 12, 2023 · Updated 2 years ago
- [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie… ☆182 · Jun 8, 2025 · Updated 8 months ago
- Code for the paper "ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation" (EMNLP'2… ☆17 · Dec 11, 2024 · Updated last year