chen-judge / UniGeo
[EMNLP 22] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
☆26 Updated last year
Related projects:
- Official Implementation of ACL 2021 paper “GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning”. ☆41 Updated 2 years ago
- Evaluating Mathematical Reasoning Beyond Accuracy ☆32 Updated 5 months ago
- ☆13 Updated 4 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆78 Updated last week
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models ☆33 Updated 9 months ago
- ☆16 Updated last year
- ☆80 Updated 9 months ago
- Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆65 Updated 6 months ago
- Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ☆36 Updated 3 months ago
- ☆42 Updated last year
- The code and data for the paper JiuZhang3.0 ☆29 Updated 3 months ago
- Multi-modal code generation problems. ☆15 Updated 2 weeks ago
- [ICML'24] TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks ☆20 Updated 7 months ago
- ☆44 Updated last year
- The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agen… ☆20 Updated 6 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆20 Updated 6 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆59 Updated 7 months ago
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning". ☆61 Updated last year
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆24 Updated 2 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆33 Updated 6 months ago
- ☆10 Updated 3 weeks ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆38 Updated 2 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆61 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement ☆21 Updated last month
- ☆23 Updated 2 months ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆22 Updated 11 months ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆55 Updated last year
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆21 Updated 2 months ago
- ☆57 Updated last year
- my commonly-used tools ☆46 Updated last month