msclar / symbolictom
☆20Updated last year
Related projects ⓘ
Alternatives and complementary repositories for symbolictom
- Evaluate the Quality of Critique☆35Updated 5 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆67Updated 5 months ago
- ☆53Updated 2 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆48Updated 8 months ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"☆30Updated 3 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆45Updated 4 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards☆44Updated 6 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers☆66Updated 3 weeks ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆32Updated 9 months ago
- AbstainQA, ACL 2024☆19Updated last month
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆56Updated 2 weeks ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…☆84Updated 4 months ago
- BeHonest: Benchmarking Honesty in Large Language Models☆29Updated 2 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)☆50Updated 6 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆59Updated 6 months ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning>☆44Updated last year
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆61Updated 3 weeks ago
- ☆36Updated 10 months ago
- ☆63Updated 5 months ago
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆30Updated 2 months ago
- ☆40Updated 11 months ago
- ☆25Updated last month
- ☆28Updated last week
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)☆54Updated 10 months ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following☆114Updated 4 months ago
- the instructions and demonstrations for building a formal logical reasoning capable GLM☆50Updated 2 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆56Updated 8 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024)☆96Updated 7 months ago
- ☆37Updated 6 months ago
- ☆10Updated last year