allenai / PlaSma
This is the repository for the paper "PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning".
☆13 Updated last year
Alternatives and similar repositories for PlaSma:
Users who are interested in PlaSma are comparing it to the repositories listed below:
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 Updated this week
- Repository for Skill Set Optimization ☆12 Updated 5 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆22 Updated 3 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆44 Updated 3 weeks ago
- Evaluate the Quality of Critique ☆35 Updated 7 months ago
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges ☆21 Updated 11 months ago
- A framework for evolving and testing question-answering datasets with various models. ☆13 Updated 10 months ago
- Supporting code for ReCEval paper ☆27 Updated 4 months ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆27 Updated 5 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆18 Updated last month
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 Updated 10 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆26 Updated 6 months ago
- [COLM'24] "How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?" ☆19 Updated 3 months ago
- Tasks for describing differences between text distributions. ☆16 Updated 5 months ago
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models ☆33 Updated 7 months ago
- Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning" ☆23 Updated 7 months ago
- [NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback ☆39 Updated 10 months ago
- Benchmarking Benchmark Leakage in Large Language Models ☆47 Updated 7 months ago
- [EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-… ☆18 Updated 2 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated 10 months ago
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs ☆34 Updated 11 months ago
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023. ☆63 Updated last month
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆33 Updated last year
- PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion ☆47 Updated 10 months ago
- Official repo of the AAAI 2024 paper "Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization" ☆13 Updated last year
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models ☆15 Updated 6 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆24 Updated 10 months ago