[ACL 2025] Are Your LLMs Capable of Stable Reasoning?
☆32Aug 5, 2025Updated 6 months ago
Alternatives and similar repositories for GPassK
Users that are interested in GPassK are comparing it to the libraries listed below
Sorting:
- ☆14Apr 14, 2025Updated 10 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29May 22, 2025Updated 9 months ago
- Official Repo of "CIBench: Evaluation of LLMs as Code Interpreter "☆14Jul 19, 2024Updated last year
- ☆60Jan 12, 2026Updated last month
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset☆111May 22, 2025Updated 9 months ago
- chinesetokenization☆13Jun 4, 2013Updated 12 years ago
- ☆21Oct 22, 2025Updated 4 months ago
- LMM for VQA, tcsvt version☆11Jul 19, 2024Updated last year
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …☆11Sep 27, 2024Updated last year
- Our EMNLP 2022 paper on VIP-Based Prompting for Parameter-Efficient Learning☆10Oct 22, 2022Updated 3 years ago
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks☆17Jan 15, 2025Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- [NAACL 2024] CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions☆13May 7, 2024Updated last year
- ☆32Nov 18, 2025Updated 3 months ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- Current Alpha version of the ONTO-TRON-5000☆41Dec 1, 2025Updated 3 months ago
- ☆23Dec 18, 2024Updated last year
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆17Mar 11, 2025Updated 11 months ago
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"☆30Jan 12, 2026Updated last month
- Control LLM☆22Apr 6, 2025Updated 10 months ago
- Accelerating the development of large multimodal models (LMMs) with lmms-eval☆14Oct 14, 2024Updated last year
- Code for the paper "Cottention: Linear Transformers With Cosine Attention"☆20Nov 15, 2025Updated 3 months ago
- Multi-GPU supported kmeans clustering for cluser-clip☆15Jun 3, 2024Updated last year
- The All-in-one Judge Models introduced by Opencompass☆116Jul 15, 2025Updated 7 months ago
- ☆16Jul 23, 2024Updated last year
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆21Feb 17, 2025Updated last year
- ☆32Oct 13, 2025Updated 4 months ago
- ☆22Sep 2, 2025Updated 6 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…☆20Nov 21, 2024Updated last year
- ☆20Nov 4, 2025Updated 3 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training☆53Dec 13, 2025Updated 2 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆52Nov 20, 2024Updated last year
- ☆21Jul 25, 2025Updated 7 months ago
- [ACM MM 2025] MLLMs for Aesthetics Reasoning☆23Jan 5, 2026Updated last month
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆21Jun 2, 2025Updated 9 months ago
- Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"☆288May 22, 2025Updated 9 months ago
- The official repository of the Omni-MATH benchmark.☆93Dec 22, 2024Updated last year