Rachum-thu / LongPiBench
The repository for the paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"
☆14 Updated 11 months ago
Alternatives and similar repositories for LongPiBench
Users who are interested in LongPiBench are comparing it to the libraries listed below.
- COLING 2025: MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity ☆22 Updated 11 months ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 Updated 5 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆123 Updated last year
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆112 Updated 5 months ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆44 Updated 4 months ago
- Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs ☆23 Updated 4 months ago
- SSRL: Self-Search Reinforcement Learning ☆152 Updated 3 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆82 Updated 8 months ago
- ☆141 Updated 6 months ago
- THOUGHTSCULPT, a general reasoning and search method for complex tasks ☆13 Updated 11 months ago
- ☆222 Updated 8 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆52 Updated 11 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025] ☆48 Updated 9 months ago
- Data Synthesis for Deep Research Based on Semi-Structured Data ☆179 Updated last week
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆100 Updated 2 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆35 Updated last year
- ☆35 Updated 6 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 Updated last year
- ☆84 Updated last year
- ☆88 Updated last week
- [ACL 2025] RetroLLM: Empowering LLMs to Retrieve Fine-grained Evidence within Generation ☆119 Updated 10 months ago
- Demystifying Reinforcement Learning in Agentic Reasoning ☆121 Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 Updated 10 months ago
- ☆67 Updated 7 months ago
- The code implementation of Symbolic-MoE ☆44 Updated 2 months ago
- ☆60 Updated 4 months ago
- The official implementation of Preference Data Reward-Augmentation. ☆18 Updated 6 months ago
- ☆78 Updated 3 weeks ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆109 Updated last year
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆81 Updated 3 months ago