Shangyint / langProBe
☆14 Updated 3 weeks ago
Alternatives and similar repositories for langProBe:
Users who are interested in langProBe are comparing it to the repositories listed below.
- Code implementation of synthetic continued pretraining ☆104 Updated 3 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆99 Updated last week
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024) ☆47 Updated 3 months ago
- Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆64 Updated this week
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆30 Updated 10 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆124 Updated 9 months ago
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆123 Updated 10 months ago
- ☆71 Updated 5 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context" ☆69 Updated 8 months ago
- ☆66 Updated last month
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models ☆96 Updated 8 months ago
- ☆32 Updated 11 months ago
- ☆39 Updated 2 years ago
- ☆149 Updated 4 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆42 Updated 2 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆64 Updated 5 months ago
- Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition, TACL 2022 ☆128 Updated 10 months ago
- NaturalCodeBench (Findings of ACL 2024) ☆63 Updated 6 months ago
- ☆44 Updated last year
- ☆101 Updated last year
- ☆95 Updated last year
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆45 Updated 6 months ago
- ☆49 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 6 months ago
- ☆36 Updated 10 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆114 Updated 5 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆53 Updated last year
- ☆43 Updated 8 months ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆76 Updated last year