Tree-Shu-Zhao / ParallelSearch

This is the official code for the paper "ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning".

☆26

Alternatives and similar repositories for ParallelSearch

Users who are interested in ParallelSearch are comparing it to the repositories listed below.

Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆29 Updated last year
yale-nlp / MCTS-RAG
Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search"
☆73 Updated 4 months ago
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆108 Updated 3 months ago
shizhediao / R-Tuning
[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…
☆122 Updated last year
weizhepei / InstructRAG
[ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
☆127 Updated 8 months ago
TianduoWang / DPO-ST
[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
☆51 Updated last year
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆62 Updated last year
wlzhang2020 / ReasonRAG
Source code of paper: Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning
☆38 Updated 4 months ago
kaistAI / Janus
[NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages
☆51 Updated 2 months ago
OSU-NLP-Group / In-Context-Reranking
[ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"
☆36 Updated 6 months ago
orionw / rank1
Test-time compute in information retrieval
☆46 Updated 3 months ago
TsinghuaC3I / SSRL
SSRL: Self-Search Reinforcement Learning
☆147 Updated 2 months ago
GAIR-NLP / benbench
Benchmarking Benchmark Leakage in Large Language Models
☆55 Updated last year
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆51 Updated 4 months ago
THU-KEG / Agentic-Reward-Modeling
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
☆108 Updated 4 months ago
google-research / chain-of-table
Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
☆85 Updated last year
ytyz1307zzh / RefAug
Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"
☆55 Updated last year
DAMO-NLP-SG / contrastive-cot
Contrastive Chain-of-Thought Prompting
☆68 Updated last year
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆83 Updated 7 months ago
MurongYue / LLM_MoT_cascade
This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…
☆27 Updated last year
BeastyZ / LLM-Verified-Retrieval
Repo for LLatrieval
☆31 Updated last year
orionw / FollowIR
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
☆48 Updated last year
belindal / ERASE
Code and Data for "Language Modeling with Editable External Knowledge"
☆36 Updated last year
gauss5930 / iDUS
An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with implementation and experiment details.
☆13 Updated last year
yyDing1 / ScaleQuest
[ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM…
☆68 Updated last year
hkust-nlp / WebExplorer
The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
☆81 Updated last month
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆62 Updated 10 months ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
snap-stanford / optimas
Optimize Any User-defined Compound AI Systems
☆59 Updated 2 months ago
ByteDance-Seed / WideSearch
WideSearch: Benchmarking Agentic Broad Info-Seeking
☆96 Updated 2 weeks ago