Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
☆27 · Mar 1, 2025 · Updated last year
Alternatives and similar repositories for MultiAgentVerification
Users who are interested in MultiAgentVerification are comparing it to the libraries listed below.
- ☆16 · Feb 22, 2025 · Updated last year
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 · Apr 20, 2024 · Updated last year
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated last year
- ☆14 · Dec 5, 2024 · Updated last year
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆21 · Jul 4, 2025 · Updated 7 months ago
- M^3PC: Test-Time Model Predictive Control for Pretrained Masked Trajectory Model, ICLR 2025 ☆19 · Mar 17, 2025 · Updated 11 months ago
- ☆17 · Updated this week
- ☆21 · Jul 9, 2025 · Updated 7 months ago
- ☆25 · Dec 13, 2024 · Updated last year
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆26 · Oct 14, 2025 · Updated 4 months ago
- ☆24 · Apr 3, 2025 · Updated 10 months ago
- Official Repository for Task-Circuit Quantization ☆24 · Jun 1, 2025 · Updated 9 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆41 · Jul 21, 2025 · Updated 7 months ago
- ☆17 · Aug 1, 2025 · Updated 7 months ago
- (NeurIPS '22) LISA: Learning Interpretable Skill Abstractions - A framework for unsupervised skill learning using Imitation ☆29 · Feb 22, 2023 · Updated 3 years ago
- Lego for GRPO ☆30 · May 27, 2025 · Updated 9 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆221 · Nov 27, 2025 · Updated 3 months ago
- Official repository for paper "Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching" (ICML… ☆28 · Jan 12, 2023 · Updated 3 years ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 · Jun 4, 2024 · Updated last year
- ☆30 · Dec 23, 2024 · Updated last year
- Code base for paper: Reparameterized Policy Learning for Multimodal Trajectory Optimization ☆27 · Jul 19, 2023 · Updated 2 years ago
- Source code for the paper "Policy Architectures for Compositional Generalization in Control" ☆30 · May 19, 2022 · Updated 3 years ago
- B-Spline Density Estimation Library - nonparametric density estimation using B-Spline density estimator from univariate sample. ☆16 · Aug 22, 2021 · Updated 4 years ago
- ☆13 · Oct 5, 2025 · Updated 4 months ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆64 · Feb 19, 2026 · Updated last week
- ☆35 · May 16, 2025 · Updated 9 months ago
- Reinforcement learning examples for Torobo based on IsaacLab ☆36 · Dec 3, 2024 · Updated last year
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆90 · Mar 18, 2025 · Updated 11 months ago
- Starter SDK for full-stack EVM applications, built for TreeHacks 2025 Web3 Workshop ☆13 · Feb 14, 2025 · Updated last year
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025 ☆23 · Nov 13, 2025 · Updated 3 months ago
- A tool for an analysis of LLM generations. ☆42 · Oct 13, 2025 · Updated 4 months ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆183 · Sep 26, 2025 · Updated 5 months ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ☆39 · Aug 2, 2024 · Updated last year
- ☆47 · Apr 29, 2025 · Updated 10 months ago
- Code implementation for CoTexT: Multi-task Learning with Code-Text Transformer ☆36 · Sep 14, 2021 · Updated 4 years ago
- Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight) ☆40 · Oct 30, 2023 · Updated 2 years ago
- Simplifying Model-based RL: Learning Representations, Latent-space Models and Policies with One Objective ☆82 · Mar 9, 2023 · Updated 2 years ago
- PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24. ☆12 · Nov 27, 2024 · Updated last year