ZJU-CTAG / B4
Code for ASE'24 paper "B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests"
☆11 Updated last year
Alternatives and similar repositories for B4
Users interested in B4 are comparing it to the repositories listed below
- [KDD Explore'24] Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities ☆17 Updated 7 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme… ☆19 Updated last year
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated last week
- [EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning ☆21 Updated 4 months ago
- [ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You ☆46 Updated 3 months ago
- ☆70 Updated last year
- The code implementation of Symbolic-MoE ☆45 Updated 3 months ago
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆26 Updated this week
- ☆152 Updated last year
- Code for "TrustRAG: Enhancing Robustness and Trustworthiness in RAG", AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent) ☆52 Updated 8 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows" ☆117 Updated last month
- Official repository for the paper "Number Cookbook: Number Understanding of Language Models and How to Improve It" ☆19 Updated 8 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability ☆12 Updated 9 months ago
- Official code implementation for the ACL 2025 paper "Dynamic Scaling of Unit Tests for Code Reward Modeling" ☆26 Updated 7 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆34 Updated last year
- [NeurIPS 2024 Oral] Repository of the CMuST paper: "Get Rid of Isolation: A Continuous Multi-task Spatio-Temporal Learning Framework" ☆15 Updated 9 months ago
- ☆89 Updated last week
- The code of RouterDC ☆69 Updated 8 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆143 Updated 3 months ago
- [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆83 Updated last year
- Repo for EmbedLLM: Learning Compact Representations of Large Language Models ☆26 Updated 2 months ago
- JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence ☆70 Updated 3 weeks ago
- From Commands to Prompts: LLM-based Semantic File System for AIOS ☆40 Updated 9 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆70 Updated 6 months ago
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆96 Updated 2 months ago
- ☆94 Updated 8 months ago
- ☆12 Updated 9 months ago
- BlackGoose Rimer: RWKV as a Superior Architecture for Large-Scale Time Series Modeling ☆29 Updated 5 months ago
- A Survey of Direct Preference Optimization (DPO) ☆86 Updated 5 months ago
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆27 Updated last year