HKUST-KnowComp / NewtonBench
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
☆132Updated 3 weeks ago
Alternatives and similar repositories for NewtonBench
Users that are interested in NewtonBench are comparing it to the libraries listed below
- Code for "FaithLens: Detecting and Explaining Faithfulness Hallucination"☆95Updated last week
- Code implementation of the paper accepted by IEEE TKDE2024: "Make Heterophilic Graphs Better Fit GNN: A Graph Rewiring Approach"☆111Updated last year
- Dataset and evaluation code of ISDrama(ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting☆237Updated 4 months ago
- A lightweight intelligent agent framework implementing the complete ReAct pattern☆177Updated 4 months ago
- Group Expectation Policy Optimization for Heterogeneous Reinforcement Learning☆164Updated last month
- [ACL 2025 Oral] QAEncoder: Towards Aligned Representation Learning in Question Answering Systems☆176Updated 6 months ago
- ☆357Updated 6 months ago
- A benchmark suite for evaluating LLM-based interactive scientific reasoning.☆91Updated last week
- Official implementation of CIKM2024 paper titled "PROSPECT: Learn MLPs on Graphs Robust against Adversarial Structure Attacks"☆22Updated 11 months ago
- Awesome Literature Graph Learning Challenges☆100Updated 3 months ago
- A powerful multi-format file parsing, data cleaning, and AI annotation toolkit.☆144Updated last month
- Official Implementation of FastMCTS: A Simple Sampling Strategy for Data Synthesis☆112Updated 6 months ago
- The codes for the paper One-bit Deep Hashing: Towards a Resource-Efficient Hashing Model with Binary Neural Networks (ACMMM24)☆45Updated 10 months ago
- The code for Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models (Finding of ACL2025)☆83Updated 6 months ago
- Intended to help students review data structures☆43Updated this week
- [VLDB 2025] SimRN: Trajectory Similarity Learning in Road Networks based on Distributed Deep Reinforcement Learning☆106Updated 8 months ago
- [NeurIPS 2025] Official implementation of "STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization"☆75Updated 2 months ago
- [COLM 2025] Assessing Judging Bias in Large Reasoning Models: An Empirical Study https://openreview.net/pdf?id=SlRtFwBdzP☆163Updated 3 months ago
- Repository for the paper:☆69Updated last year
- 4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions☆171Updated last year
- ☆85Updated 10 months ago
- An Integrated Library for Tuning, Deploying and Interpreting Genomic Models☆119Updated 3 months ago
- use v402 merchant excellent way☆82Updated 3 weeks ago
- ☆75Updated 7 months ago
- https://arxiv.org/abs/2510.10004☆83Updated 3 months ago
- MCP server for Oura API integration☆110Updated 3 weeks ago
- (EMNLP 2025 Findings) Source Evaluation scripts for Humanity's Last Code Exam☆95Updated 4 months ago
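- Spring project: a personalized navigation service that supports setting weights for time, price, and distance, and updates road conditions and estimated arrival times based on the driving status of a large number of users☆22Updated 8 months ago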
- Marco Search Agent for Realistic and Challenging Agentic Search☆240Updated 2 months ago
- A project that aims to improve LLMs' pixel reasoning ability.☆80Updated 4 months ago