SciMT / SciMT-benchmark
☆11Updated last year
Alternatives and similar repositories for SciMT-benchmark
Users who are interested in SciMT-benchmark compare it to the repositories listed below.
- SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models☆19Updated 7 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆25Updated last month
- PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes [EMNLP 2024]☆25Updated 7 months ago
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"☆37Updated last year
- Pre-trained Language Model for Scientific Text☆45Updated last year
- The source code for running LLMs on the AAAR-1.0 benchmark.☆16Updated 2 months ago
- ☆20Updated 4 years ago
- Structured Chemistry Reasoning with Large Language Models☆39Updated last year
- implementation of dualformer☆17Updated 3 months ago
- ☆20Updated last month
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆26Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Updated last year
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension☆45Updated 6 months ago
- A trainable user simulator☆34Updated 9 months ago
- ☆10Updated 2 months ago
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆14Updated 2 weeks ago
- ☆13Updated last year
- ☆16Updated 8 months ago
- exploring whether LLMs perform case-based or rule-based reasoning☆29Updated last year
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆23Updated 2 months ago
- ☆16Updated 11 months ago
- Preparing for ML Interviews.☆11Updated 2 months ago
- ☆20Updated 7 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme…☆17Updated 7 months ago
- On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability☆39Updated 2 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆26Updated last year
- Official repository for Decentralized Arena via Collective LLM Intelligence☆14Updated last month
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)☆54Updated 7 months ago
- ☆10Updated last month
- ☆42Updated 2 months ago