SciMT / SciMT-benchmark

☆11

Alternatives and similar repositories for SciMT-benchmark

Users who are interested in SciMT-benchmark are comparing it to the repositories listed below.

HICAI-ZJU / SciKnowEval
SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models
☆22Updated last week
QizhiPei / MathFusion
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)
☆27Updated last week
VITA-Group / o1-planning
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability
☆39Updated 2 weeks ago
LHL3341 / MetaLadder
☆10Updated 3 months ago
Leezekun / MMSci
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension
☆45Updated 7 months ago
ozyyshr / StructChem
Structured Chemistry Reasoning with Large Language Models
☆40Updated last year
ylsung / vl-merging
PyTorch code for the paper "An Empirical Study of Multimodal Model Merging"
☆37Updated last year
Alab-NII / Awesome-SciLM
Pre-trained Language Model for Scientific Text
☆45Updated last year
GraphPKU / Case_or_Rule
Exploring whether LLMs perform case-based or rule-based reasoning
☆29Updated last year
mathllm / Step-Controlled_DPO
☆22Updated last year
arnab-api / romba
Applies ROME and MEMIT on Mamba-S4 models
☆14Updated last year
WildVision-AI / LMM-Engines
☆16Updated 9 months ago
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20Updated last year
Yijia-Xiao / Know2BIO
Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs
☆14Updated last year
AgentForceTeamOfficial / UA2-Agent
Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme…
☆18Updated 8 months ago
maszhongming / ParaKnowTransfer
Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"
☆32Updated last year
dongxiangjue / Awesome-LLM-Self-Improvement
A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …
☆85Updated 7 months ago
IDEA-XL / PRESTO
PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes [EMNLP 2024]
☆28Updated 8 months ago
gregorbachmann / Next-Token-Failures
☆87Updated last year
yale-nlp / refdpo
☆16Updated last year
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆86Updated 9 months ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48Updated last year
haozheji / exact-optimization
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment
☆58Updated last year
facebookresearch / dualformer
implementation of dualformer
☆18Updated 4 months ago
Thinklab-SJTU / BiLAF
Official implementation of our NeurIPS 2024 paper "Boundary Matters: A Bi-Level Active Finetuning Method"
☆13Updated 5 months ago
smiles724 / Awesome-LLM-RLVR
Collection of latest papers and materials in the area of RLVR!
☆17Updated last month
wzq016 / PINE
Official repo of the paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"
☆14Updated last month
gersteinlab / ChemAgent
[ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
☆63Updated 4 months ago
tsinghua-fib-lab / SmartAgent
The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".
☆28Updated 4 months ago
ECNU-ICALK / MELO
[AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA
☆25Updated last year