OpenDFM / SciEval
[AAAI 2024] SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
☆23 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for SciEval
- Structured Chemistry Reasoning with Large Language Models ☆31 · Updated 6 months ago
- ☆103 · Updated 4 months ago
- Pre-trained Language Model for Scientific Text ☆42 · Updated 8 months ago
- SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning (NeurIPS D&B Track 2024) ☆65 · Updated 8 months ago
- ☆33 · Updated 3 weeks ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆68 · Updated 3 weeks ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆67 · Updated last month
- InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆51 · Updated 3 weeks ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆90 · Updated this week
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆131 · Updated 4 months ago
- Benchmarking Agentic Workflow Generation ☆25 · Updated last week
- [ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs. ☆19 · Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆30 · Updated 9 months ago
- A method of ensemble learning for heterogeneous large language models. ☆30 · Updated 3 months ago
- Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks (ICLR 2023) ☆57 · Updated last year
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆166 · Updated last month
- Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆61 · Updated last week
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award) ☆56 · Updated last month
- Code implementation of synthetic continued pretraining ☆54 · Updated last month
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆127 · Updated last month
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 · Updated 4 months ago
- ☆68 · Updated 4 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆72 · Updated 8 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆36 · Updated 8 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆75 · Updated last month
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 · Updated 2 weeks ago
- ☆56 · Updated 8 months ago
- A trainable user simulator ☆26 · Updated last month
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆47 · Updated 2 weeks ago
- BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings) ☆95 · Updated last month