Reproducible and flexible LLM evaluations for scientific reasoning.
☆28Jul 23, 2025Updated 8 months ago
Alternatives and similar repositories for lm-open-science-evaluation
Users who are interested in lm-open-science-evaluation are comparing it to the libraries listed below.
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆23Feb 17, 2025Updated last year
- ☆18Mar 2, 2026Updated last month
- ☆30Dec 7, 2025Updated 4 months ago
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆13May 5, 2025Updated 11 months ago
- PhysReason Benchmark☆19Jul 8, 2025Updated 9 months ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning☆117Feb 2, 2026Updated 2 months ago
- ☆13Nov 11, 2022Updated 3 years ago
- ☆79May 22, 2024Updated last year
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail…☆12Jul 26, 2024Updated last year
- Chat with Excel / CSV Data.☆14Mar 22, 2024Updated 2 years ago
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆24Oct 22, 2025Updated 5 months ago
- Chat with documents, aiming to let GPT improve its code-related responses after receiving the latest documentation☆10Apr 2, 2023Updated 3 years ago
- Universal Adversarial Perturbations for Vision-Language Pre-trained Models☆24Aug 8, 2025Updated 8 months ago
- Official repository of paper "Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models"☆24May 27, 2025Updated 10 months ago
- Reformatted Alignment☆111Sep 23, 2024Updated last year
- Repository for our paper "DeepEdit: Knowledge Editing as Decoding with Constraints". https://arxiv.org/abs/2401.10471☆21Jun 19, 2024Updated last year
- Creating high resolution background images☆14Sep 8, 2015Updated 10 years ago
- ☆34Aug 14, 2025Updated 8 months ago
- SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models☆27Jul 13, 2025Updated 9 months ago
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"☆21Jul 31, 2023Updated 2 years ago
- Werewolves Assistant Web is a Vue web app using the Werewolves Assistant API. Thanks to this app, be the game master of the Werewolves ga…☆16Feb 22, 2023Updated 3 years ago
- Implementations of some common attack algorithms☆16May 13, 2021Updated 4 years ago
- A library for open domain query facet extraction and generation☆16Apr 24, 2024Updated last year
- Medium API Ruby Client☆22Oct 25, 2017Updated 8 years ago
- ☆16Sep 6, 2024Updated last year
- This is a repo consisting of papers about LLMs' perception of their knowledge boundaries; Uncertainty Quantification; Honesty Alignment; …☆24Nov 25, 2025Updated 4 months ago
- Detecting pallets in factories and the holes in those pallets☆14Sep 11, 2023Updated 2 years ago
- ☆13Jun 21, 2022Updated 3 years ago
- ☆27Oct 7, 2025Updated 6 months ago
- (NeurIPS 2025 D&B Track) OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps☆26Mar 27, 2026Updated 3 weeks ago
- ☆16Jul 26, 2023Updated 2 years ago
- Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.☆51Mar 8, 2026Updated last month
- Subgraph Based Learning of Contextual Embedding☆29Nov 5, 2021Updated 4 years ago
- Official implementation of Panacea: A foundation model for clinical trial design, recruitment, search, and summarization.☆18Dec 24, 2024Updated last year
- ☆13Jun 16, 2021Updated 4 years ago
- ☆16Sep 4, 2025Updated 7 months ago
- Official code for the paper Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. The code is based on t…☆19Aug 5, 2025Updated 8 months ago
- SIGIR 2022: GERE: Generative Evidence Retrieval for Fact Verification☆20Jul 19, 2022Updated 3 years ago
- A project that can generate ancient poems based on pictures, including CLIP, T5, GPT2 models☆21Feb 16, 2025Updated last year