[NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"
☆28 · May 28, 2024 · Updated 2 years ago
Alternatives and similar repositories for OOD-Math-Reasoning
Users that are interested in OOD-Math-Reasoning are comparing it to the libraries listed below.
- [ACL 2024] Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models ☆47 · Jun 4, 2024 · Updated 2 years ago
- [NeurIPS 2025 D&B Track] Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts" ☆44 · May 22, 2025 · Updated last year
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆100 · Dec 19, 2024 · Updated last year
- [WMT 2022] Implementation of TAL-SJTU's system for WMT22 English-Livonian ☆23 · May 4, 2023 · Updated 3 years ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆22 · Jun 28, 2024 · Updated 2 years ago
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 3 years ago
- [ICML 2024] Adaptive decoding balances the diversity and coherence of open-ended text generation. ☆19 · Jun 2, 2024 · Updated 2 years ago
- Official repository for Decentralized Arena via Collective LLM Intelligence ☆18 · May 19, 2025 · Updated last year
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆31 · Mar 5, 2024 · Updated 2 years ago
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated) ☆22 · Oct 17, 2022 · Updated 3 years ago
- A Data Source for Reasoning Embodied Agents ☆19 · Sep 18, 2023 · Updated 2 years ago
- [ICLR24] code for LSN ☆10 · Oct 28, 2024 · Updated last year
- source code for NeurIPS'24 paper "HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection" ☆70 · Apr 11, 2025 · Updated last year
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆191 · May 20, 2025 · Updated last year
- Official PyTorch code for UAI 2023 paper "Concurrent Misclassification and Out-of-Distribution Detection for Semantic Segmentation via En… ☆12 · Nov 10, 2023 · Updated 2 years ago
- ☆62 · Oct 14, 2024 · Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models" ☆28 · Mar 4, 2025 · Updated last year
- This is the implementation of LeCo ☆33 · Jan 20, 2025 · Updated last year
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆135 · Dec 12, 2023 · Updated 2 years ago
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs. ☆43 · Oct 31, 2025 · Updated 7 months ago
- Code for "Fusion Label Enhancement for Multi-Label Learning" in IJCAI-ECAI 2022. ☆10 · Apr 4, 2023 · Updated 3 years ago
- ☆30 · Jun 19, 2023 · Updated 3 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 · Aug 28, 2025 · Updated 10 months ago
- ☆23 · Nov 15, 2022 · Updated 3 years ago
- ☆20 · Nov 3, 2024 · Updated last year
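- (1) Proposes a new similarity loss (SimL) that increases inter-class differences while reducing intra-class differences; (2) optimizes a CNN with SimL + CE; (3) diagnoses bearing faults from electrical signals ☆15 · Jun 4, 2024 · Updated 2 years ago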
- ☆10 · Dec 21, 2022 · Updated 3 years ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆65 · Apr 4, 2025 · Updated last year
- ☆13 · Jan 14, 2026 · Updated 5 months ago
- ☆21 · Oct 11, 2025 · Updated 8 months ago
- Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset ☆13 · Nov 19, 2022 · Updated 3 years ago
- ☆29 · Aug 27, 2025 · Updated 10 months ago
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy" ☆17 · Feb 25, 2025 · Updated last year
- Cross-Attention Guided Loss-Based Deep Dual-Branch Fusion Network for Liver Tumor Classification ☆15 · Sep 26, 2024 · Updated last year
- The repository for our paper: Neighboring Perturbations of Knowledge Editing on Large Language Models ☆16 · May 4, 2024 · Updated 2 years ago
- Code for training a language model reaction predictor. (To accompany our paper on the OOD evaluation of reaction predictors). ☆12 · Jan 13, 2025 · Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆69 · Mar 27, 2025 · Updated last year
- Docling workshops ☆42 · May 27, 2026 · Updated last month
- code of Let the data choose: Flexible and Diverse Anchor Graph Fusion for Scalable Multi-view Clustering ☆17 · Feb 28, 2023 · Updated 3 years ago