Hambaobao/Marathon

Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.

☆10

Alternatives and similar repositories for Marathon

Users who are interested in Marathon are comparing it to the repositories listed below.

RainBowLuoCS / MMEvol
(ACL 2025) 🔥🔥🔥 Code for "Empowering Multimodal Large Language Models with Evol-Instruct"
☆22 · May 15, 2025 · Updated last year
MozerWang / promISe
[COLING 2024 (Oral)] PromISe: Releasing the Capabilities of LLMs with Prompt Introspective Search
☆23 · Aug 26, 2024 · Updated last year
Hambaobao / SWE-Flow
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner
☆40 · Jun 29, 2025 · Updated last year
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year
RainBowLuoCS / DEEM
(ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.
☆51 · Jul 1, 2025 · Updated last year
IPBench / IPBench
[ACL 2026] Repository of IPBench
☆23 · Apr 6, 2026 · Updated 3 months ago
ZNLP / Language-Imbalance-Driven-Rewarding
[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving
☆25 · Apr 6, 2026 · Updated 3 months ago
MozerWang / Loong
[EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
☆155 · Dec 22, 2025 · Updated 7 months ago
apple / ml-mia-bench
This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
☆38 · Mar 9, 2025 · Updated last year
II-Bench / II-Bench
☆28 · Oct 28, 2024 · Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆61 · Jul 23, 2024 · Updated 2 years ago
MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago
ritzz-ai / PACS
☆31 · Sep 12, 2025 · Updated 10 months ago
Yazdi9 / Stable-DreamFusion-NeRF
Text-To-3D dream-fusion
☆13 · Apr 10, 2023 · Updated 3 years ago
CSUBioGroup / DeepEP
a deep learning framework for essential protein prediction
☆13 · Mar 24, 2023 · Updated 3 years ago
hpcaitech / GPT-Demo
GPT Demo with hybrid distributed training
☆10 · Dec 1, 2022 · Updated 3 years ago
Geaming2002 / Ruler
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
☆40 · Sep 30, 2024 · Updated last year
Yuxin104 / Opt-GDBA
☆14 · Nov 12, 2024 · Updated last year
gouzigouzi / attention-residuals-for-chinese-llms
A Chinese-focused PyTorch framework for exploring Attention Residuals in Qwen3-style causal LMs, with baseline, Block AttnRes, Full AttnR…
☆19 · May 3, 2026 · Updated 2 months ago
vietansegan / segan
Samplers, samplers, samplers
☆10 · Oct 31, 2016 · Updated 9 years ago
ljcleo / agent_sense
Benchmarking Social Intelligence of Language Agents through Interactive Scenarios
☆13 · Jan 4, 2025 · Updated last year
Speakn0w / PlotCraft-Benchmark
☆16 · Dec 10, 2025 · Updated 7 months ago
soyoung97 / AcuRank
☆15 · Jul 30, 2025 · Updated 11 months ago
fedebotu / NeurIPS2022-OpenReviewData
Crawl & Visualize NeurIPS 2022 Data from OpenReview
☆14 · Nov 8, 2022 · Updated 3 years ago
zchoi / SPT
[TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".
☆10 · Aug 14, 2024 · Updated last year
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
DNLab2024 / Mobile-LLaMA
☆21 · Nov 1, 2024 · Updated last year
learnwithexamples / 3gpp_spec
LTE and NR specifications, both zip and pdf versions.
☆16 · Dec 26, 2022 · Updated 3 years ago
WowCZ / LongMIT
LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets
☆43 · Sep 30, 2024 · Updated last year
DeligientSloth / AdversialNLP
☆15 · Jul 17, 2020 · Updated 6 years ago
gl-ybnbxb / BoNBoN
☆19 · Jun 3, 2024 · Updated 2 years ago
Pi-Star-Lab / csce642-deepRL
Assignments of CSCE-642: Deep Reinforcement Learning offered at Texas A&M University.
☆10 · Aug 31, 2025 · Updated 10 months ago
iwiwi / epochraft-hf-fsdp
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11 · Jan 29, 2024 · Updated 2 years ago
phucty / wtabhtml
Tool to parse wiki tables from the HTML dump of Wikipedia
☆11 · Jun 12, 2022 · Updated 4 years ago
LARS-research / TREFE
Searching a High Performance Feature Extractor for Text Recognition Network. TPAMI 2022
☆13 · Nov 25, 2022 · Updated 3 years ago
jhliu17 / spectral-clustering.matlab
An intuitive implementation of spectral clustering on matlab
☆16 · Apr 23, 2021 · Updated 5 years ago
xiaoboxia / RTM_LNL
Regularly Truncated M-estimators for Learning with Noisy Labels
☆11 · Apr 24, 2024 · Updated 2 years ago
MozerWang / DEMO
[ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
☆22 · Dec 16, 2024 · Updated last year
gokulp01 / meta-qlearning-humanoid
Meta QLearning experiments to optimize robot walking patterns
☆26 · Aug 21, 2024 · Updated last year