JoshuaPurtell / LRCBench

Evals meant to evaluate language models' ability to reason over long contexts.

☆10

Alternatives and similar repositories for LRCBench

Users who are interested in LRCBench are comparing it to the libraries listed below.

zhudotexe / redel
ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)
☆84 Updated 4 months ago
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated 9 months ago
ai8hyf / OpenResearchAssistant
An automated tool for discovering insights from research paper corpora
☆138 Updated last year
BBischof / yapping
Verbosity control for AI agents
☆65 Updated last year
doomslide / hyperobject
Plotting (entropy, varentropy) for small LMs
☆98 Updated 2 months ago
diicellman / dspy-gradio-rag
RAG example using DSPy, Gradio, FastAPI
☆83 Updated last year
allenai / genesys
Source code and utilities for the Genesys distributed language model architecture discovery system.
☆47 Updated last month
QuixiAI / kraken
☆66 Updated last year
QuixiAI / dolphin-logger
☆102 Updated last month
sam-paech / diplobench
Benchmark for LLMs playing full press diplomacy
☆53 Updated 5 months ago
brendanhogan / picoDeepResearch
☆65 Updated 2 months ago
swairshah / Intensify
coloring terminal text with intensities (used for plotting probability, entropy with tokens)
☆12 Updated 10 months ago
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 Updated 11 months ago
catena-labs / moa-llm
A Python library to orchestrate LLMs in a neural network-inspired structure
☆50 Updated 10 months ago
jerber / arc_agi
☆56 Updated last month
AtakanTekparmak / tiny_fnc_engine
tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM.
☆38 Updated 11 months ago
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66 Updated 9 months ago
JD-P / RetroInstruct
Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search.
☆32 Updated 5 months ago
Alignment-Lab-AI / KnowledgeBase
never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…
☆37 Updated last year
yoheinakajima / autofinetune
auto fine tune of models with synthetic data
☆76 Updated last year
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆150 Updated 9 months ago
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 6 months ago
teknium1 / ShareGPT-Builder
☆116 Updated 7 months ago
SohamGovande / podplex
🦾💻🌐 distributed training & serverless inference at scale on RunPod
☆18 Updated last year
GoodAI / goodai-ltm-benchmark
A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…
☆76 Updated 7 months ago
marketagents-ai / MarketAgents
A distributed agent orchestration framework for market agents
☆105 Updated this week
Doriandarko / MLX-GRPO
A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon.
☆42 Updated 6 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆104 Updated 5 months ago
vgel / logitloom
explore token trajectory trees on instruct and base models
☆134 Updated 2 months ago
tom-doerr / simpledspy
☆113 Updated last month