JoshuaPurtell / LRCBench
Evals meant to evaluate language models' ability to reason over long contexts.
☆9 · Updated 8 months ago
Alternatives and similar repositories for LRCBench
Users who are interested in LRCBench are comparing it to the repositories listed below.
- ☆66 · Updated last year
- A repository of projects and datasets under active development by Alignment Lab AI ☆22 · Updated last year
- ☆59 · Updated 2 weeks ago
- Code for Columbia University COMS 3997 – LLM Ethics and Foundations ☆14 · Updated 5 months ago
- Chat Markup Language conversation library ☆55 · Updated last year
- look how they massacred my boy ☆63 · Updated 7 months ago
- ☆114 · Updated 5 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆78 · Updated 2 months ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆32 · Updated 3 months ago
- Lego for GRPO ☆28 · Updated last week
- The original BabyAGI, updated with LiteLLM and no vector database reliance (CSV instead) ☆21 · Updated 8 months ago
- Train your own SOTA deductive reasoning model ☆93 · Updated 3 months ago
- ☆19 · Updated last year
- Automatic fine-tuning of models with synthetic data ☆75 · Updated last year
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆49 · Updated 8 months ago
- ☆38 · Updated 10 months ago
- Mine-tuning is a methodology for synchronizing human and AI attention. ☆19 · Updated 11 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆38 · Updated last month
- 🦾💻🌐 Distributed training & serverless inference at scale on RunPod ☆17 · Updated last year
- ☆16 · Updated 4 months ago
- Verbosity control for AI agents ☆63 · Updated last year
- Coloring terminal text with intensities (used for plotting probability and entropy with tokens) ☆12 · Updated 7 months ago
- Small, simple agent task environments for training and evaluation ☆18 · Updated 7 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆90 · Updated 4 months ago
- ☆75 · Updated 2 weeks ago
- Testing paligemma2 finetuning on a reasoning dataset ☆18 · Updated 5 months ago
- tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM. ☆38 · Updated 8 months ago
- BH hackathon ☆14 · Updated last year
- ☆54 · Updated 4 months ago
- MLX port of xjdr's entropix sampler (mimics the JAX implementation) ☆64 · Updated 7 months ago