JoshuaPurtell / LRCBench
Evals meant to evaluate language models' ability to reason over long contexts.
☆9, updated 7 months ago
Alternatives and similar repositories for LRCBench:
Users who are interested in LRCBench are comparing it to the repositories listed below.
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆32, updated last month
- distributed training & serverless inference at scale on RunPod ☆17, updated 11 months ago
- ☆66, updated 11 months ago
- look how they massacred my boy ☆63, updated 6 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆78, updated last month
- ☆38, updated 9 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆90, updated 3 months ago
- Train your own SOTA deductive reasoning model ☆88, updated last month
- Chat Markup Language conversation library ☆55, updated last year
- ☆48, updated 5 months ago
- tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM. ☆38, updated 7 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆46, updated 6 months ago
- LLM reads a paper and produces a working prototype ☆52, updated 2 weeks ago
- A repository of projects and datasets under active development by Alignment Lab AI ☆22, updated last year
- Clue inspired puzzles for testing LLM deduction abilities ☆33, updated last month
- MLX port for xjdr's entropix sampler (mimics the JAX implementation) ☆64, updated 5 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆49, updated 2 months ago
- smolLM with Entropix sampler on PyTorch ☆151, updated 5 months ago
- An introduction to LLM Sampling ☆77, updated 4 months ago
- Simple GRPO scripts and configurations. ☆58, updated 2 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39, updated 2 months ago
- ☆14, updated 3 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17, updated 6 months ago
- A Collection of Pydantic Models to Abstract IRL ☆18, updated this week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆63, updated last month
- Lego for GRPO ☆27, updated 3 weeks ago
- ☆51, updated 5 months ago
- Using multiple LLMs for ensemble forecasting ☆16, updated last year
- ☆20, updated last year
- The original BabyAGI, updated with LiteLLM and no vector database reliance (CSV instead) ☆21, updated 6 months ago