JoshuaPurtell / LRCBench
Benchmarks for evaluating language models' ability to reason over long contexts.
★9 · Updated 6 months ago
Alternatives and similar repositories for LRCBench:
Users that are interested in LRCBench are comparing it to the libraries listed below
- distributed training & serverless inference at scale on RunPod · ★17 · Updated 10 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) · ★74 · Updated 2 weeks ago
- ★51 · Updated 4 months ago
- ★66 · Updated 10 months ago
- ★17 · Updated 3 months ago
- The next evolution of Agents · ★48 · Updated 2 weeks ago
- Synthetic data derived by templating, few-shot prompting, transformations on public-domain corpora, and Monte Carlo tree search. · ★31 · Updated last month
- Train your own SOTA deductive reasoning model · ★81 · Updated 3 weeks ago
- A Python library to orchestrate LLMs in a neural network-inspired structure · ★46 · Updated 5 months ago
- tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM. · ★38 · Updated 6 months ago
- look how they massacred my boy · ★63 · Updated 5 months ago
- Testing paligemma2 finetuning on a reasoning dataset · ★18 · Updated 3 months ago
- Verbosity control for AI agents · ★60 · Updated 10 months ago
- ★20 · Updated last year
- Small, simple agent task environments for training and evaluation · ★18 · Updated 5 months ago
- MLX port of xjdr's entropix sampler (mimics the JAX implementation) · ★63 · Updated 4 months ago
- High-level library for batched embeddings generation, blazingly fast web-based RAG, and quantized index processing · ★67 · Updated 4 months ago
- Lego for GRPO · ★25 · Updated 2 weeks ago
- ★48 · Updated 4 months ago
- ★83 · Updated last month
- ★111 · Updated 3 months ago
- The original BabyAGI, updated with LiteLLM and no vector-database reliance (CSV instead) · ★21 · Updated 6 months ago
- Automatic fine-tuning of models with synthetic data · ★75 · Updated last year
- ★41 · Updated 11 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… · ★60 · Updated 8 months ago
- ★48 · Updated last year
- Just a bunch of benchmark logs for different LLMs · ★119 · Updated 8 months ago
- Using various instructor clients to evaluate the quality and capabilities of extractions and reasoning. · ★50 · Updated 6 months ago
- Using multiple LLMs for ensemble forecasting · ★16 · Updated last year
- Mine-tuning is a methodology for synchronizing human and AI attention. · ★17 · Updated 9 months ago