adobe-research / NoLiMa
Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"
☆128 Updated 3 weeks ago
Alternatives and similar repositories for NoLiMa
Users who are interested in NoLiMa are comparing it to the repositories listed below
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆448 Updated 10 months ago
- ☆130 Updated 4 months ago
- Plug-and-play tree search for agents ☆259 Updated 2 weeks ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆381 Updated this week
- Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions" ☆347 Updated 7 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆146 Updated 5 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆108 Updated 3 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆222 Updated this week
- ☆155 Updated 3 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆272 Updated 3 weeks ago
- ☆118 Updated 11 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 Updated 6 months ago
- A framework for optimizing DSPy programs with RL ☆96 Updated this week
- Code for training & evaluating Contextual Document Embedding models ☆196 Updated 2 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆283 Updated this week
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines ☆142 Updated 2 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆116 Updated 3 weeks ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆218 Updated this week
- ☆146 Updated 7 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆197 Updated this week
- Tutorial for building LLM router ☆221 Updated last year
- Routing on Random Forest (RoRF) ☆187 Updated 10 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆72 Updated 4 months ago
- II-Researcher: a new open-source framework designed to aid building search / research agents ☆457 Updated last week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆573 Updated this week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆278 Updated this week
- Train your own SOTA deductive reasoning model ☆104 Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆318 Updated 9 months ago
- ☆133 Updated 3 months ago
- ☆102 Updated last month