Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"
☆198 · Jul 17, 2025 · Updated 10 months ago
Alternatives and similar repositories for NoLiMa
Users that are interested in NoLiMa are comparing it to the libraries listed below.
- ☆57 · Oct 10, 2025 · Updated 8 months ago
- A starter template to collect emails with NuxtJs using NuxtHub. ☆26 · Jan 15, 2025 · Updated last year
- Use machine learning to predict lens distortion parameters from physical camera properties. ☆17 · Jan 6, 2025 · Updated last year
- This repository contains training code for the Gemamba VLM ☆14 · Jul 3, 2024 · Updated last year
- Public Evaluation Result Archive for BFCL ☆30 · Dec 17, 2025 · Updated 5 months ago
- Spectral Sphere Optimizer ☆118 · Mar 23, 2026 · Updated 2 months ago
- A simple frontend page to interact with an OpenAI like API ☆16 · Jan 31, 2025 · Updated last year
- Loomis Painter: Reconstructing the painting process ☆54 · Nov 24, 2025 · Updated 6 months ago
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆249 · Jun 1, 2026 · Updated last week
- A RAG system designed to process documents with multimodal content. It can generate factual, context-aware answers to user queries, based… ☆26 · Dec 13, 2024 · Updated last year
- Expose MCP tools for LLMs ☆21 · Mar 22, 2025 · Updated last year
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- We believe that every SOTA result is only valid on its own dataset. RAGView provides a unified evaluation platform to benchmark different… ☆79 · Dec 5, 2025 · Updated 6 months ago
- Core, Junction, and VRAM temperature reader for Linux + GDDR6/GDDR6X GPUs ☆86 · Oct 22, 2025 · Updated 7 months ago
- Use Codestral Mamba with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot. ☆30 · Jul 18, 2024 · Updated last year
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆38 · Apr 1, 2025 · Updated last year
- ☆13 · Apr 30, 2026 · Updated last month
- llmon-py is a multimodal webui for Llama 3-8B. ☆15 · Jul 1, 2024 · Updated last year
- Documents the style side of the short-story Creative Writing LLM benchmark: we generated many short stories with a range of LLMs, then an… ☆24 · Dec 18, 2025 · Updated 5 months ago
- ☆25 · Oct 6, 2023 · Updated 2 years ago
- smolbox of recipes ☆29 · Apr 23, 2025 · Updated last year
- ☆41 · Nov 22, 2025 · Updated 6 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆119 · Apr 22, 2025 · Updated last year
- ☆15 · Sep 24, 2024 · Updated last year
- Dockerfile and instructions for human pose estimation implementation using Caffe, OpenCV 3.1.0 and Python 2.7. ☆12 · Mar 3, 2019 · Updated 7 years ago
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆89 · Sep 22, 2024 · Updated last year
- BigKnow2022: Bringing Language Models Up to Speed ☆16 · Mar 27, 2023 · Updated 3 years ago
- This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models? ☆1,561 · Nov 13, 2025 · Updated 7 months ago
- Lego for GRPO ☆30 · May 27, 2025 · Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆84 · Jan 14, 2025 · Updated last year
- QWOP AI using Q-learning ☆12 · Jul 13, 2016 · Updated 9 years ago
- Modified version of fairseq, including new implementations for criterions using reinforcement learning methods. ☆11 · Aug 14, 2019 · Updated 6 years ago
- LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets ☆43 · Sep 30, 2024 · Updated last year
- DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang ☆44 · Nov 19, 2025 · Updated 6 months ago
- A Conversational Speech Generation Model ☆14 · Mar 16, 2025 · Updated last year
- The official repo for "LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling" ☆165 · May 15, 2026 · Updated last month
- MXNET + OpenAI Gym implementation of A3C from "Asynchronous Methods for Deep Reinforcement Learning" ☆11 · Apr 10, 2017 · Updated 9 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated 2 years ago
- Official implementation of ECCV24 paper: POA ☆24 · Aug 8, 2024 · Updated last year