Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing & regression-catching.
☆16Apr 30, 2026Updated 2 months ago
Alternatives and similar repositories for tiny_qa_benchmark_pp
Users that are interested in tiny_qa_benchmark_pp are comparing it to the libraries listed below.
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"☆54Jul 16, 2024Updated last year
- Seamless Voice Interactions with LLMs☆12Oct 28, 2023Updated 2 years ago
- ☆10Oct 24, 2024Updated last year
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- A Structured Output Benchmark whose 'ground-truth' is actually right☆19Dec 5, 2025Updated 6 months ago
- Source code for GreaTer ICLR 2025 - Gradient Over Reasoning makes Smaller Language Models Strong Prompt Optimizers☆36Apr 18, 2025Updated last year
- MPI Code Generation through Domain-Specific Language Models☆16Nov 19, 2024Updated last year
- Example for Logging LLM Evaluator Prompt Responses☆18Aug 14, 2023Updated 2 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆16Jun 16, 2024Updated 2 years ago
- Python library to add support for embedding natural code in Python with shared program state.☆30Jan 20, 2026Updated 5 months ago
- Improving transparency of large language models' reasoning☆15Nov 25, 2025Updated 7 months ago
- ☆15Mar 12, 2024Updated 2 years ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆18Dec 19, 2024Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆20Mar 4, 2025Updated last year
- A Data Source for Reasoning Embodied Agents☆19Sep 18, 2023Updated 2 years ago
- ☆16Jul 23, 2024Updated last year
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 10 months ago
- A Gradio app for analyzing audio files to determine true sample rate and bit depth.☆20Sep 17, 2024Updated last year
- Official implementation of Data Contamination Can Cross Language Barriers☆12Sep 11, 2024Updated last year
- ☆12Apr 25, 2026Updated 2 months ago
- ☆16Apr 10, 2025Updated last year
- ☆20Jan 27, 2024Updated 2 years ago
- Official pytorch implementation for TVQ-VAE☆12Feb 27, 2024Updated 2 years ago
- ☆22Sep 29, 2024Updated last year
- [ICCV2025] WikiAutoGen official page☆25Feb 6, 2026Updated 4 months ago
- ☆29Aug 27, 2025Updated 10 months ago
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf…☆25Oct 14, 2024Updated last year
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning"☆21Oct 28, 2024Updated last year
- PDF FLARE demo with Langchain and Cassandra as Vector Store☆14Nov 15, 2023Updated 2 years ago
- Give an AI an article and it automatically generates a playable RPG Maker MZ game☆88Mar 5, 2026Updated 3 months ago
- [ACL 2026 Oral] Official implementation of LaMI: Augmenting Large Language Models via Late Multi-Image Fusion☆19May 18, 2026Updated last month
- ☆19Nov 4, 2025Updated 7 months ago
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆21Dec 10, 2024Updated last year
- ☆23Sep 2, 2025Updated 9 months ago
- PyTorch code for System-1.x: Learning to Balance Fast and Slow Planning with Language Models☆25Jul 22, 2024Updated last year
- ☆65May 21, 2026Updated last month
- The official GitHub page for ''Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with …☆30Dec 12, 2024Updated last year
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models☆20Jun 4, 2025Updated last year
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆25Sep 26, 2024Updated last year