☆57 Jul 31, 2025 Updated 10 months ago
Alternatives and similar repositories for agentic-benchmarks
Users that are interested in agentic-benchmarks are comparing it to the libraries listed below.
- SysX ☆36 Mar 27, 2024 Updated 2 years ago
- [VLDB'2025] LEAP: LLM-powered End-to-end Automatic Library for Processing Social Science Queries on Unstructured Data ☆20 Nov 3, 2025 Updated 7 months ago
- ☆23 Dec 28, 2023 Updated 2 years ago
- ☆22 Aug 1, 2021 Updated 4 years ago
- ☆144 Jul 2, 2024 Updated last year
- ☆12 Jan 2, 2024 Updated 2 years ago
- Face Recognition on NVIDIA TX2 ☆10 Sep 5, 2018 Updated 7 years ago
- [ICLR 2026] Meta-RL Induces Exploration in Language Agents ☆42 Feb 1, 2026 Updated 4 months ago
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities ☆237 Jan 14, 2026 Updated 5 months ago
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data) ☆29 Dec 19, 2023 Updated 2 years ago
- Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning) ☆12 Nov 28, 2023 Updated 2 years ago
- Fuzzing Automatic Differentiation in Deep-Learning Libraries (ICSE'23) ☆27 Mar 2, 2024 Updated 2 years ago
- Building self-refined guardrails via DSPy ☆14 Jul 2, 2024 Updated last year
- ICLR 2026: Agent-X Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks ☆42 Apr 28, 2026 Updated last month
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 Jun 5, 2020 Updated 6 years ago
- This project applies Monte Carlo Tree Search (MCTS) to a simple grid world. ☆10 May 30, 2018 Updated 8 years ago
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models) ☆58 May 12, 2025 Updated last year
- Falcon Evaluate is an open-source Python library aims to revolutionise the LLM - RAG evaluation process by offering a low-code solution. … ☆14 Jan 31, 2024 Updated 2 years ago
- Design + Code for kelsanford.design ☆11 Jul 1, 2015 Updated 10 years ago
- ☆14 May 30, 2019 Updated 7 years ago
- Access MS SQL from Node.js using Edge.js ☆27 Jan 28, 2025 Updated last year
- Builds a WMT18-like corpus for word-level QE with annotations in the source and target words. ☆10 Sep 19, 2022 Updated 3 years ago
- General purpose benchmarking tool for turbopuffer deployments ☆38 Updated this week
- Ant Gather and Ant Maze envs, separated from RLLab ☆11 Aug 2, 2018 Updated 7 years ago
- Source Code for "Improved Embeddings for Learning Prerequisite Chains" (CPSC 490 - Senior Project) ☆11 May 2, 2019 Updated 7 years ago
- ☆10 Jun 5, 2025 Updated last year
- [ESEC/FSE'23] Hue: A User-Adaptive Parser for Hybrid Logs ☆10 Aug 24, 2023 Updated 2 years ago
- Few shot learning in NLP ☆11 Oct 1, 2020 Updated 5 years ago
- CVE-2023-28121 - WooCommerce Payments < 5.6.2 - Unauthenticated Privilege Escalation [ Mass Add Admin User ] ☆11 Jul 14, 2023 Updated 2 years ago
- ☆12 Mar 8, 2020 Updated 6 years ago
- AI agent skill for building modern, composable, and accessible React UI components following the components.build specification ☆50 Jan 28, 2026 Updated 4 months ago
- Mesh-native protocol and Rust connector library for secure AI agent integration with external services: Twitter, Linear, Stripe, Discord,… ☆80 Updated this week
- JOYTOU is a BootStrap blog template developed by Joytou Wu. ☆10 Feb 5, 2020 Updated 6 years ago
- Levin tree search guided by both a policy and a heuristic function ☆19 Jul 13, 2023 Updated 2 years ago
- Normalize text string ☆12 Nov 6, 2018 Updated 7 years ago
- Fuzzing Deep-Learning Libraries via Automated Relational API Inference (ESEC/FSE 2022) ☆39 May 17, 2023 Updated 3 years ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Dec 4, 2021 Updated 4 years ago
- ☆34 Sep 19, 2025 Updated 8 months ago
- Python probabilistic PCA (PPCA) implementation. ☆13 Nov 28, 2018 Updated 7 years ago