☆25 · Nov 19, 2025 · Updated 3 months ago
Alternatives and similar repositories for WildVisualizer
Users who are interested in WildVisualizer are comparing it to the libraries listed below.
- ☆49 · Apr 4, 2025 · Updated 10 months ago
- ☆16 · Sep 4, 2025 · Updated 5 months ago
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents ☆60 · Feb 17, 2026 · Updated last week
- From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆46 · Dec 25, 2025 · Updated 2 months ago
- Fast, permanent and flexible patterns for sharing and computing on texts with metadata using Apache Arrow. ☆15 · Mar 1, 2022 · Updated 3 years ago
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- ☆19 · Mar 25, 2025 · Updated 11 months ago
- Llemma formal2formal (tactic prediction) theorem proving experiments ☆20 · Oct 17, 2023 · Updated 2 years ago
- Example formalization of Game Theoretic concepts in Lean ☆25 · Feb 14, 2025 · Updated last year
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme… ☆19 · Nov 12, 2024 · Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆27 · Mar 6, 2024 · Updated last year
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing" ☆23 · Mar 18, 2025 · Updated 11 months ago
- Official implementation of "Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving" ☆29 · May 8, 2025 · Updated 9 months ago
- ☆21 · Aug 30, 2025 · Updated 5 months ago
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 · Jun 3, 2024 · Updated last year
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆29 · May 22, 2025 · Updated 9 months ago
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆50 · Sep 4, 2025 · Updated 5 months ago
- This repository contains data, code and models for contextual noncompliance. ☆25 · Jul 18, 2024 · Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models ☆33 · May 21, 2025 · Updated 9 months ago
- Rubik ESP32 esp-idf Device driver library. ☆12 · Jul 3, 2021 · Updated 4 years ago
- ☆76 · Jan 8, 2026 · Updated last month
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 7 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆35 · Apr 17, 2025 · Updated 10 months ago
- A long-horizon, sparse-reward math environment for reinforcement learning. Official code repo for "What makes Math problems hard for rein… ☆32 · Aug 11, 2025 · Updated 6 months ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 · Apr 7, 2025 · Updated 10 months ago
- Auditing agents for fine-tuning safety ☆18 · Oct 21, 2025 · Updated 4 months ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism ☆30 · Jul 17, 2024 · Updated last year
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- Verilog code for a low power RFID chip that will communicate with I2C sensors. ☆13 · Apr 18, 2014 · Updated 11 years ago
- Bayes-Adaptive RL for LLM Reasoning ☆45 · May 28, 2025 · Updated 9 months ago
- AgenTracer: A Lightweight Failure Attributor for Agentic Systems ☆76 · Nov 12, 2025 · Updated 3 months ago
- MobileLLM-R1 ☆75 · Sep 30, 2025 · Updated 4 months ago
- High-resolution time-to-digital converter in the Red Pitaya Zynq-7010 SoC ☆10 · Jul 12, 2020 · Updated 5 years ago
- ☆72 · Jan 29, 2026 · Updated last month
- Maintenance Information Extraction (MaintIE) ☆16 · Jun 29, 2024 · Updated last year
- LC6500DMD python control ☆11 · Nov 15, 2016 · Updated 9 years ago
- ☆11 · Jun 22, 2025 · Updated 8 months ago