bird-bench / BIRD-CRITIC-1
[NeurIPS 2025 Main] SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications
☆771Updated 3 weeks ago
Alternatives and similar repositories for BIRD-CRITIC-1
Users who are interested in BIRD-CRITIC-1 are comparing it to the libraries listed below.
- ☆354Updated 4 months ago
- Science-Star: A Platform for Building, Extending, and Experimenting with Scientific Agents.☆734Updated 2 weeks ago
- RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation☆56Updated last month
- Repo-level benchmark for real-world Code Agents: from repo understanding → env setup → incremental dev/bug-fixing → task delivery, with c…☆230Updated last month
- https://dev.to/answeryt/the-demo-spell-and-production-dilemma-of-ai-agents-how-i-built-a-self-learning-agent-system-4okk☆2,089Updated last week
- 智川x-agent☆950Updated 2 months ago
- ☆514Updated 7 months ago
- [TKDE2025] Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL | A curated list of resources (surveys, papers, benchma…☆437Updated this week
- We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for comple…☆1,097Updated 2 months ago
- Auto-Manage Your Personal Task Context with AI.☆937Updated last week
- ☆12Updated 8 months ago
- ☆1,115Updated 3 months ago
- AI-powered tool for efficient abstract and PDF screening in systematic reviews.☆1,302Updated 5 months ago
- ☆777Updated 2 months ago
- [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models☆1,138Updated last week
- We introduce temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of Multimodal foundation models (MFM…☆310Updated 9 months ago
- On Predictability of Reinforcement Learning Dynamics for Large Language Models☆20Updated 3 weeks ago
- MATEval is the first multi-agent framework simulating human collaborative discussion for open-ended text evaluation.☆28Updated 4 months ago
- A tool for translating the content of LaTeX documents into various other natural languages (e.g., translating an arXiv paper from English…☆407Updated last month
- ☆529Updated 8 months ago
- [BIRD-INTERACT] Re-imagines Text-to-SQL evaluation through the lens of dynamic interactions.☆331Updated this week
- JittorGeometric is a Jittor-based graph machine learning library.☆322Updated last month
- UpTop is a BNB Chain-based liquidity protocol that allows users to unilaterally add BNB to liquidity pools, earn high yields, and support…☆75Updated 4 months ago
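- A full-featured AI server-side implementation for Feishu chatbots: with a single container, you can operate your own Manus from within a Feishu chat window.☆507Updated 2 months ago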
- Open-Tax is an AI-powered cloud platform transforming tax compliance through automated data integration, real-time anomaly detection, and…☆407Updated 8 months ago
- Tokenize The Virtual Agents Onchain☆241Updated 4 months ago
- Framework that enables fine-tuning of vision-language grounding models on custom datasets☆600Updated 6 months ago
- DeepWism R2 is a next-generation AGI system built on the T3CEDS framework (Thin-Thick-Thin Crowd Entropy Dynamics System), which redefine…☆1,019Updated 3 months ago
- A new approach to data insight☆1,006Updated 4 months ago
- A database operations and data analysis AI agent☆426Updated last month