suyoumo / DeepClaude_Benchmark
This project is designed to evaluate the effectiveness of DeepClaude and other combination models.
☆41 Updated 9 months ago
Alternatives and similar repositories for DeepClaude_Benchmark
Users that are interested in DeepClaude_Benchmark are comparing it to the libraries listed below
- An upgraded version of DeepClaude Rust ☆209 Updated 8 months ago
- ☆209 Updated last month
- ☆291 Updated 5 months ago
- LLM RAG Intelligent Q&A Robot ☆85 Updated 3 months ago
- Repo-level benchmark for real-world Code Agents: from repo understanding → env setup → incremental dev/bug-fixing → task delivery, with c… ☆240 Updated 3 months ago
- A powerful multi-format file parsing, data cleaning, and AI annotation toolkit. ☆142 Updated last week
- ☆121 Updated this week
- A comprehensive platform for analyzing startup data, including a crawler system, data analysis tools, a startup-evaluation AI model, a web client, and a mini-program client ☆117 Updated 7 months ago
- 🔐 Enterprise-grade AI API security proxy - Securely access the DeepSeek API without exposing keys in the frontend ☆57 Updated 4 months ago
- ☆201 Updated last week
- ☆219 Updated 6 months ago
- EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall… ☆428 Updated 2 months ago
- ☆185 Updated 4 months ago
- A database operations and data analysis AI agent ☆430 Updated 3 months ago
- 🌐 Web Agent Protocol (WAP) - Record and replay user interactions in the browser with MCP support ☆486 Updated 6 months ago
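- 超能文献 | AI-driven document translation and academic search service. Supports high-quality translation of PDF, DOCX, PPTX, and other document formats (11 languages), with translation of mathematical formulas specially optimized. Also provides intelligent search of PubMed academic literature. More at: https://suppr.wilddata.cn ☆170 Updated last month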
- Valuation of tokens corresponding to influential individuals on social platforms through AI algorithms ☆229 Updated 3 months ago
- DPO-Shift: Shifting the Distribution of Direct Preference Optimization ☆60 Updated 9 months ago
- ☆38 Updated 8 months ago
- Marco Search Agent for Realistic and Challenging Agentic Search ☆238 Updated last month
- An AI written-test and problem-solving assistant that provides solution approaches and answers in real time during coding tests or interviews. ☆252 Updated 3 weeks ago
- (EMNLP 2025 Findings) Source Evaluation scripts for Humanity's Last Code Exam ☆95 Updated 4 months ago
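- Training an o1-like large language model from scratch. ☆132 Updated last week
- An intelligent academic sample-paper writing system based on multiple large language models; given a thesis proposal or research design document, it automatically generates each chapter of a sample academic paper, including citations. ☆235 Updated 5 months ago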
- Dataset and evaluation code of ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting ☆236 Updated 4 months ago
- 4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions ☆171 Updated last year
- An AI-powered multi-agent platform for automated investment research — combining LLM reasoning, RAG retrieval, and real-time market data … ☆149 Updated last month
- ☆175 Updated 3 months ago
- ☆82 Updated 2 months ago
- A multimodal personal assistant that allows Large Language Models (LLMs) to run code locally, acting as an autonomous agent capable of co… ☆206 Updated 11 months ago