A model evaluation system built on OpenCompass. It provides a front-end UI so that users can run evaluations on their own.
☆27Aug 25, 2025Updated 8 months ago
Alternatives and similar repositories for ai-eval-system
Users that are interested in ai-eval-system are comparing it to the libraries listed below.
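- An implementation of a vision-language multimodal model based on the lightweight Qwen2.5-0.5B and SigLIP, including training and SFT code. Shared to document the exploration and learning process; discussion is welcome.☆20Aug 31, 2025Updated 8 months ago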
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆33Aug 5, 2025Updated 9 months ago
- ☆10Jul 2, 2022Updated 3 years ago
- ☆25Mar 10, 2021Updated 5 years ago
- The Coze (扣子) agent API Java SDK wraps the Coze agent API so that Java developers can integrate and call it from their systems.☆31Sep 25, 2024Updated last year
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆18Nov 24, 2025Updated 5 months ago
- SLM-SQL: An Exploration of Small Language Models for Text-to-SQL☆32Aug 12, 2025Updated 8 months ago
- Environments, tools, and benchmarks for general computer agents☆15Dec 3, 2024Updated last year
- [ICDE 2026] Text2SQL-Flow: A Robust SQL-Aware Data Augmentation Framework for Text-to-SQL☆31Mar 25, 2026Updated last month
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆14Jul 27, 2025Updated 9 months ago
- ☆17Apr 11, 2025Updated last year
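- Complete code for "Become an NLP Master in 30 Days: Mastering Key Tools and Techniques", an honorable-mention entry in the "AI & Data" category of the 2023 iThome Ironman Contest (iThome 鐵人賽). The series teaches from scratch how to fine-tune large language models.☆18Nov 21, 2024Updated last year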
- The official implementation of the paper "Self-Updatable Large Language Models by Integrating Context into Model Parameters"☆15May 18, 2025Updated 11 months ago
- CozeBot-WxworkPro is a WeCom (企业微信) script that integrates the AI application development platform Coze (扣子). It lets you quickly build LLM-based bots that automatically handle WeCom messages and improve work efficiency.☆16Aug 7, 2024Updated last year
- This is the public repository of AAAI 2024 paper "Is a Large Language Model a Good Annotator for Event Extraction"☆10Feb 16, 2024Updated 2 years ago
- A collection of useful functions for KUKA KRL language☆13Jul 1, 2023Updated 2 years ago
- 3D Model(Autodesk DWG and DXF) to Pdf Conversion and Text Extraction using AutoCAD 2016 and AutoCAD API (ObjectARX)☆20Sep 5, 2016Updated 9 years ago
- ☆26Apr 10, 2025Updated last year
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 9 months ago
- SQLite3 database sink for spdlog☆16May 14, 2019Updated 6 years ago
- Lark (飞书, Feishu) API reverse engineering☆12Jun 26, 2023Updated 2 years ago
- A baseline for intelligent agent development☆27Jul 26, 2025Updated 9 months ago
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …☆11Sep 27, 2024Updated last year
- Universal AI-powered code reviewer using vLLM and/or Ollama provided local LLMs. Works with any language/project. Features persona system…☆43Mar 11, 2026Updated last month
- (ACL 2025) Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation☆12May 21, 2025Updated 11 months ago
- ☆19Feb 24, 2025Updated last year
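- A. Test case generation: automatically generates test cases by combining page content with the requirements document, addressing the poor quality of test cases generated from a single requirements document. Test Case Generation: Automatically generate test cases based on the current page and…☆56Apr 10, 2025Updated last year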
- Instruction Following Eval☆17Jan 16, 2025Updated last year
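- Paper Fetcher Project is an open-source Python project that automates fetching academic papers from multiple scholarly sources (such as ArXiv, Google Scholar, and PubMed). The tool can fetch on a schedule and deduplicate the saved paper data, helping researchers keep their literature up to date and organized.☆24Nov 20, 2024Updated last year
- Chinese localization of H1ve-theme and CTFd-owl☆18Nov 10, 2022Updated 3 years ago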
- Whisper to Normal Speech Conversion with SC-MelGAN and SC-VQ-VAE☆15Dec 3, 2022Updated 3 years ago
- Benchmarking LLM Inference Speeds☆13Apr 7, 2026Updated 3 weeks ago
- Cog wrapper for playgroundai/playground-v2.5-1024px-aesthetic☆17Nov 25, 2024Updated last year
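- ObjectZRX (VC) development fundamentals and worked-examples tutorial (code), including learning steps and a plan for ZWCAD secondary development, source code, and Chinese-language development help documentation. The tutorial can be studied alongside Zhang Fan's book 《AutoCAD ObjectARX(VC)开发基础与实例教程》. The sample code is based on ZWCAD2020, objectzrx20…☆13Oct 20, 2020Updated 5 years ago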
- I don't want to maintain this project, the code probably won't compile or run. Archived.☆13Feb 25, 2024Updated 2 years ago
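- nuxt3 + three.js 3D map template for large-screen data visualization☆24Jul 19, 2024Updated last year
- Front-end performance monitoring tool☆11Aug 19, 2020Updated 5 years ago
- A blog crawler written in Go that periodically fetches and updates article information from xargin.com. It stores each article's title, publication time, reading time, and URL in a Markdown file and updates it hourly via GitHub Actions.☆11Nov 27, 2024Updated last year
- Python - 100 Days from Novice to Master☆10Oct 14, 2021Updated 4 years ago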