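A large language model evaluation platform that supports multiple evaluation benchmarks, custom datasets, and performance testing. It also supports RAG evaluation on custom datasets.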
☆83Aug 20, 2025Updated 7 months ago
Alternatives and similar repositories for llm-eval
Users that are interested in llm-eval are comparing it to the libraries listed below
- A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.☆2,535Updated this week
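- Frontend project for fufan-chat-api☆27Nov 1, 2024Updated last year
- A cool, user-friendly performance load-testing platform based on python3+vue3+locust+grafana☆13Apr 22, 2024Updated last year
- Cybersecurity classified protection (等保) assessment documents☆12Dec 18, 2018Updated 7 years ago
- Lingmao (灵猫) Intelligent Management Platform is an online web platform for managing test projects and test tools. Through its fast, agile flexibility, it provides project management, test case management, module management, UI automation test management, utility tools, and other system testing functions☆11Jun 21, 2021Updated 4 years ago
- An open-source ticketing system that combines ticket statistics, task hooks, permission management, and flexibly configurable workflows and templates; it can also be described as a workflow engine. It aims to reduce cross-department communication overhead, automate task execution, improve work efficiency and quality, and cut unnecessary workload and human error.☆12Apr 4, 2022Updated 3 years ago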
- A helm chart for deploying Neoload Web on your Kubernetes cluster☆13Updated this week
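- An enterprise API gateway for large models: an internal API management, distribution, and aggregation system that converts multiple large models into a unified OpenAI-compatible interface, with special support for the Chinese open-source models deepseek, qwen, kimi, and glm. Suitable for individuals or enterprises for unified large-model API management and channel distribution (key management and redistribution). Updated over the long term, …☆40Sep 12, 2025Updated 6 months ago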
- Web Based Iperf Result Real-time Visualization☆15Apr 26, 2019Updated 6 years ago
- This project is a deliberately vulnerable environment to learn about LLM-specific risks based on the OWASP Top 10 for LLM Applications.☆52Jan 19, 2026Updated 2 months ago
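- This work proposes a safety evaluation benchmark for Chinese LLMs based on "文心一言" (ERNIE Bot), covering 8 typical safety scenarios and 6 types of instruction attacks. It also proposes a safety evaluation framework and process that uses test prompts written by hand and collected from open-source data, combining human intervention with the strong evaluation capability of LLMs as "co-evaluators".☆33Sep 1, 2023Updated 2 years ago
- An online load-testing platform built on JMeter, with some custom features added on top of the original version; developed on the basis of the open-source project zyanycall/stressTestPlatform.☆15Dec 17, 2021Updated 4 years ago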
- A batteries-included monorepo framework for building sophisticated LangGraph applications☆44Oct 24, 2025Updated 4 months ago
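- Supports multiple producers and multiple consumers with a lock-free queue☆18Jan 11, 2021Updated 5 years ago
- Open-source threat intelligence bot for WeChat☆13Mar 13, 2023Updated 3 years ago
- Parses ngrinder CSV results directly and computes the TPS standard deviation, TPS fluctuation rate, min/max RT, and RT 25/50/75/80/85/90/95/99th percentiles. Displaying these directly on the ngrinder detail page requires further development; see:☆19Feb 16, 2016Updated 10 years ago
- This repository demonstrates an application built with Java 21 + SpringBoot 3 + MyBatis, including CRUD operations, authentication, rout…☆12Dec 1, 2024Updated last year
- 🔥Sakura Automation Platform🔥 is a one-stop continuous automation platform covering app automation, web automation, API automation, and performance automation. It supports distributed testing and is fully compatible with mainstream open-source frameworks such as Appium, Selenium, Rest Assured, and JMeter, effectively helping…☆27Mar 11, 2025Updated last year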
- ☆10May 25, 2015Updated 10 years ago
- Performance testing for LLM inference services☆44Dec 17, 2023Updated 2 years ago
- A sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆11Apr 11, 2024Updated last year
- Add watermark to PDF and Office files☆16Jul 22, 2017Updated 8 years ago
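- Codes is a simple, easy-to-use one-stop R&D management platform: free to use, local installation, R&D management, test management, digital dashboards, CI/CD, API testing, defect management, DevTestOps☆29Jun 19, 2023Updated 2 years ago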
- ☆12May 22, 2018Updated 7 years ago
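- [Load-testing engine] A simple, easy-to-use performance testing platform with separate frontend and backend; supports JMeter distributed load testing, logs, reports, and more☆22Apr 15, 2025Updated 11 months ago
- A Zaker-style flowing image view: the image scrolls slowly in the horizontal direction, as used for Zaker's main screen background.☆21Jan 27, 2014Updated 12 years ago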
- Easy Watermark is a simple and easy-to-use watermarking framework that adds watermarks to different types of files using the same method.☆15Oct 12, 2024Updated last year
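- The OpenHIS hospital system (Xinchuang (信创) edition) integrates ten core modules, covering catalog management, basic data configuration, personalized settings, end-to-end outpatient/inpatient management, intelligent pharmacy and drug warehouse control, fine-grained consumables management, financial accounting, medical insurance compliance integration, and multi-dimensional report analysis, for a total of 372 standardized functions.☆15Feb 5, 2026Updated last month
- An AI writing tool: two agents collaborate to produce genuinely usable, richly illustrated posts (WeChat Official Accounts, Xiaohongshu, blogs). 1. A writing agent; 2. a knowledge-base agent.☆21Jun 8, 2025Updated 9 months ago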
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- Converts Swagger files to contracts for Spring Cloud Contract☆26Jul 31, 2020Updated 5 years ago
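- LTX-Video-Trainer-GUI is a GUI tool for training LTX video LoRA models, supporting LoRA training for video generation through a simple interface. The trainer provides an intuitive GUI that lets users easily set up and launch training without writing complex code.☆13Jul 18, 2025Updated 8 months ago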
- In this programming assignment you will implement a streaming video server and client that communicate control commands via the Real-Time…☆11Dec 29, 2012Updated 13 years ago
- [npj Digital Medicine] An In-Depth Evaluation of Federated Learning on Biomedical Natural Language Processing for Information Extraction☆12May 1, 2024Updated last year
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing☆11Apr 1, 2020Updated 5 years ago
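- A tool platform for comparing the performance of different large language models (LLMs), supporting the DeepSeek API, Ollama local models, and VLLM local models. A simple tool to test multiple models and display the time cost.☆28May 7, 2025Updated 10 months ago
- Performance testing platform☆23Oct 25, 2018Updated 7 years ago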
- ☆13Mar 16, 2025Updated last year
- inference on tvm runtime using c++ with gpu enabled☆10Apr 25, 2018Updated 7 years ago