llm2014 / llm_benchmark
☆773Updated this week
Alternatives and similar repositories for llm_benchmark
Users that are interested in llm_benchmark are comparing it to the libraries listed below
- LLM Arena by KCORES team☆960Updated 9 months ago
- All in one vscode plugin for mcp developer☆720Updated last month
- ☆857Updated 3 months ago
- ☆919Updated 2 months ago
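- [Item-by-item processing complete] QA dataset of selected questions from 弱智吧 (Ruozhiba), with every entry manually reviewed and revised☆243Updated 10 months ago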
- An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to infer…☆798Updated 10 months ago
- ☆745Updated 2 years ago
- PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/Docker☆2,245Updated 3 weeks ago
- OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards☆381Updated this week
- Cool Papers - Immersive Paper Discovery☆701Updated 5 months ago
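- Extracts the arXiv paper translation feature of gpt_academic into a standalone tool and integrates BabelOCR; supports local PDF translation and raises the translation success rate to 95%+☆140Updated 2 months ago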
- The most comprehensive roundup on the web: the 200 bloggers and first-hand information sources most worth following in AI in 2025☆207Updated last year
- website☆462Updated 11 months ago
- ☆1,212Updated 7 months ago
- GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.☆703Updated last year
- Yunjue Agent: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks☆291Updated last week
- Interpretation, extension, and reproduction of the DeepSeek series of work.☆700Updated 10 months ago
- ☆1,044Updated last year
- 《AI Quant Trading - From Zero to One》☆206Updated last week
- A Hugging Face mirror site.☆326Updated last year
- A minimal yet professional single agent demo project that showcases the core execution pipeline and production-grade features of agents.☆1,463Updated 3 weeks ago
- TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles.☆162Updated last year
- ☆814Updated 8 months ago
- One-click proxy script for the Antigravity Agent, with support for WSL and remote SSH☆415Updated this week
- [EMNLP 2025 Oral] MemoryOS is designed to provide a memory operating system for personalized AI agents.☆1,078Updated 4 months ago
- Converts Zhihu column articles to Markdown files and saves them locally☆558Updated 10 months ago
- Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.7 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek …☆2,785Updated 4 months ago
- The official repository of the dots.llm1 base and instruct models proposed by rednote-hilab.☆488Updated 5 months ago
- A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.☆2,384Updated this week
- ☆1,300Updated last week