A simple repository that showcases a few LLM evaluation strategies and uses W&B Sweeps to optimize the LLM system.
☆12Jul 11, 2023Updated 2 years ago
Alternatives and similar repositories for llm-eval-sweep
Users who are interested in llm-eval-sweep compare it to the repositories listed below
- Academic Writing Analytics☆10Jan 5, 2026Updated 2 months ago
- Demonstrate using MCP with Pydantic AI framework☆14Mar 14, 2025Updated 11 months ago
- Detect-Then-Explain Framework for Text-to-SQL task☆10Dec 6, 2023Updated 2 years ago
- ☆10Nov 7, 2022Updated 3 years ago
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight)☆23Feb 11, 2026Updated 3 weeks ago
- LaTeX template for graduate course papers at Huazhong University of Science and Technology☆11Aug 5, 2022Updated 3 years ago
- ☆11May 18, 2022Updated 3 years ago
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence"☆19Jun 2, 2025Updated 9 months ago
- User Management Application built with Spring Boot, Thymeleaf & MySQL Database☆12Dec 20, 2024Updated last year
- Scraping Douban Books with Scrapy☆10Aug 19, 2016Updated 9 years ago
- Estimates fatigue loads in wind turbines from SCADA data based on supervised learning.☆10Sep 11, 2018Updated 7 years ago
- ☆10May 14, 2020Updated 5 years ago
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis"☆26Nov 21, 2025Updated 3 months ago
- Pipeline for employing Lightweight deep learning models for LOW-power systems☆11Jan 9, 2023Updated 3 years ago
- Javascript based component for highlighting text-mined annotations of different semantic types in a full text article identified by a PMC…☆11Nov 29, 2016Updated 9 years ago
- Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind".☆15Apr 27, 2023Updated 2 years ago
- incremental symbol learning for natural language understanding☆10Jun 12, 2023Updated 2 years ago
- Hierarchical reinforcement learning framework which uses a directed graph to define the hierarchy.☆15Aug 5, 2022Updated 3 years ago
- Code for the AACL 2022 Paper "This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Cli…☆12Nov 18, 2022Updated 3 years ago
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders.☆14Jun 7, 2023Updated 2 years ago
- Official repository for "DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation (ACL2023 Findings)"☆11May 23, 2023Updated 2 years ago
- ☆14Jan 6, 2025Updated last year
- ☆11Jun 21, 2025Updated 8 months ago
- ☆12Oct 3, 2023Updated 2 years ago
- Official repository for Fourier model that can generate periodic signals☆10Mar 10, 2022Updated 3 years ago
- Multi-hop Evidence Retrieval for Cross-document Relation Extraction☆11Sep 1, 2023Updated 2 years ago
- DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery☆20Sep 24, 2025Updated 5 months ago
- Official source code for Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models☆12Dec 5, 2024Updated last year
- A Python Natural Language Processing Toolkit for Electronic Health Record Texts☆13May 24, 2023Updated 2 years ago
- Implementation of "Face detection in untrained deep neural networks" (Baek et al., Nature Communications, 2021)☆10Nov 2, 2021Updated 4 years ago
- ☆25Oct 13, 2025Updated 4 months ago
- Reinforcement Learning Robot avoiding obstacles (Python + V_rep)☆12Oct 29, 2019Updated 6 years ago
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing☆52Oct 24, 2024Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆12Nov 6, 2023Updated 2 years ago
- https://deep-learning-101.github.io/Natural-Language-Processing Natural Language Processing☆14Updated this week
- Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning; releases the dataset and the model weights☆13May 26, 2025Updated 9 months ago
- The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge…☆11Jul 28, 2025Updated 7 months ago
- The official source code for TaleBrush (CHI 2022)☆15Jul 13, 2022Updated 3 years ago
- Basic OpenAI chatbot on a Neo4j knowledge graph☆12Oct 4, 2023Updated 2 years ago