medaks/medask-benchmarks

A novel approach to evaluating AI agents on diagnostic accuracy in symptom checking tasks.

☆27

Alternatives and similar repositories for medask-benchmarks

Users who are interested in medask-benchmarks are comparing it to the repositories listed below.

lisjin / hct
Hierarchical Context Tagger for utterance rewriting
☆13 · Mar 27, 2022 · Updated 4 years ago
XMUDeepLIT / IRSEG
Code for "Improving Graph-based Sentence Ordering with Iteratively Predicted Pairwise Orderings" (EMNLP 2021)
☆15 · Sep 12, 2021 · Updated 4 years ago
XMUDeepLIT / MNMT
Code for "Multi-Modal Neural Machine Translation with Deep Semantic Interactions" (Information Sciences)
☆16 · May 21, 2021 · Updated 5 years ago
autollama / autollama
Anthropic's Contextual Retrieval implementation with visual chunk comparison. Preview context enrichment before/after embedding.
☆30 · Sep 25, 2025 · Updated 10 months ago
maxsloef / loom-mcp
A TypeScript-based MCP server that implements a simple loom and makes it available for Claude to use.
☆24 · Feb 17, 2026 · Updated 5 months ago
actava-ai / Cura
actAVA Cura: Specialized Model for Agentic Healthcare
☆22 · Updated this week
apple / visatronic-demo
Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis
☆15 · May 28, 2025 · Updated last year
GoogleChromeLabs / dictation_support
An SDK that allows web-based apps and pages to interact with dictation devices
☆19 · Jun 22, 2026 · Updated last month
PasiKoodaa / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆32 · May 1, 2025 · Updated last year
BA-Transform / BAT-Image-Classification
Official implementation of the CVPR 2020 paper "Non-Local Neural Networks With Grouped Bilinear Attentional Transforms".
☆13 · Jan 30, 2021 · Updated 5 years ago
LogicJake / 2020-Xiamen-International-Bank-Financial-Cup
Merit Award solution for the 2020 Xiamen International Bank Digital Innovation Financial Cup modeling competition
☆11 · Feb 2, 2021 · Updated 5 years ago
vincentamato / mlx-coconut
An MLX port of Meta's Coconut reasoning model
☆16 · Sep 2, 2025 · Updated 10 months ago
thad0ctor / KrunchWrapper
☆18 · Jul 1, 2025 · Updated last year
Goekdeniz-Guelmez / mlx-embeddings-lora
Train Embedding Models on MLX.
☆17 · Jun 2, 2026 · Updated last month
ProofAgent-ai / proofagent-harness
Open-source test harness for AI agents. Stress-test production agents with adversarial multi-turn scenarios in CI.
☆19 · Updated this week
AMA-Bench / AMA-Bench
[ICML 26] An evaluation framework assessing long-context retention and long-horizon memory performance for agentic applications (AMA-benc…
☆63 · Jun 15, 2026 · Updated last month
thelostwind / electron-clock
A simple electron clock applet
☆12 · Oct 15, 2019 · Updated 6 years ago
eliahuhorwitz / ProbeX
Official PyTorch implementation of the "Learning on Model Weights using Tree Experts" paper (CVPR 2025).
☆16 · Feb 11, 2026 · Updated 5 months ago
edebrouwer / cfqp
Deep Counterfactual Prediction with Categorical Backward Variables
☆12 · Feb 8, 2023 · Updated 3 years ago
beetree / ARC-AGI
☆78 · May 31, 2026 · Updated last month
pykeen / ranking-metrics-manuscript
📐 Results for the Ranking Metrics submission @ GLB 2022
☆10 · Apr 6, 2022 · Updated 4 years ago
SalesforceAIResearch / UserRL
The raw UserRL repo under construction
☆113 · Jun 2, 2026 · Updated last month
Open-Cap-Table-Coalition / OCF-Tools
xState-based validation tool for OCF files
☆15 · Jul 10, 2026 · Updated 2 weeks ago
goombalab / Gather-and-Aggregate
Experiment notebooks for "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16 · Apr 30, 2025 · Updated last year
huaxiuyao / KGML
KGML for EMNLP 2021
☆10 · Feb 2, 2022 · Updated 4 years ago
leaderj1001 / Bag-of-MLP
Bag of MLP
☆20 · May 31, 2021 · Updated 5 years ago
VincenDen / IID
Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (CVPR24)
☆10 · Jun 16, 2024 · Updated 2 years ago
Peter-obi / Kolmogorov-Arnold-Networks-KANs-mlx
☆15 · May 17, 2024 · Updated 2 years ago
zoeyuchao / MPE-pytorch
MPE-pytorch, with some bugs fixed.
☆11 · Apr 26, 2020 · Updated 6 years ago
issacchan26 / AntiMoneyLaunderingDetectionWithGNN
Anti Money Laundering Detection using Graph Attention Network
☆70 · Oct 18, 2023 · Updated 2 years ago
chenjianhuii / Mechanistic-Data-Attribution
☆16 · May 25, 2026 · Updated 2 months ago
sc420 / pygame-rl
Game environment for reinforcement learning using Pygame
☆10 · Jul 15, 2019 · Updated 7 years ago
graves / dirdocs
Recursively generate a description of every file in a directory, then append those descriptions to the output of Nushell's ls.
☆16 · Oct 7, 2025 · Updated 9 months ago
js-lan / competition_codes
☆10 · Feb 23, 2021 · Updated 5 years ago
allbilly / ane
Run ops on Apple ANE in NPU register with pure python on M1 Asahi Linux. No Espresso, No CoreML, no metal, no .mlmodels file, no .hwx fil…
☆16 · Jun 28, 2026 · Updated 3 weeks ago
fjzzq2002 / WeightWatch
Official repository of the paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 10 months ago
AI-secure / CoPur
CoPur: Certifiably Robust Collaborative Inference via Feature Purification (NeurIPS 2022)
☆11 · Dec 7, 2022 · Updated 3 years ago
LARS-research / ERAS
Code for "Efficient Relation-aware Scoring Function Search for Knowledge Graph Embedding" (ICDE 2021)
☆11 · Apr 26, 2021 · Updated 5 years ago
berczig / FlowBoost
☆15 · Mar 2, 2026 · Updated 4 months ago