The evaluation framework for the InfiCoder-Eval benchmark.
☆21Jul 22, 2024Updated last year
Alternatives and similar repositories for infibench-evaluator
Users who are interested in infibench-evaluator are comparing it to the libraries listed below.
- LLM benchmarks☆13Feb 22, 2024Updated 2 years ago
- [ICSE 2023] Differentiable interpretation and failure-inducing input generation for neural network numerical bugs.☆13Jan 5, 2024Updated 2 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- ☆16Nov 26, 2024Updated last year
- Framework for Cost-Effective Language Model Choice☆17Dec 12, 2023Updated 2 years ago
- This repository demonstrates an application built with Java 21 + SpringBoot 3 + MyBatis, including CRUD operations, authentication, rout…☆12Dec 1, 2024Updated last year
- ☆25Aug 23, 2024Updated last year
- Heart Steps 2.0 Application☆10Sep 1, 2023Updated 2 years ago
- ☆33Jun 24, 2024Updated last year
- ☆13Updated this week
- 🚀 Vibe Stack - Docker setup for AI-powered coding with Vibe-Kanban + Claude Code | Secure secrets, browser-based VS Code, ready to deplo…☆42Feb 10, 2026Updated 3 weeks ago
- Code for "Automated and Intelligent Synthesis of Oxygen-Producing Catalysts from Martian Meteorites by Robotic AI-Chemist"☆12Jul 31, 2023Updated 2 years ago
- Realtime Fall Detection and Human Activity Recognition using Multilayer Perceptron Neural Network from gyroscope and accelerometer sensor…☆12Jul 11, 2021Updated 4 years ago
- Official implementation of a temporal pupil light response model proposed in the Scientific Reports article: "Deep learning-based pupil m…☆11Jan 6, 2023Updated 3 years ago
- ☆12Jan 11, 2026Updated last month
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- Repository for notes and data from machine learning studies☆11Dec 16, 2019Updated 6 years ago
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- Dify knowledge base retrieval tool☆13Apr 3, 2025Updated 11 months ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- Extracts static code features from OpenCL kernels to be used for machine learning.☆10Apr 30, 2021Updated 4 years ago
- The implementation of the ACL 2024 paper "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization"☆43Jun 15, 2024Updated last year
- Multimodal data loader compatible with PyTorch and TensorFlow☆12Aug 14, 2024Updated last year
- An experiment on EEG classification using a CNN-LSTM network☆12Jan 14, 2017Updated 9 years ago
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated last year
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- An assistant that will help you better visualize the presentation of mined data (information). It deals with all the health care issues w…☆11Feb 1, 2018Updated 8 years ago
- ☆18Jul 3, 2025Updated 8 months ago
- ☆10Apr 8, 2018Updated 7 years ago
- ☆12Mar 25, 2024Updated last year
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- ☆17Apr 17, 2021Updated 4 years ago
- ☆11Jan 28, 2024Updated 2 years ago
- Deadline countdowns for academic conferences relevant to the SSE chair.☆12Feb 10, 2026Updated 3 weeks ago
- Survey of available speech datasets for Polish ASR development☆17Jan 1, 2025Updated last year
- ☆14Feb 23, 2021Updated 5 years ago
- Scrape and export data from the Open LLM Leaderboard.☆48Dec 17, 2024Updated last year
- This repository contains the replication package of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Rep…☆10Nov 16, 2023Updated 2 years ago
- LLM red teaming datasets from the paper 'Student-Teacher Prompting for Red Teaming to Improve Guardrails' for the ART of Safety Workshop …☆22Oct 12, 2023Updated 2 years ago