KL4805/ShoppingMMLU

Shopping MMLU: A Multi-Task Online Shopping Benchmark for LLMs.

☆47

Alternatives and similar repositories for ShoppingMMLU

Users who are interested in ShoppingMMLU are comparing it to the repositories listed below.

ninglab / eCeLLM
☆55 · Sep 19, 2025 · Updated 9 months ago
goodmike31 / pl-asr-speech-data-survey
Survey of available speech datasets for Polish ASR development
☆17 · Jan 1, 2025 · Updated last year
INTREBID / Awesome-MM-RAG
This repository is for our survey paper: "A Comprehensive Survey on Multimodal RAG: All Combinations of Modalities as Input and Output"
☆51 · Nov 21, 2025 · Updated 7 months ago
njmarko / llm-gpt-sort
A new type of sorting algorithm. Use large language model (llm like gpt, chat-gpt or others) to sort collections.
☆12 · Jun 7, 2023 · Updated 3 years ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
princeton-nlp / ELIZA-Transformer
[NAACL 2025] Representing Rule-based Chatbots with Transformers
☆23 · Feb 9, 2025 · Updated last year
mark-xhchen / Conditional-ECPE
Repo for 2020 EMNLP paper "Conditional Causal Relationships between Emotions and Causes in Texts"
☆14 · Apr 8, 2021 · Updated 5 years ago
jmnian / WRAG
Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering"
☆16 · Oct 2, 2025 · Updated 9 months ago
HangtingYe / UADB
☆13 · Mar 29, 2026 · Updated 3 months ago
YeonwooSung / MLOps
Miscellaneous codes and writings for MLOps
☆15 · Apr 8, 2026 · Updated 3 months ago
socialfoundations / benchbench
BenchBench is a Python package to evaluate multi-task benchmarks.
☆23 · Oct 12, 2025 · Updated 8 months ago
vdlad / Remarkable-Robustness-of-LLMs
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
☆20 · Jun 11, 2025 · Updated last year
gauss5930 / iDUS
An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago
yooli23 / LEGOEval
A toolkit for dialogue system evaluation via crowdsourcing
☆18 · Apr 25, 2023 · Updated 3 years ago
guosyjlu / OEMA
Official PyTorch code for "Sample Efficient Offline-to-Online Reinforcement Learning" in TKDE'23.
☆16 · Aug 14, 2023 · Updated 2 years ago
all-the-noises / eval-arena
☆34 · Mar 21, 2026 · Updated 3 months ago
Lifelong-ML / LASEM
Code for the ICML 2021 paper "Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer"
☆12 · Aug 17, 2021 · Updated 4 years ago
yanxue7 / E3T-Overcooked
☆15 · May 4, 2024 · Updated 2 years ago
ArnaudFickinger / adversarial-surprise
Explore and Control with Adversarial Surprise
☆10 · Jul 20, 2021 · Updated 4 years ago
aotakeda / ai-json-fixer
A simple JSON parser specifically designed to handle malformed JSON output from Large Language Models (LLMs) like GPT, Claude, and others…
☆27 · Jun 20, 2025 · Updated last year
Brand24-AI / mms_benchmark
The most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 manually selec…
☆16 · Nov 14, 2023 · Updated 2 years ago
robmsmt / SpeechLoop
Many ASRs under one roof. With Benchmarking... answering the question. What is the best ASR for my dataset?
☆19 · Oct 5, 2022 · Updated 3 years ago
liuqi6777 / llm4ranking
Large language models for document ranking.
☆75 · May 20, 2026 · Updated last month
yale-nlp / Physics
☆30 · Apr 1, 2025 · Updated last year
GAIR-NLP / AIME-Preview
☆84 · Mar 11, 2025 · Updated last year
PenroseWang / SimGPS
A repo for introduction of GNSS
☆13 · Jun 22, 2026 · Updated 2 weeks ago
timdavidlee / fastai_dl2019p2
Deep Learning Part 2, 2019 edition - transcriptions, screenshots and notebooks
☆11 · Jul 19, 2019 · Updated 6 years ago
ssu-humane / K-HATERS
Hate speech detection corpus in Korean, shared with EMNLP 2023 paper
☆17 · Apr 19, 2024 · Updated 2 years ago
wulingyun / CRF
R package for Conditional Random Fields
☆20 · Oct 22, 2025 · Updated 8 months ago
autonlab / aqua
AQuA: A Benchmarking Tool for Label Quality Assessment, NeurIPS'23 D&B
☆23 · Oct 17, 2023 · Updated 2 years ago
limanling / uiuc_ie_pipeline_fine_grained
A script to run fine-grained entity, relation and event extraction
☆24 · Nov 4, 2021 · Updated 4 years ago
karthikbhamidipati / multi-task-speech-classification
Multi-Task Speech classification of accent and gender of an English speaker on Mozilla's common voice dataset
☆28 · Jun 10, 2026 · Updated 3 weeks ago
pipecat-ai / stt-benchmark
Benchmarking STT service TTFB and semantic WER for real-time AI applications
☆84 · Jun 22, 2026 · Updated 2 weeks ago
sjlmg / CP-KGC
Can Text-based Knowledge Graph Completion Benefit From Zero-Shot Large Language Models?
☆18 · Dec 9, 2024 · Updated last year
dave-fernandes / SpeakerClassifier
A random forest classifier to predict the age-group and gender of a speaker from voice measurements.
☆18 · Apr 30, 2019 · Updated 7 years ago
ANRGUSC / covid19_risk_estimation
COVID-19 Risk Estimation for L.A. County using a Bayesian Time-varying SIR-model
☆12 · Feb 17, 2023 · Updated 3 years ago
yeyimilk / LLMGeo
LLMGeo: Benchmarking Large Language Models on Image Geolocation In-the-wild
☆16 · Oct 31, 2024 · Updated last year
naver-ai / KoNET
Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025.
☆27 · May 15, 2025 · Updated last year
facebookresearch / NeuralMemory
A Data Source for Reasoning Embodied Agents
☆20 · Sep 18, 2023 · Updated 2 years ago