Shopping MMLU: A Multi-Task Online Shopping Benchmark for LLMs.
☆47Nov 4, 2024Updated last year
Alternatives and similar repositories for ShoppingMMLU
Users who are interested in ShoppingMMLU are comparing it to the repositories listed below.
- Survey of available speech datasets for Polish ASR development☆17Jan 1, 2025Updated last year
- ☆13Jul 14, 2024Updated last year
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"☆12Mar 25, 2025Updated last year
- Library to extract text from HTML files☆11Dec 20, 2015Updated 10 years ago
- BenchBench is a Python package to evaluate multi-task benchmarks.☆23Oct 12, 2025Updated 8 months ago
- Reproduction of the paper "Continual Vision-Language Representation Learning with Off-Diagonal Information" (Mod-X)☆12Oct 31, 2023Updated 2 years ago
- 🎉 TrustJudge is accepted to ICLR 2026!☆47Sep 27, 2025Updated 8 months ago
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆20Jun 11, 2025Updated last year
- An unofficial implementation of the SOLAR-10.7B model, plus an implementation of the newly proposed interlocked-DUS (iDUS) with experiment details.☆14Mar 20, 2024Updated 2 years ago
- Code implementation for paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals".☆17Dec 15, 2021Updated 4 years ago
- Code and data of "Controllable Unsupervised Event-based Video Generation" (accepted as ICIP oral and invited by WACV workshop)☆19Nov 5, 2024Updated last year
- Official PyTorch code for "Sample Efficient Offline-to-Online Reinforcement Learning" in TKDE'23.☆16Aug 14, 2023Updated 2 years ago
- Explore and Control with Adversarial Surprise☆10Jul 20, 2021Updated 4 years ago
- A simple JSON parser specifically designed to handle malformed JSON output from Large Language Models (LLMs) like GPT, Claude, and others…☆27Jun 20, 2025Updated 11 months ago
- arXiv 2024 | ZIP: entropy-law data selection for efficient LLM alignment.☆28Jun 10, 2026Updated last week
- The most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 manually selec…☆16Nov 14, 2023Updated 2 years ago
- Many ASRs under one roof. With Benchmarking... answering the question. What is the best ASR for my dataset?☆19Oct 5, 2022Updated 3 years ago
- ☆30Apr 1, 2025Updated last year
- ☆84Mar 11, 2025Updated last year
- Large language models for document ranking.☆75May 20, 2026Updated 3 weeks ago
- A baseline Automatic Speech Recognition system for Polish based on Kaldi.☆18Dec 21, 2021Updated 4 years ago
- Deep Learning Part 2, 2019 edition - transcriptions, screenshots and notebooks☆11Jul 19, 2019Updated 6 years ago
- Hate speech detection corpus in Korean, shared with EMNLP 2023 paper☆17Apr 19, 2024Updated 2 years ago
- CVPR2021☆12Mar 29, 2021Updated 5 years ago
- R package for Conditional Random Fields☆20Oct 22, 2025Updated 7 months ago
- Benchmarking STT service TTFB and semantic WER for real-time AI applications☆77Updated this week
- Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset☆28Jun 10, 2026Updated last week
- Official code repository for the ICLR 2022 paper "You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction".☆14Jul 25, 2024Updated last year
- A random forest classifier to predict the age-group and gender of a speaker from voice measurements.☆18Apr 30, 2019Updated 7 years ago
- COVID-19 Risk Estimation for L.A. County using a Bayesian Time-varying SIR-model☆12Feb 17, 2023Updated 3 years ago
- Data Benchmarking☆25May 24, 2024Updated 2 years ago
- Evaluating Multimodal Generative AI with Korean Educational Standards, NAACL 2025.☆26May 15, 2025Updated last year
- A dataset for Vietnamese Spelling Correction☆17Sep 27, 2021Updated 4 years ago
- [NAACL 2024] CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions☆13May 7, 2024Updated 2 years ago
- This repository has been redirected to https://kuaisar.github.io/.☆11Oct 12, 2023Updated 2 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary☆12Aug 10, 2023Updated 2 years ago
- [EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners☆27Dec 11, 2024Updated last year
- ELECTRA MODEL NLP☆13Apr 8, 2020Updated 6 years ago
- How to really install tensorflow-gpu from source on a clean instance of Ubuntu☆11Sep 29, 2023Updated 2 years ago