URS Benchmark: Evaluating LLMs on User Reported Scenarios
☆30May 30, 2025Updated 9 months ago
Alternatives and similar repositories for URS
Users that are interested in URS are comparing it to the libraries listed below.
- A high-throughput and memory-efficient inference and serving engine for LLMs☆17Jun 3, 2024Updated last year
- Code of LeCoRE☆13Feb 15, 2023Updated 3 years ago
- code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations☆11Oct 13, 2023Updated 2 years ago
- ☆32Sep 28, 2025Updated 5 months ago
- Automated testing and benchmarking for code generation agents.☆18Jun 27, 2023Updated 2 years ago
- ☆16Mar 3, 2024Updated 2 years ago
- Code to compute AnthroScore, a computational linguistic measure of anthropomorphism in text☆18Mar 31, 2025Updated 11 months ago
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f…☆10May 30, 2023Updated 2 years ago
- Official Repository for "BlendX: Complex Multi-intent Detection with Blended Patterns"☆27Jan 16, 2026Updated 2 months ago
- ☆11Sep 19, 2025Updated 6 months ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder.☆14May 2, 2024Updated last year
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- Implementation of self-certainty as an extension of the ZeroEval project☆36May 31, 2025Updated 9 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆92Oct 30, 2024Updated last year
- Code and data for Marked Personas (ACL 2023)☆28May 26, 2023Updated 2 years ago
- Query Performance Prediction for Conversational Search (QPP4CS)☆29May 22, 2024Updated last year
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆28May 12, 2023Updated 2 years ago
- Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING…☆17Apr 15, 2025Updated 11 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]☆50Jan 24, 2025Updated last year
- Official implementation of "Disentangled Knowledge Transfer for OOD Intent Discovery with Unified Contrastive Learning", ACL2022 main con…☆14Jul 23, 2022Updated 3 years ago
- Some example codes for drawing figures in research paper☆35Mar 3, 2022Updated 4 years ago
- OneFlow Serving☆20Apr 10, 2025Updated 11 months ago
- Translation of the StrategyQA dataset☆23Apr 12, 2024Updated last year
- Official code and dataset repository of KoBBQ (TACL 2024)☆19May 13, 2024Updated last year
- ☆15Dec 3, 2024Updated last year
- This is a project using neural-network reinforcement learning to solve the 8 puzzle problem (or even N puzzle)☆11Mar 24, 2018Updated 7 years ago
- KLUE Benchmark 1st place (2021.12) solutions. (RE, MRC, NLI, STS, TC)☆25Apr 11, 2022Updated 3 years ago
- [SIGIR'25] Code of "Generative Recommender with End-to-End Learnable Item Tokenization".☆24Apr 17, 2025Updated 11 months ago
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability☆21Jul 16, 2022Updated 3 years ago
- Code for the ACL 2022 (Long paper): "New Intent Discovery with Pre-training and Contrastive Learning".☆14Jul 18, 2022Updated 3 years ago
- 🔍 Awesome Agentic Search is a curated list of papers, tools, and resources on agentic search—where AI agents plan, search, and reason to…☆55Aug 28, 2025Updated 6 months ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: …☆90Nov 23, 2025Updated 4 months ago
- BERT finetuned on NER downstream tasks☆15Jun 12, 2023Updated 2 years ago
- ☆42Feb 2, 2024Updated 2 years ago
- ☆36Oct 4, 2023Updated 2 years ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Oct 3, 2024Updated last year
- Official Implementation of "A Hybrid Architecture for Out of Domain Intent Detection and Intent Discovery"☆11May 31, 2023Updated 2 years ago
- Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables☆21May 18, 2025Updated 10 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago