URS Benchmark: Evaluating LLMs on User Reported Scenarios
☆31May 30, 2025Updated 11 months ago
Alternatives and similar repositories for URS
Users that are interested in URS are comparing it to the libraries listed below.
- ☆12Nov 22, 2024Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆17Jun 3, 2024Updated last year
- NUIX-Studio App helps developers to create devices for VR-IoT environment☆23Jan 9, 2023Updated 3 years ago
- Automated testing and benchmarking for code generation agents.☆18Jun 27, 2023Updated 2 years ago
- An extended project of the LLM Compiler paper, focusing on developing LLM-based Autonomous Agents.☆26Oct 22, 2024Updated last year
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f…☆10May 30, 2023Updated 2 years ago
- ☆11Sep 19, 2025Updated 7 months ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder.☆14May 2, 2024Updated 2 years ago
- Neural Unification for Logic Reasoning over Language☆22Nov 15, 2021Updated 4 years ago
- Blog post☆17Feb 16, 2024Updated 2 years ago
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- Implementation of self-certainty as an extension of the ZeroEval project☆36May 31, 2025Updated 11 months ago
- Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge☆10Aug 8, 2023Updated 2 years ago
- code for paper Sparse Structure Search for Delta Tuning☆11Oct 16, 2022Updated 3 years ago
- Code and data for Marked Personas (ACL 2023)☆30May 26, 2023Updated 2 years ago
- Automatically Generated d2l-zh TensorFlow Notebooks for Colab☆13Aug 18, 2023Updated 2 years ago
- Query Performance Prediction for Conversational Search (QPP4CS)☆31May 22, 2024Updated last year
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆28May 12, 2023Updated 2 years ago
- Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING…☆17Apr 15, 2025Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Sep 26, 2023Updated 2 years ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]☆50Jan 24, 2025Updated last year
- ☆22Dec 15, 2023Updated 2 years ago
- Learning and building APIs using FastAPI☆16Aug 7, 2021Updated 4 years ago
- Official implementation of "Disentangled Knowledge Transfer for OOD Intent Discovery with Unified Contrastive Learning", ACL2022 main con…☆14Jul 23, 2022Updated 3 years ago
- Example code for drawing figures in research papers☆35Mar 3, 2022Updated 4 years ago
- ☆13Feb 11, 2021Updated 5 years ago
- Translation of the StrategyQA dataset☆22Apr 12, 2024Updated 2 years ago
- ☆12Feb 25, 2024Updated 2 years ago
- This is a project using neural-network reinforcement learning to solve the 8 puzzle problem (or even N puzzle)☆11Mar 24, 2018Updated 8 years ago
- KLUE Benchmark 1st place (2021.12) solutions. (RE, MRC, NLI, STS, TC)☆25Apr 11, 2022Updated 4 years ago
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability☆21Jul 16, 2022Updated 3 years ago
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)☆11Aug 24, 2024Updated last year
- Code for the ACL 2022 (Long paper): "New Intent Discovery with Pre-training and Contrastive Learning".☆14Jul 18, 2022Updated 3 years ago
- ☆36Oct 4, 2023Updated 2 years ago
- BERT finetuned on NER downstream tasks☆15Jun 12, 2023Updated 2 years ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (https://huggingface.co/papers…☆91Nov 23, 2025Updated 5 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Oct 3, 2024Updated last year
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…☆11Oct 18, 2022Updated 3 years ago
- Dataset Release for Explanations for CommonsenseQA, ACL 2021 Paper☆20Jul 30, 2021Updated 4 years ago