URS Benchmark: Evaluating LLMs on User Reported Scenarios
☆31May 30, 2025Updated last year
Alternatives and similar repositories for URS
Users that are interested in URS are comparing it to the libraries listed below.
- A high-throughput and memory-efficient inference and serving engine for LLMs☆17Jun 3, 2024Updated 2 years ago
- Code of LeCoRE☆13Feb 15, 2023Updated 3 years ago
- code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations☆11Oct 13, 2023Updated 2 years ago
- NUIX-Studio App helps developers to create devices for VR-IoT environment☆23Jan 9, 2023Updated 3 years ago
- This is our implementation of IntEL: Intent-aware Ranking Ensemble for Personalized Recommendation (SIGIR 2023)☆24Nov 17, 2023Updated 2 years ago
- ☆16Mar 3, 2024Updated 2 years ago
- Code to compute AnthroScore, a computational linguistic measure of anthropomorphism in text☆19Mar 31, 2025Updated last year
- An extended project of the LLM Compiler paper, focusing on developing LLM-based Autonomous Agents.☆26Oct 22, 2024Updated last year
- ☆43Sep 28, 2025Updated 8 months ago
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f…☆10May 30, 2023Updated 3 years ago
- Official Repository for "BlendX: Complex Multi-intent Detection with Blended Patterns"☆27Apr 27, 2026Updated last month
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder.☆14May 2, 2024Updated 2 years ago
- Neural Unification for Logic Reasoning over Language☆22Nov 15, 2021Updated 4 years ago
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- Implementation of self-certainty as an extension of the ZeroEval project☆36May 31, 2025Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆92Oct 30, 2024Updated last year
- Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge☆10Aug 8, 2023Updated 2 years ago
- Code and data for Marked Personas (ACL 2023)☆30May 26, 2023Updated 3 years ago
- Automatically Generated d2l-zh TensorFlow Notebooks for Colab☆13Aug 18, 2023Updated 2 years ago
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆28May 12, 2023Updated 3 years ago
- Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING…☆18Apr 15, 2025Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Sep 26, 2023Updated 2 years ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]☆51Jan 24, 2025Updated last year
- ☆22Dec 15, 2023Updated 2 years ago
- Example code for drawing figures in research papers☆35Mar 3, 2022Updated 4 years ago
- Repository for "Rescan: Inductive Instance Segmentation for Indoor RGBD Scans" (ICCV 2019)☆17Mar 12, 2020Updated 6 years ago
- OneFlow Serving☆20Apr 10, 2025Updated last year
- Translation of the StrategyQA dataset☆22Apr 12, 2024Updated 2 years ago
- ☆15Dec 3, 2024Updated last year
- ☆12Feb 25, 2024Updated 2 years ago
- This is a project using neural-network reinforcement learning to solve the 8-puzzle problem (or even the N-puzzle)☆12Mar 24, 2018Updated 8 years ago
- ☆14Mar 25, 2023Updated 3 years ago
- Tools for the evaluation of audio captioning.☆19May 23, 2020Updated 6 years ago
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability☆21Jul 16, 2022Updated 3 years ago
- Query Performance Prediction for Conversational Search (QPP4CS)☆112May 22, 2024Updated 2 years ago
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)☆11Aug 24, 2024Updated last year