shachardon / naturally_occurring_feedback
☆13 Updated 9 months ago
Alternatives and similar repositories for naturally_occurring_feedback
Users who are interested in naturally_occurring_feedback are comparing it to the repositories listed below
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆196 Updated this week
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ☆29 Updated 5 months ago
- A package dedicated for running benchmark agreement testing ☆16 Updated 3 weeks ago
- Top papers related to LLM-based agent evaluation ☆68 Updated 2 weeks ago
- ☆34 Updated 2 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆128 Updated last year
- This is the official repository for Inheritune. ☆111 Updated 3 months ago
- ☆11 Updated 2 weeks ago
- Synthetic Data Generation for Evaluation ☆14 Updated 3 months ago
- Repository for the ACL 2024 conference website ☆18 Updated 4 months ago
- Exploring Model Kinship for Merging Large Language Models ☆24 Updated last month
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆33 Updated 5 months ago
- Learning to route instances for Human vs AI Feedback (ACL 2025 Main) ☆23 Updated 3 weeks ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆89 Updated last week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- ☆24 Updated 4 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆106 Updated 3 months ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆105 Updated 8 months ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆86 Updated 8 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated last year
- codebase release for EMNLP2023 paper publication ☆19 Updated 3 weeks ago
- ☆17 Updated 2 months ago
- Code for Zero-Shot Tokenizer Transfer ☆128 Updated 4 months ago
- ☆57 Updated 8 months ago
- ☆49 Updated 7 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 8 months ago
- ☆45 Updated 2 weeks ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆66 Updated 2 years ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆87 Updated 6 months ago