This repository contains the resource introduced in the paper: "Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis". LLM-Oasis is a large-scale resource for end-to-end factuality evaluation obtained by extracting and falsifying information from Wikipedia.
★25 · Oct 15, 2025 · Updated 7 months ago
Alternatives and similar repositories for LLM-Oasis
Users that are interested in LLM-Oasis are comparing it to the libraries listed below.
- Word Sense Linking model is designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory. ★13 · Aug 23, 2024 · Updated last year
- A Word Level Transformer layer based on PyTorch and 🤗 Transformers. ★34 · Jan 31, 2024 · Updated 2 years ago
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAE's trained by Joseph Bloom o… ★19 · Oct 4, 2024 · Updated last year
- The official implementation of Cross-Task Experience Sharing (COPS) ★29 · Oct 23, 2024 · Updated last year
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le… ★14 · Jan 16, 2025 · Updated last year
- The official implementation for Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation. ★25 · Jan 30, 2024 · Updated 2 years ago
- PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNe… ★44 · May 20, 2024 · Updated 2 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ★87 · Apr 6, 2026 · Updated 2 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ★25 · Jul 12, 2024 · Updated last year
- Official code release for the paper Trapped in texture bias? A large scale comparison of deep instance segmentation, accepted at ECCV 202… ★16 · Jan 16, 2024 · Updated 2 years ago
- This is the repository for NAACL'25 paper "TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning" ★58 · May 3, 2025 · Updated last year
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ★27 · Feb 25, 2025 · Updated last year
- This is the official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation" ★57 · Feb 2, 2025 · Updated last year
- This repository hosts the dataset for the paper Computer Science Named Entity Recognition in the Open Research Knowledge Graph ★21 · Jan 8, 2024 · Updated 2 years ago
- Find informative examples to efficiently (human)-evaluate NLG models. ★17 · Apr 22, 2026 · Updated last month
- Official implementation of Inconsistency Masks. A robust semi-supervised segmentation framework that reframes model disagreement as a… ★19 · Jan 23, 2026 · Updated 4 months ago
- Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining" ★654 · Feb 24, 2025 · Updated last year
- Official Implementation of "Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning" ★28 · Dec 16, 2025 · Updated 5 months ago
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio… ★45 · Dec 6, 2025 · Updated 6 months ago
- [NeurIPS 2024] GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ★69 · Sep 6, 2024 · Updated last year
- CaBRNet - Case-Based Reasoning Networks made simple ★22 · May 6, 2026 · Updated last month
- Low-latency Space-time Supersampling for Real-time Rendering ★33 · Feb 1, 2024 · Updated 2 years ago
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings ★34 · Nov 12, 2024 · Updated last year
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ★41 · Sep 30, 2024 · Updated last year
- This is the official repository for OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data. (CoRL'23) ★112 · Nov 10, 2023 · Updated 2 years ago
- (NeurIPS 2023) CorresNeRF: Image Correspondence Priors for Neural Radiance Fields ★49 · Sep 4, 2024 · Updated last year
- Open-source Python toolkit focused on deep learning with ordinal methodologies ★70 · May 28, 2026 · Updated last week
- [3DV 2025] Learning Naturally Aggregated Appearance for Efficient 3D Editing ★33 · Feb 13, 2025 · Updated last year
- [ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient ★32 · Dec 8, 2023 · Updated 2 years ago
- LexEval: A Comprehensive Benchmark for Evaluating Large Language Models in Legal Domain ★98 · Oct 30, 2024 · Updated last year
- Statewide Visual Geolocalization in the Wild (ECCV 2024) ★75 · Dec 2, 2024 · Updated last year