amazon-science/tofueval

amazon-science / tofueval

☆32

Alternatives and similar repositories for tofueval

Users who are interested in tofueval are comparing it to the repositories listed below.

Seondong / LocEmb
LocEmb: Location Embedding (currently covering districts, roads, and businesses in Korea)
☆11 · Aug 15, 2022 · Updated 3 years ago
tingofurro / summac
Codebase, data and models for the SummaC paper in TACL
☆110 · Jan 30, 2025 · Updated last year
amazon-science / synthesizrr
Synthesizing realistic and diverse text datasets from augmented LLMs
☆19 · Apr 4, 2026 · Updated 3 months ago
s-nlp / mutual_implication_score
☆12 · May 18, 2022 · Updated 4 years ago
primeqa / clapnq
☆46 · Jan 21, 2025 · Updated last year
INK-USC / expl-refinement
Code for the paper "Refining Language Model with Compositional Explanation" (NeurIPS 2021)
☆11 · Oct 25, 2021 · Updated 4 years ago
kaist-dmlab / FL-Sim
☆20 · Feb 19, 2020 · Updated 6 years ago
mlampros / nmslibR
Non Metric Space (Approximate) Library in R
☆12 · Feb 2, 2023 · Updated 3 years ago
DISL-Lab / FineSurE-ACL24
The official repo of FineSurE (ACL 2024)
☆36 · Jul 8, 2024 · Updated 2 years ago
GChrysostomou / ood_faith
☆13 · Jul 26, 2023 · Updated 2 years ago
kaist-dmlab / revisit
A repository of customer revisit prediction research.
☆24 · Jan 8, 2019 · Updated 7 years ago
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers
☆139 · Mar 14, 2024 · Updated 2 years ago
Anjiang-Wei / CodeARC
☆27 · Sep 30, 2025 · Updated 9 months ago
dblock / vectordb-hello-world
Examples of vector DB indexing and querying with various vector databases.
☆13 · May 20, 2026 · Updated 2 months ago
jpsamaroo / FluxAMDGPU.jl
AMDGPU bindings for Flux
☆10 · Apr 6, 2021 · Updated 5 years ago
dqxiu / CaliNet
☆32 · Oct 17, 2022 · Updated 3 years ago
kite99520 / DialSummEval
Resources for the paper "DialSummEval: Revisiting summarization evaluation for dialogues"
☆14 · Jul 22, 2025 · Updated last year
anthonywchen / RARR
RARR: Researching and Revising What Language Models Say, Using Language Models
☆54 · Jun 22, 2023 · Updated 3 years ago
hasibi / EntityLinkingInQueries-ELQ
Entity Linking in Queries: Tasks and Evaluation
☆33 · Sep 13, 2023 · Updated 2 years ago
jvladika / HealthFC
HealthFC: Verifying Health Claims with Evidence-Based Medical Fact-Checking
☆14 · Apr 11, 2025 · Updated last year
Liyan06 / MiniCheck
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]
☆214 · Aug 27, 2025 · Updated 10 months ago
kaist-dmlab / k-Medoid
☆31 · Mar 18, 2017 · Updated 9 years ago
amazon-science / e2e-docie
☆17 · Apr 18, 2024 · Updated 2 years ago
microsoft / ConstrainedReasoner
☆13 · Aug 26, 2024 · Updated last year
microsoft / MSMARCO-Passage-Ranking-Submissions
Submission archive for the MS MARCO passage ranking leaderboard
☆13 · Apr 21, 2023 · Updated 3 years ago
nelson-liu / evaluating-verifiability-in-generative-search-engines
Companion repo for "Evaluating Verifiability in Generative Search Engines".
☆87 · May 12, 2023 · Updated 3 years ago
amallia / gpu-integers-compression
GPU-Accelerated Faster Decoding of Integer Lists
☆13 · Aug 20, 2019 · Updated 6 years ago
xiye17 / EvalQAExpl
Code for "Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals".
☆17 · Apr 25, 2021 · Updated 5 years ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
BunsenFeng / FactKB
Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023.
☆20 · Dec 25, 2023 · Updated 2 years ago
NoSyu / VHUCM
Implementation of Variational Hierarchical User-based Conversation Model
☆10 · Jul 2, 2021 · Updated 5 years ago
ziweiji / Self_Reflection_Medical
Code for the paper "Towards Mitigating LLM Hallucination via Self Reflection"
☆30 · Oct 9, 2023 · Updated 2 years ago
wade3han / normlens
An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm…
☆10 · May 9, 2024 · Updated 2 years ago
amazon-science / summary-reference-revision
☆19 · Apr 10, 2024 · Updated 2 years ago
semantic-health / allennlp-multi-label
A multi-label classification plugin for AllenNLP.
☆11 · Jan 13, 2023 · Updated 3 years ago
1171-jpg / MARVEL_AVR
GitHub repo for MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning
☆18 · Jun 12, 2024 · Updated 2 years ago
passing2961 / PersonaChatGen
🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3"
☆13 · Mar 26, 2024 · Updated 2 years ago
haebin-seong / HarmAug
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
☆14 · Mar 6, 2025 · Updated last year
judy-vscode / Judy
Just another Julia Debugger
☆14 · May 29, 2019 · Updated 7 years ago