ronigold / TokenSHAP
TokenSHAP: Explain individual token importance in large language model prompts with SHAP values. Gain insights, debug models, detect biases, and enhance transparency effortlessly.
☆48 Updated 4 months ago
Alternatives and similar repositories for TokenSHAP
Users who are interested in TokenSHAP are comparing it to the libraries listed below.
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆284 Updated 5 months ago
- Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆206 Updated this week
- This is the reproduction repository for my Hugging Face blog post on synthetic data ☆68 Updated last year
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆166 Updated last week
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆270 Updated 10 months ago
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆66 Updated this week
- ☆40 Updated last year
- ☆72 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆46 Updated last year
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆132 Updated last week
- A curated list of papers & technical articles on AI Quality & Safety ☆189 Updated 3 months ago
- Code for training & evaluating Contextual Document Embedding models ☆196 Updated 2 months ago
- Interpret text data using LLMs (scikit-learn compatible). ☆169 Updated this week
- Notebooks for training universal 0-shot classifiers on many different tasks ☆135 Updated 7 months ago
- ☆145 Updated last year
- PyTorch library for Active Fine-Tuning ☆88 Updated 5 months ago
- Code accompanying "How I learned to start worrying about prompt formatting". ☆108 Updated 2 months ago
- Generalist and Lightweight Model for Text Classification ☆148 Updated last month
- Efficient multi-prompt evaluation of LLMs ☆22 Updated 8 months ago
- ☆118 Updated 11 months ago
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ☆224 Updated 2 weeks ago
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆47 Updated last year
- ☆154 Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆109 Updated 10 months ago
- A small library of LLM judges ☆251 Updated last week
- ☆104 Updated 6 months ago
- ☆69 Updated 11 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆223 Updated last week
- ☆76 Updated this week