kddubey / cappr
Completion After Prompt Probability. Make your LLM make a choice
☆78 Updated 7 months ago
Alternatives and similar repositories for cappr
Users who are interested in cappr are comparing it to the libraries listed below:
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆182 Updated 4 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆101 Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆80 Updated last year
- Reimplementation of the task generation part from the Alpaca paper ☆118 Updated 2 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 Updated 7 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- Efficient few-shot learning with cross-encoders. ☆52 Updated last year
- NLP with Rust for Python 🦀🐍 ☆62 Updated 3 weeks ago
- Pre-train Static Word Embeddings ☆70 Updated this week
- Small finetuned LLMs for a diverse set of useful tasks ☆125 Updated last year
- Datasets and models for instruction-tuning ☆238 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 Updated 7 months ago
- Experiments with generating opensource language model assistants ☆97 Updated 2 years ago
- Pretraining Efficiently on S2ORC! ☆164 Updated 7 months ago
- ☆44 Updated 6 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 Updated 5 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 10 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 8 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆59 Updated 10 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆206 Updated 3 weeks ago
- Lightweight tools for quick and easy LLM demos ☆27 Updated 8 months ago
- A library for squeakily cleaning and filtering language datasets. ☆46 Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆88 Updated last year
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated 8 months ago
- Python API for https://vespa.ai, the open big data serving engine ☆125 Updated 3 weeks ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆51 Updated 11 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆121 Updated last year