neuml / txtinstruct
Datasets and models for instruction-tuning
☆232 Updated last year
Alternatives and similar repositories for txtinstruct:
Users who are interested in txtinstruct are comparing it to the libraries listed below.
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆313 Updated 2 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆401 Updated 11 months ago
- data cleaning and curation for unstructured text ☆328 Updated 5 months ago
- Neural Search ☆349 Updated 7 months ago
- Small finetuned LLMs for a diverse set of useful tasks ☆126 Updated last year
- ☆440 Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 Updated last year
- ☆199 Updated 11 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆71 Updated 2 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆181 Updated 3 months ago
- A framework to empower forecasting using Large Language Models (LLMs) ☆104 Updated 6 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆139 Updated 3 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆214 Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆66 Updated 2 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated last year
- A joint community effort to create one central leaderboard for LLMs. ☆288 Updated 4 months ago
- ☆154 Updated last year
- PanML is a high level generative AI/ML development and analysis library designed for ease of use and fast experimentation. ☆115 Updated last year
- ☆205 Updated 11 months ago
- Fine-Tuning Embedding for RAG with Synthetic Data ☆477 Updated last year
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆388 Updated 2 weeks ago
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆66 Updated last year
- Mistral + Haystack: build RAG pipelines that rock ☆100 Updated 11 months ago
- A command-line interface to generate textual and conversational datasets with LLMs. ☆294 Updated last year
- Late Interaction Models Training & Retrieval ☆223 Updated this week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 Updated 8 months ago
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ☆183 Updated last year
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆222 Updated last week
- awesome synthetic (text) datasets ☆253 Updated 2 months ago
- experiments with inference on llama ☆104 Updated 7 months ago