neuml / txtinstruct
Datasets and models for instruction-tuning
☆233 Updated last year
Related projects
Alternatives and complementary repositories for txtinstruct
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆311 Updated last week
- ☆200 Updated 9 months ago
- Data cleaning and curation for unstructured text ☆327 Updated 3 months ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆246 Updated 2 weeks ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆385 Updated 9 months ago
- A fast and lightweight pure Python library for splitting text into semantically meaningful chunks. ☆182 Updated 4 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆183 Updated last month
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated last month
- Awesome synthetic (text) datasets ☆242 Updated 3 weeks ago
- Neural Search ☆344 Updated 5 months ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆145 Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆203 Updated 6 months ago
- Small finetuned LLMs for a diverse set of useful tasks ☆123 Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆232 Updated 5 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆69 Updated 2 weeks ago
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆64 Updated 11 months ago
- ☆93 Updated last month
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆162 Updated 2 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆81 Updated this week
- Let's build better datasets, together! ☆205 Updated this week
- Reference-free automatic summarization evaluation with potential hallucination detection ☆98 Updated 10 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆106 Updated 3 weeks ago
- Experiments with inference on llama ☆105 Updated 5 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆236 Updated 4 months ago
- Retrieval augmented generation (RAG) and language model powered search applications ☆279 Updated 10 months ago
- Late Interaction Models Training & Retrieval ☆165 Updated this week
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆155 Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆161 Updated 10 months ago
- PanML is a high level generative AI/ML development and analysis library designed for ease of use and fast experimentation. ☆114 Updated last year
- Generalist and Lightweight Model for Text Classification ☆49 Updated this week