perone / feste
Feste is a free and open-source framework allowing scalable composition of NLP tasks using a graph execution model that is optimized and executed by specialized schedulers.
★41 · Updated 2 years ago
Alternatives and similar repositories for feste:
Users who are interested in feste are comparing it to the libraries listed below.
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ★30 · Updated 6 months ago
- NLP with Rust for Python 🦀🐍 ★61 · Updated 10 months ago
- ★18 · Updated last year
- Production-grade embedding generation, for any length of text, for transformer models. ★23 · Updated 4 months ago
- A library for squeakily cleaning and filtering language datasets. ★46 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ★47 · Updated last year
- ★24 · Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ★25 · Updated last year
- QLoRA for Masked Language Modeling ★21 · Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ★37 · Updated 11 months ago
- Chat Markup Language conversation library ★55 · Updated last year
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ★29 · Updated 7 months ago
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… ★19 · Updated 2 weeks ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te… ★42 · Updated last year
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ★17 · Updated last year
- ★22 · Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ★13 · Updated 7 months ago
- Explore the use of DSPy for extracting features from PDFs ★39 · Updated last year
- Telemetry for applications that use LLM tools. ★25 · Updated last year
- Experimenting with LLMs to Research, Reflect, and Plan (LLM assistants, retrieval, and Discord integration) ★30 · Updated 8 months ago
- ★41 · Updated 9 months ago
- Ludwig benchmark ★20 · Updated 3 years ago
- ★27 · Updated 4 months ago
- ★38 · Updated last month
- Utility for OpenAI GPT Functions ★14 · Updated last year
- Training and Inference Notebooks for the RedPajama (OpenLlama) models ★18 · Updated last year
- Pre-train Static Word Embeddings ★51 · Updated 3 weeks ago
- Mixtral finetuning ★19 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ★22 · Updated 2 years ago
- LLM training in simple, raw C/CUDA ★14 · Updated 3 months ago