rasbt / nn_plus_gzip
Gzip and nearest neighbors for text classification
☆56, updated last year
Alternatives and similar repositories for nn_plus_gzip:
Users who are interested in nn_plus_gzip compare it to the repositories listed below:
- NLP with Rust for Python (☆60, updated 7 months ago)
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… (☆34, updated last year)
- QLoRA for Masked Language Modeling (☆21, updated last year)
- Generalist and Lightweight Model for Text Classification (☆58, updated 2 weeks ago)
- Trade any tensors over the network (☆30, updated last year)
- Reference-free automatic summarization evaluation with potential hallucination detection (☆99, updated last year)
- Genalog is an open-source, cross-platform Python package allowing generation of synthetic document images with custom degradations and te… (☆42, updated 11 months ago)
- ☆46, updated 11 months ago
- Tools to make language models a bit easier to use (☆32, updated last month)
- ☆48, updated last year
- Highly commented implementations of Transformers in PyTorch (☆131, updated last year)
- Exporting YouTube videos using Whisper (☆17, updated 2 years ago)
- ☆76, updated 7 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. (☆34, updated last month)
- Named Entity Recognition with a decoder-only (autoregressive) LLM using HuggingFace (☆41, updated 2 months ago)
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… (☆29, updated 4 months ago)
- A pipeline that uses API calls to agnostically convert unstructured data into structured training data (☆29, updated 3 months ago)
- ☆67, updated 5 months ago
- A miniature AI training framework for PyTorch (☆37, updated 3 weeks ago)
- Training and Inference Notebooks for the RedPajama (OpenLlama) models (☆18, updated last year)
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. (☆25, updated last year)
- HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) (☆17, updated 9 months ago)
- Simplified implementation of a UMAP-like dimensionality reduction algorithm (☆44, updated last month)
- Use sync-mode Playwright interactively inside a Jupyter notebook (☆14, updated last month)
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems (☆10, updated last year)
- Drift detection module for machine learning pipelines. (☆21, updated last year)
- ☆24, updated last year
- Check for data drift between two OpenAI multi-turn chat JSONL files. (☆37, updated 9 months ago)
- Using short models to classify long texts (☆21, updated last year)
- Command Line Interface for Hugging Face Inference Endpoints (☆67, updated 9 months ago)