Dahoas / gpt-neox-finetuning
☆15 Updated 3 years ago
Alternatives and similar repositories for gpt-neox-finetuning
Users who are interested in gpt-neox-finetuning are comparing it to the libraries listed below.
- ☆22 Updated 3 years ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆13 Updated 11 months ago
- A library for squeakily cleaning and filtering language datasets. ☆47 Updated 2 years ago
- Tools for managing datasets for governance and training. ☆85 Updated last month
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆48 Updated 2 years ago
- Completion After Prompt Probability. Make your LLM make a choice ☆80 Updated 8 months ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 3 years ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆47 Updated 2 years ago
- The Synthetic-Persona-Chat dataset is a synthetically generated persona-based dialogue dataset. It extends the original Persona-Chat data… ☆96 Updated last year
- Efficient few-shot learning with cross-encoders. ☆54 Updated last year
- ☆20 Updated 4 years ago
- Using short models to classify long texts ☆21 Updated 2 years ago
- ☆94 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆184 Updated 2 weeks ago
- ☆33 Updated 2 years ago
- A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs ☆114 Updated 2 years ago
- H&M Fashion Image similarity search with Weaviate and DocArray ☆43 Updated last year
- ☆23 Updated 2 years ago
- Build Semantic Search with S-BERT and Fine-tune your model in unsupervised way ☆58 Updated 3 years ago
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated 2 years ago
- Dockerfile and web server for running GPT-J-6B on AWS GPU instances ☆18 Updated 3 years ago
- A dataset for pretraining language models targeted for legal tasks. ☆134 Updated 3 years ago
- Code for constructing TLDR corpus from Reddit dataset ☆25 Updated 3 years ago
- Reimplementation of the task generation part from the Alpaca paper ☆119 Updated 2 years ago
- ☆57 Updated 9 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 Updated 8 months ago
- Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression ☆66 Updated 2 years ago
- ☆23 Updated 2 years ago