☆185Oct 13, 2023Updated 2 years ago
Alternatives and similar repositories for library-of-phi
Users who are interested in library-of-phi are comparing it to the libraries listed below.
- Generate textbook-quality synthetic LLM pretraining data☆508Oct 19, 2023Updated 2 years ago
- A multi-purpose LLM framework for RAG and data creation.☆625Jan 13, 2024Updated 2 years ago
- A Language and Live Runtime for Styling and Labeling Typeset Math Formulas☆26Oct 29, 2023Updated 2 years ago
- ☆11Aug 26, 2024Updated last year
- ☆21Oct 6, 2023Updated 2 years ago
- ☆50Feb 5, 2025Updated last year
- Score LLM pretraining data with classifiers☆55Nov 2, 2023Updated 2 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)☆23Oct 2, 2025Updated 9 months ago
- ☆21Aug 27, 2023Updated 2 years ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data☆32Sep 22, 2024Updated last year
- Package and scripts used to build a dataset of Wikipedia articles in Markdown.☆20Sep 11, 2023Updated 2 years ago
- Conversion script adapting vicuna dataset into alpaca format for use with oobabooga's trainer☆13Jun 21, 2023Updated 3 years ago
- Let's create synthetic textbooks together :)☆74Jan 29, 2024Updated 2 years ago
- AgentSearch is a framework for powering search agents and enabling customizable local search.☆537Apr 22, 2024Updated 2 years ago
- Learn Anything Easily With Personalized Learning Paths Using AI☆15May 4, 2025Updated last year
- Generate High Quality textual or multi-modal datasets with Agents☆18Jun 7, 2023Updated 3 years ago
- ☆571Nov 20, 2024Updated last year
- Data and preprocessing scripts for SemEval 2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding☆16Feb 3, 2022Updated 4 years ago
- Full finetuning of large language models without large memory requirements☆93Sep 22, 2025Updated 9 months ago
- DeepDip, a DRL Gym agent that plays no-press Diplomacy in BANDANA☆13Jul 22, 2019Updated 6 years ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks.☆45Feb 15, 2024Updated 2 years ago
- ☆126Dec 18, 2024Updated last year
- Beginner-friendly serverless LLM deployment with Replicate & fly.io☆13Sep 3, 2023Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs☆207Aug 10, 2024Updated last year
- An Infr app that helps you replay & talk to everything you've ever seen.☆15Sep 19, 2023Updated 2 years ago
- entropix style sampling + GUI☆27Oct 30, 2024Updated last year
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆641Mar 4, 2024Updated 2 years ago
- Customizable implementation of the self-instruct paper.☆1,052Mar 7, 2024Updated 2 years ago
- Recipes to prepare datasets!☆15Updated this week
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware☆76Feb 23, 2024Updated 2 years ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio☆38Jun 6, 2023Updated 3 years ago
- repository for dreamoving-phantom https://www.modelscope.cn/studios/vigen/DreaMoving_Phantom/summary. DreaMoving-Phantom is a general and…☆142Feb 2, 2024Updated 2 years ago
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- The imdb files with SBD-Trans OCR for TextVQA dataset.☆11Nov 30, 2021Updated 4 years ago
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist☆34Oct 23, 2024Updated last year
- This repo helps to transform text into a better form for LoRA training☆12Apr 9, 2023Updated 3 years ago
- assign color hues to a collection of text fragments based on embeddings☆20Jun 15, 2024Updated 2 years ago
- ☆21Jun 4, 2024Updated 2 years ago
- Plug in and play implementation of "Textbooks Are All You Need", ready for training, inference, and dataset generation☆73Sep 18, 2023Updated 2 years ago