Score LLM pretraining data with classifiers
☆55 · Nov 2, 2023 · Updated 2 years ago
Alternatives and similar repositories for classified
Users that are interested in classified are comparing it to the libraries listed below.
- Generate textbook-quality synthetic LLM pretraining data ☆508 · Oct 19, 2023 · Updated 2 years ago
- Convert all of libgen to high quality markdown ☆255 · Dec 13, 2023 · Updated 2 years ago
- Your fruity companion for transformers ☆14 · May 25, 2022 · Updated 3 years ago
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 · Sep 6, 2023 · Updated 2 years ago
- ☆11 · Aug 26, 2024 · Updated last year
- This repository contains all code examples for my TensorFlow World talk about "Advanced model deployments with TensorFlow Serving" ☆17 · Dec 8, 2022 · Updated 3 years ago
- ☆11 · Mar 11, 2016 · Updated 10 years ago
- ☆22 · Aug 27, 2023 · Updated 2 years ago
- A structured framework for defining, verifying and certifying AI systems. ☆19 · Mar 11, 2025 · Updated last year
- ☆24 · May 19, 2024 · Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆19 · Feb 27, 2023 · Updated 3 years ago
- Globe Engineer - Handkerchief: A higher quality alternative to vector database RAG. ☆24 · Jan 4, 2024 · Updated 2 years ago
- A collection of optimizers, some arcane others well known, for Flax. ☆29 · Aug 6, 2021 · Updated 4 years ago
- ☆18 · Mar 20, 2024 · Updated 2 years ago
- Measuring RAG solutions throughput and latency ☆20 · Jul 23, 2024 · Updated last year
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆23 · Oct 1, 2024 · Updated last year
- ☆45 · Oct 13, 2023 · Updated 2 years ago
- An algorithm that intelligently executes a crypto order over time via Coinbase ☆13 · Oct 26, 2021 · Updated 4 years ago
- tiny_fnc_engine is a minimal python library that provides a flexible engine for calling functions extracted from a LLM. ☆38 · Sep 11, 2024 · Updated last year
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆11 · Dec 24, 2023 · Updated 2 years ago
- ☆21 · Oct 6, 2023 · Updated 2 years ago
- Lightweight open-source perplexity ☆62 · May 6, 2024 · Updated 2 years ago
- ☆185 · Oct 13, 2023 · Updated 2 years ago
- prediction market indexer with semantic search ☆37 · Jan 27, 2026 · Updated 3 months ago
- I use various Data Science and machine learning techniques to analyze customer data using STP framework. I preprocessed the data, perform… ☆12 · Apr 26, 2020 · Updated 6 years ago
- a getting-started sample for Clojure and Solr ☆11 · Aug 28, 2015 · Updated 10 years ago
- A simple AI agent controlling a simulation of a smart home ☆13 · Jun 13, 2024 · Updated last year
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆269 · Jul 8, 2025 · Updated 10 months ago
- Simple FieldCache based query introspection Solr Search Component - solves the 'red sofa' problem ☆11 · Jan 27, 2025 · Updated last year
- Examples for the Activate conference ☆11 · Sep 11, 2019 · Updated 6 years ago
- Automatically research and outbound companies with Exa API and google sheets app scripts. ☆18 · Jun 24, 2024 · Updated last year
- Python library for Evaluation ☆17 · Mar 31, 2026 · Updated last month
- ☆16 · May 17, 2012 · Updated 14 years ago
- minimalist vector ad ☆11 · Feb 11, 2024 · Updated 2 years ago
- opennlp-solr-examples ☆10 · Jul 1, 2022 · Updated 3 years ago
- A module that allows you to proxy a single hypercore replication stream to multiple peers ☆21 · Jul 5, 2018 · Updated 7 years ago
- ☆12 · Dec 13, 2023 · Updated 2 years ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Feb 22, 2024 · Updated 2 years ago
- The Solr Package Directory and Sanctuary ☆13 · Oct 14, 2025 · Updated 7 months ago