Chat data cleaning, filtering and deduplication pipeline.
☆22 · Jul 25, 2023 · Updated 2 years ago
Alternatives and similar repositories for chat-data-pipeline
Users that are interested in chat-data-pipeline are comparing it to the libraries listed below.
- Code base for internal reward models and PPO training ☆24 · Oct 1, 2023 · Updated 2 years ago
- Copy objects from real life and directly paste them on a background image using only your phone's camera ☆23 · Feb 10, 2026 · Updated 2 months ago
- Using Spectral Noise Gating (SNG) techniques to reduce background noise in streaming microphone input for enhanced vocal recognition ☆25 · Dec 10, 2018 · Updated 7 years ago
- ☆16 · Dec 31, 2021 · Updated 4 years ago
- A bunch of LLaMa model investigations, including recreating generative agents (from the paper Generative Agents: Interactive Simulacra of… ☆23 · May 31, 2023 · Updated 2 years ago
- Code for Massive-scale Decoding for Text Generation using Lattices ☆44 · Jul 29, 2022 · Updated 3 years ago
- [ECCV22] BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering (Jittor) ☆11 · Sep 16, 2022 · Updated 3 years ago
- Tools for content datamining and NLP at scale ☆45 · Jun 20, 2024 · Updated last year
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 · Nov 3, 2023 · Updated 2 years ago
- Vast-ai public repository for open sourced tools, plugins, etc. ☆16 · Nov 4, 2024 · Updated last year
- Utility for React components to easily subscribe to Mutant streams ☆13 · Dec 9, 2017 · Updated 8 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 · Apr 6, 2026 · Updated last week
- Code for the NeurIPS 2020 paper "Improved analysis of clipping algorithms for non-convex optimization", including various clipping algori… ☆10 · Feb 17, 2021 · Updated 5 years ago
- Gel supercharges Postgres with a modern data model, graph queries, Auth & AI solutions, and much more. ☆64 · Feb 9, 2026 · Updated 2 months ago
- 👜 Callbag listener sink that receives data from any listenable source ☆14 · Feb 6, 2018 · Updated 8 years ago
- A tiny server to run local inference on MLX model in the style of OpenAI ☆13 · Jan 31, 2024 · Updated 2 years ago
- Get CPU usage percentage of own process ☆18 · Jan 18, 2020 · Updated 6 years ago
- My self-learning about Apache Airflow ☆33 · Jul 13, 2022 · Updated 3 years ago
- USB Hid handler for nodejs ☆11 · Sep 30, 2022 · Updated 3 years ago
- Rust AV1 Decoder ☆15 · Jun 19, 2019 · Updated 6 years ago
- DataOps framework for Machine Learning projects. ☆61 · May 4, 2023 · Updated 2 years ago
- ☆29 · Feb 24, 2025 · Updated last year
- Lottery Ticket Adaptation ☆40 · Nov 20, 2024 · Updated last year
- Official PyTorch Lightning Implementation of "Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion" (… ☆20 · Aug 19, 2023 · Updated 2 years ago
- Lightweight knowledge distillation pipeline ☆28 · Nov 29, 2021 · Updated 4 years ago
- Train Llama Loras Easily ☆31 · Aug 3, 2023 · Updated 2 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated last month
- Render diagrams from your kubernetes manifests ☆14 · Nov 24, 2025 · Updated 4 months ago
- Complete Hashistack including TFE, Terraform, Consul, Vault, Nomad, Packer all in a single packer manifest. Builds in parallel on Qemu, … ☆14 · Jul 18, 2022 · Updated 3 years ago
- A Vote-and-Verify Strategy for Fast Spatial Verification in Image Retrieval ☆19 · Jun 7, 2017 · Updated 8 years ago
- 📑 Collection of smart contracts (mostly Ethereum) for reference and learning. ☆12 · Feb 17, 2022 · Updated 4 years ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Jun 21, 2023 · Updated 2 years ago
- Using continual sentiment analysis of social content (twitter, reddit, news, etc), perform profitable trades by following the sentiment/p… ☆14 · Dec 9, 2019 · Updated 6 years ago
- The LBRY Android app (now without a blacklist) ☆12 · Jul 17, 2022 · Updated 3 years ago
- Compare strings line by line. ☆11 · Feb 14, 2025 · Updated last year
- Code for the paper "Critical Thinking for Language Models" ☆12 · Jun 1, 2021 · Updated 4 years ago
- An experiment to see if chatgpt can improve the output of the stanford alpaca dataset ☆12 · Mar 29, 2023 · Updated 3 years ago
- Query builder for elasticsearch (Node.js / Javascript) ☆11 · Nov 16, 2015 · Updated 10 years ago
- A simple Google Search Engine Crawler. ☆22 · Feb 16, 2024 · Updated 2 years ago