A scalable data preprocessing framework built on PySpark for LLM training
☆23 · Dec 9, 2025 · Updated 3 months ago
Alternatives and similar repositories for Kaiyuan-Spark
Users who are interested in Kaiyuan-Spark are comparing it to the libraries listed below.
- Python interface and preprocessing pipeline for the BBBC021 dataset of cellular images ☆14 · Sep 19, 2021 · Updated 4 years ago
- A free lunch for LLMs recognition images ☆10 · Apr 29, 2025 · Updated 11 months ago
- ☆17 · Feb 11, 2023 · Updated 3 years ago
- ☆12 · Mar 14, 2025 · Updated last year
- SAS and Python code examples for use with SAS Viya Workbench. ☆11 · Jun 26, 2024 · Updated last year
- UMM-Discovery is a fully unsupervised deep learning method to cluster cellular images with similar phenotypes together, solely based on t… ☆11 · Nov 4, 2020 · Updated 5 years ago
- A face recognition project based on the ArcFace algorithm, using the CelebA and LFW datasets. ☆14 · Nov 4, 2021 · Updated 4 years ago
- ☆38 · Jan 16, 2026 · Updated 2 months ago
- A danmaku (bullet comment) blocking rule set for Bilibili ☆11 · Jul 11, 2024 · Updated last year
- Follow nginx log, and find out bad guys! ☆24 · Mar 7, 2026 · Updated 3 weeks ago
- Jupyter Notebooks for the Image Data Resource ☆19 · Feb 20, 2026 · Updated last month
- LibGDX and Web port of the awesome Pixel Dungeon (1.9.2a) ☆20 · Oct 16, 2022 · Updated 3 years ago
- A minimal example of Abductive Learning ☆19 · Dec 6, 2023 · Updated 2 years ago
- Saving tenhou paifu (replays) with key information. A Tenhou game record logger. ☆13 · Feb 5, 2025 · Updated last year
- Documentation for Digital Design course ☆20 · Jun 10, 2025 · Updated 9 months ago
- Code for "YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Ass… ☆20 · Nov 5, 2025 · Updated 4 months ago
- Full State Quantum Circuit Simulation Beyond Memory Limit ☆16 · Aug 5, 2024 · Updated last year
- [ICCV2025] Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning ☆24 · Nov 13, 2025 · Updated 4 months ago
- Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction ☆19 · Oct 20, 2023 · Updated 2 years ago
- laboratory assignments of cs143-Compilers ☆18 · Jun 8, 2021 · Updated 4 years ago
- Llama 2 Everywhere (L2E) ☆14 · Jun 26, 2024 · Updated last year
- Docker file templates for GZCTF, including crypto, pwn, web. ☆21 · Aug 10, 2023 · Updated 2 years ago
- ⚡ Triton implementation of Clifford algebra neural networks. ☆36 · Oct 24, 2025 · Updated 5 months ago
- A framework to compress classical machine learning model during training by quantum machine learning ☆19 · Aug 1, 2024 · Updated last year
- ⚡ Japanese sentence splitting (Japanese sentence boundary detector), 40–250× faster via a Rust-accelerated Python library with near-perfect API compatibility with … ☆71 · Oct 14, 2025 · Updated 5 months ago
- Optimizing the Cell Painting assay for image-based profiling ☆21 · Aug 11, 2025 · Updated 7 months ago
- ☆27 · Jan 8, 2024 · Updated 2 years ago
- The official implementation of the AAAI 2024 paper Bi-ViT. ☆12 · Dec 18, 2023 · Updated 2 years ago
- WS-DINO: a novel framework to use weak label information in a self-supervised setting to learn phenotypic representations from high-conte… ☆23 · Mar 28, 2023 · Updated 3 years ago
- Deep learning examples for the Instant Super Computer ☆20 · Jan 28, 2026 · Updated 2 months ago
- Computer vision course experiment ☆13 · Feb 14, 2020 · Updated 6 years ago
- Code for Kolmogorov-Arnold Network for Quantum Architecture Search i.e., KANQAS ☆19 · Jan 9, 2025 · Updated last year
- Official PyTorch implementation of QwT—“Quantization without Tears” (CVPR 2025): fast, accurate, and hassle-free post-training network qu… ☆32 · Sep 30, 2025 · Updated 6 months ago
- Repository for RuDiK, a system for discovering declarative logical rules over RDF Knowledge Bases ☆27 · May 8, 2024 · Updated last year
- A Python OOP CLI+GUI interface with the object database in the mobile game Gakuen Idolm@ster. ☆30 · Mar 19, 2026 · Updated last week
- [napcat only] Let unofficial QQ bots send buttons too! ☆27 · Mar 17, 2026 · Updated last week
- An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3 ☆30 · May 30, 2021 · Updated 4 years ago
- Demos for Volcengine's OpenAPIs ☆19 · Feb 25, 2026 · Updated last month
- Minimal implementations of some diffusion models, using as little code and as simple an architecture as possible ☆21 · Jan 10, 2025 · Updated last year