thu-pacman / Kaiyuan-Spark
A scalable data preprocessing framework built on PySpark for LLM training
☆21Dec 9, 2025Updated 2 months ago
Alternatives and similar repositories for Kaiyuan-Spark
Users that are interested in Kaiyuan-Spark are comparing it to the libraries listed below
- SAS and Python code examples for use with SAS Viya Workbench.☆11Jun 26, 2024Updated last year
- Azure Machine Learning - MLOps Python SDKv2☆10Jul 24, 2023Updated 2 years ago
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models☆26Feb 4, 2026Updated last week
- [ICCV2025] Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning☆23Nov 13, 2025Updated 3 months ago
- Saving tenhou paifu (replays) with key information. Tenhou game record logger.☆13Feb 5, 2025Updated last year
- Python interface and preprocessing pipeline for the BBBC021 dataset of cellular images☆13Sep 19, 2021Updated 4 years ago
- Modern normalizing flows in Python. Simple to use and easily extensible.☆11Updated this week
- Image Tokenizer Needs Post-Training☆24Oct 4, 2025Updated 4 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- Creating Your Divine Agent 😇☆10Jan 26, 2026Updated 3 weeks ago
- Color detection, Contour mapping, Detecting holes, Motion detection☆10Mar 20, 2014Updated 11 years ago
- A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.☆22Jan 4, 2026Updated last month
- A free lunch for LLMs recognition images☆10Apr 29, 2025Updated 9 months ago
- Python solutions to coding questions in Leetcode☆13Sep 12, 2020Updated 5 years ago
- UMM-Discovery is a fully unsupervised deep learning method to cluster cellular images with similar phenotypes together, solely based on t…☆11Nov 4, 2020Updated 5 years ago
- My templates used in OI. All C++.☆11Jul 17, 2018Updated 7 years ago
- Code for "YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Ass…☆19Nov 5, 2025Updated 3 months ago
- ✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM☆11Jun 16, 2025Updated 8 months ago
- ☆12Mar 14, 2025Updated 11 months ago
- Various test models in WNNX format. It can view with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- Computer vision course experiment☆13Feb 14, 2020Updated 6 years ago
- A framework to compress classical machine learning model during training by quantum machine learning☆19Aug 1, 2024Updated last year
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 5 months ago
- Llama 2 Everywhere (L2E)☆14Jun 26, 2024Updated last year
- KDD 2024 AQA competition 2nd place solution☆12Jul 21, 2024Updated last year
- A collection of hand-coded LLM implementations☆19Mar 25, 2025Updated 10 months ago
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆27Dec 24, 2025Updated last month
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated 2 weeks ago
- Official code of "UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models" WACV2026☆36Nov 24, 2025Updated 2 months ago
- Open Platform Robot☆14Jul 9, 2022Updated 3 years ago
- ☆17Feb 11, 2023Updated 3 years ago
- Code for Kolmogorov-Arnold Network for Quantum Architecture Search i.e., KANQAS☆18Jan 9, 2025Updated last year
- Quickly and easily deploy TF2 Image Object Detection models from TensorFlow Hub trained on COCO 2017 dataset.☆12Nov 11, 2020Updated 5 years ago
- AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management☆22Dec 31, 2025Updated last month
- ☆35Jan 16, 2026Updated last month
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Feb 21, 2024Updated last year
- ☆11Jan 8, 2025Updated last year
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs☆33Dec 9, 2025Updated 2 months ago