jdh-algo/JoyDataForge

A data synthesis tool for simply and efficiently synthesizing large-model training data for different business scenarios

☆46

Alternatives and similar repositories for JoyDataForge

Users that are interested in JoyDataForge are comparing it to the libraries listed below.

jdh-algo / Citrus-V
Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning
☆24 · Sep 26, 2025 · Updated 10 months ago
tiagocrz / KGs_for_Vertical_AI
☆16 · Nov 22, 2025 · Updated 8 months ago
pkulcwmzx / knowledge-boundary
[ACL 2024] Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
☆10 · May 26, 2024 · Updated 2 years ago
NVIDIA / Nemotron-CORTEXA
Nemotron-CORTEXA is an open-source software engineering agent that fixes GitHub issues.
☆25 · Aug 7, 2025 · Updated 11 months ago
MTU-Bench-Team / MTU-Bench
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
☆60 · Jul 24, 2025 · Updated last year
wzhouad / WPO
Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"
☆41 · Sep 24, 2024 · Updated last year
Ananyaiitbhilai / KGViz
[KGC '24] This application is for visualisation of Knowledge Graphs. We employ a novel technique which uses LLM based agent for triple e…
☆11 · Apr 17, 2024 · Updated 2 years ago
jofalk / Xcbwin
Xcbwin - a simple C++ class for graphical outputs using XCB
☆12 · May 12, 2015 · Updated 11 years ago
lemon-little / BetterSynth
Tianchi Better Synth Multimodal Large Model Data Synthesis Challenge: a "beating the baseline counts as success" solution
☆30 · Oct 30, 2025 · Updated 9 months ago
rajpurkarlab / craft-md
☆47 · Mar 14, 2025 · Updated last year
zhangmazi1 / Data-preprocessing
Data preprocessing: fill missing values by interpolation and mark the filled positions
☆10 · Apr 19, 2019 · Updated 7 years ago
InternLM / Condor
[ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
☆40 · May 28, 2025 · Updated last year
muyiyiyi / ClipSearch
A local image search tool built on the cn-clip model
☆11 · Jul 6, 2023 · Updated 3 years ago
hetianch / AdaBoost
A simple AdaBoost implementation using decision stumps as weak classifiers
☆11 · Nov 1, 2012 · Updated 13 years ago
manueldeprada / Pretraining-T5-PyTorch-Lightning
Collection of scripts to pretrain T5 on unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example.
☆32 · Apr 26, 2021 · Updated 5 years ago
ZexuSun / AgentSkiller
☆31 · Feb 11, 2026 · Updated 5 months ago
VimalWill / Vstream
Vstream - Video analytics pipeline with hardware-based acceleration (development stage)
☆10 · Feb 2, 2024 · Updated 2 years ago
ccoder64 / search_rag
A self-hosted version of search_with_lepton
☆14 · May 4, 2024 · Updated 2 years ago
cangumeli / Scan2CADRasterizer
A custom rasterizer for Scan2CAD labels over ScanNet images. Created for our ROCA project (CVPR 2022).
☆11 · Jul 24, 2025 · Updated last year
ArtemBaskal / model-based-testing-calculator
Model Based Testing of the App Based On The Description from Constructing the User Interface with Statecharts Book of Ian Horrocks using …
☆13 · Feb 20, 2024 · Updated 2 years ago
PKU-Baichuan-MLSystemLab / PAS
☆53 · Sep 11, 2024 · Updated last year
usnistgov / nestor
Quantifying tacit knowledge for investigatory analysis
☆13 · Mar 7, 2024 · Updated 2 years ago
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization track
☆50 · Aug 16, 2023 · Updated 2 years ago
TheAgentArk / Toucan
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
☆260 · Dec 16, 2025 · Updated 7 months ago
seanzhang-zhichen / baichuan-Dynamic-NTK-ALiBi
Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning
☆49 · Aug 27, 2023 · Updated 2 years ago
williamliujl / Qilin-Med
☆16 · May 21, 2024 · Updated 2 years ago
Maginaaa / QuestionAnsweringSystem-V2.0
An optimized version of a simple question answering system
☆12 · Dec 10, 2019 · Updated 6 years ago
alycialee / beyond-scale-language-data-diversity
☆13 · Jul 22, 2026 · Updated last week
microsoft / FEA-Bench
[ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation
☆57 · Jan 28, 2026 · Updated 6 months ago
EasyPaperSniper / CrossLoco
Implementation of CrossLoco (currently a lite version)
☆14 · May 12, 2024 · Updated 2 years ago
cjiang2 / onto_python
An Introductory Jupyter Notebook to Manipulate Ontologies with Owlready2
☆11 · Jan 10, 2020 · Updated 6 years ago
nzjin / awesome_moe
A collection of MoE (Mixture of Experts) papers, code, tools, etc.
☆12 · Mar 15, 2024 · Updated 2 years ago
NoGameNoLive / Issac_AMP
Recurrent AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control in Isaac Gym 4
☆10 · Jan 27, 2024 · Updated 2 years ago
xinqiao123 / Credit-card-fraud-detection
Comparison study of TabTransformer and FT-Transformer for credit card fraud detection
☆12 · Jan 6, 2023 · Updated 3 years ago
ShayekhBinIslam / hummingbird-mlagent
A Simple Game Using Unity ML-Agents
☆10 · Nov 20, 2020 · Updated 5 years ago
haozheji / exact-optimization
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment
☆55 · Jun 16, 2024 · Updated 2 years ago
lezhang7 / TreeMix
[NAACL 2022] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
☆10 · Jul 15, 2023 · Updated 3 years ago
LikeGiver / VideoRAG
A tiny project to test the effectiveness of video QA through RAG techniques and multimodal LLMs
☆15 · Jun 2, 2024 · Updated 2 years ago
hejunqing / webMedQA
A Chinese medical question answering dataset
☆71 · Jan 14, 2020 · Updated 6 years ago