RUC-GSAI / Yulan-GARDEN
Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"
☆85Aug 27, 2024Updated last year
Alternatives and similar repositories for Yulan-GARDEN
Users who are interested in Yulan-GARDEN compare it to the repositories listed below:
- YuLan: An Open-Source Large Language Model☆634Jan 10, 2025Updated last year
- A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.☆849Jun 16, 2025Updated 8 months ago
- JDsearch: A Personalized Product Search Dataset with Real Queries and Full Interactions☆38May 8, 2023Updated 2 years ago
- An all-in-one framework for Ad-hoc Information Retrieval.☆18Apr 3, 2024Updated last year
- [ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Language Models☆65Mar 8, 2025Updated 11 months ago
- LoRA☆18Apr 15, 2023Updated 2 years ago
- A React version of labelImage☆11Oct 26, 2021Updated 4 years ago
- ☆38Nov 13, 2025Updated 3 months ago
- List some datasets in NLP field.☆29May 27, 2021Updated 4 years ago
- Companion data and code for the book "Natural Language Processing: Large Model Theory and Practice"☆76Dec 24, 2025Updated last month
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆40Jan 4, 2024Updated 2 years ago
- Renmin University of China YOJ problem set☆11Jun 9, 2022Updated 3 years ago
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models☆26Feb 4, 2026Updated last week
- Study notes on the full survey book "Large Language Models"☆13Aug 2, 2024Updated last year
- Infrared and visible light fusion☆10Apr 17, 2019Updated 6 years ago
- Collection of papers for scalable automated alignment.☆93Oct 22, 2024Updated last year
- Find strongest response of convolutional layers on an image dataset. Automatically compute receptive field for any CNN layer.☆14Feb 19, 2021Updated 4 years ago
- An implementation of a neural network training routine using derivative information in Pytorch.☆10Dec 19, 2020Updated 5 years ago
- Modern normalizing flows in Python. Simple to use and easily extensible.☆11Updated this week
- Azure Machine Learning - MLOps Python SDKv2☆10Jul 24, 2023Updated 2 years ago
- Generates training datasets for text detection☆12Jul 1, 2020Updated 5 years ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions.☆53Aug 9, 2025Updated 6 months ago
- ☆395Apr 1, 2025Updated 10 months ago
- Papers of ASR, Tools of ASR☆41Feb 14, 2025Updated last year
- Extension for the SenTestingKit for asynchronous testing☆104May 20, 2013Updated 12 years ago
- ☆10Oct 19, 2020Updated 5 years ago
- ✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM☆11Jun 16, 2025Updated 8 months ago
- A scalable data preprocessing framework built on PySpark for LLM training☆21Dec 9, 2025Updated 2 months ago
- ☆13Mar 1, 2019Updated 6 years ago
- 4th place solution for Data Fusion 2021 Contest☆12May 20, 2022Updated 3 years ago
- A tool to visualize iOS linked object size change between different version☆10Jun 17, 2017Updated 8 years ago
- LIDA: Lightweight Interactive Dialogue Annotator (in EMNLP 2019)☆10Oct 18, 2021Updated 4 years ago
- Chinese translation of Google's "Introduction to Agents"☆26Nov 14, 2025Updated 3 months ago
- My templates used in OI. All C++.☆11Jul 17, 2018Updated 7 years ago
- Accepted to MLSys 2026☆70Jan 29, 2026Updated 2 weeks ago
- Creating Your Divine Agent 😇☆10Jan 26, 2026Updated 3 weeks ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- Demo of fine-tuning QA models for answering FAQ of cloud providers documentation☆11Mar 7, 2023Updated 2 years ago
- Image Tokenizer Needs Post-Training☆24Oct 4, 2025Updated 4 months ago