Eric-is-good / pretrain-LLM-from-scratch
Train an o1-like large language model from scratch.
☆132Updated last week
Alternatives and similar repositories for pretrain-LLM-from-scratch
Users who are interested in pretrain-LLM-from-scratch are comparing it to the libraries listed below
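- The core of the framework is a two-stage "coarse screening, fine filtering" data-cleaning pipeline. First, CLIP's multi-gate decision logic performs a macro-level coarse screen that precisely removes non-photographic noise such as illustrations and charts. Next, using DINOv2's fine-grained features, it applies a novel "relative margin score" to identify confusable samples that sit on class boundaries, and combines this with a GMM to dynamically determine the cleaning criteria for each class. The whole pipeline has a built-in minimum…☆208Updated 2 months ago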
- Science-Star: A Platform for Building, Extending, and Experimenting with Scientific Agents.☆738Updated 3 months ago
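- 超能文献 | AI-driven document translation and academic search service. Supports high-quality translation of PDF, DOCX, PPTX, and other document formats (11 languages supported), with translation of mathematical formulas specifically optimized. Also provides intelligent search of PubMed academic literature. More at: https://suppr.wilddata.cn☆246Updated 2 months ago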
- 智川x-agent☆1,080Updated 4 months ago
- Fat-Cat: A document-centric context management Agent. Making context as simple as reading chat history.☆281Updated last week
- Two versions of markitdown: a Java command-line tool and a Python web app☆446Updated last week
- ☆332Updated 2 months ago
- Synthetic Data Generation Platform By DataArcTech☆719Updated this week
- A real-time interactive Omni Avatar built on LiveKit, which allows you to seamlessly integrate with any open source Avatar components (re…☆557Updated this week
- Advanced Quantitative Factor Research: ML-powered stock return prediction with 72% performance improvement. Features comprehensive alpha …☆377Updated 4 months ago
- ☆516Updated 10 months ago
- GigaModels: A Comprehensive Repository and Platform for Multi-modal, Generative, and Perceptual Models☆650Updated last month
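- An admin backend built on top of ragflow that can run standalone and supports batch management of knowledge bases, chats, agents, users, and more, addressing ragflow's pain points in back-office administration.☆180Updated 3 weeks ago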
- ☆81Updated last month
- ☆220Updated 3 months ago
- ☆462Updated 8 months ago
- A transparent, minimal, and hackable agent framework. ~300 lines of readable code. Full control, no magic.☆434Updated last week
- 2004-2025 MCM/ICM Outstanding Winner papers☆165Updated 2 weeks ago
- efflux-desktop-ui☆315Updated 5 months ago
- Raman spectra correction and digitalization☆81Updated 3 months ago
- vue3+pinia+vue-router+elementPlus+vite7☆159Updated last month
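- [Stock API, Forex API, Futures API, Cryptocurrency API] Infoway API is a high-performance market-data API product, designed for quantitative trading, stock analysis platforms, and exchanges.☆255Updated 4 months ago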
- Official implementation repository of Holistic Data Schedule☆199Updated last week
- next easy report☆471Updated 2 weeks ago
- ☆399Updated 2 months ago
- High-Performance Perpetual Futures Exchange Matching Engine☆380Updated 2 weeks ago
- Minimalist ML framework for Go.