waylandzhang / train_tokenizer
A demonstration of how to train a custom tokenizer similar to TikToken.
☆16 Jan 6, 2025 Updated last year
Alternatives and similar repositories for train_tokenizer
Users who are interested in train_tokenizer are comparing it to the repositories listed below.
- Train your own Chinese embedding model ☆28 Jan 6, 2025 Updated last year
- This is the repository for the source code of the paper "Structure-Aware Single-Source Generalization with Pixel-Level Disentanglement fo… ☆19 Dec 22, 2024 Updated last year
- ☆24 Nov 21, 2025 Updated 2 months ago
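- "Xiao Xie's Ledger" (小谢记账本) is a basic personal bookkeeping system with account registration and login. It can record bills, delete a bill, query bills of a specific type, query bills for a given day, month, or year, and generate charts from the bill data. ☆15 Dec 19, 2024 Updated last year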
- Mind maps of Java knowledge points, organized around the book Thinking in Java (《Java编程思想》) ☆11 Jun 28, 2020 Updated 5 years ago
- Easy Dataset Docs ☆13 Jan 21, 2026 Updated 3 weeks ago
- ☆11 Apr 26, 2025 Updated 9 months ago
- [Personal use] Review notes for the 2024 computer science postgraduate entrance exam ☆11 Oct 18, 2023 Updated 2 years ago
- Learning Rust together ☆12 Jan 19, 2026 Updated 3 weeks ago
- A collection of LLM code written by hand from scratch ☆19 Mar 25, 2025 Updated 10 months ago
- The all-in-one hacking toolbox for hardware penetration testing. ☆18 Jun 4, 2024 Updated last year
- Glaucoma Detection based on Optic Cup and Disc Segmentation using U-Net ☆11 May 1, 2023 Updated 2 years ago
- Simplest online-softmax notebook for explaining Flash Attention ☆16 Jan 27, 2026 Updated 2 weeks ago
- Shopery - an organic grocery ecommerce html template ☆12 Aug 24, 2023 Updated 2 years ago
- 💻 NUAA 2018 operating systems mini-assignment: a simulated memory allocation program (BF algorithm) ☆13 Jul 2, 2018 Updated 7 years ago
- ☆15 Jun 22, 2025 Updated 7 months ago
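- A few projects built with the opencv3 image processing library, including face position detection, face effects, and adding a logo above the head ☆11 Oct 31, 2022 Updated 3 years ago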
- Real-time observability system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building cap… ☆26 Jan 12, 2026 Updated last month
- ML about cluster, regression, classification, and so on. As a playground. Just for fun. ☆10 Jun 11, 2022 Updated 3 years ago
- LLMTechSite, focused on the technology ecosystem of artificial general intelligence. ☆12 Jan 23, 2026 Updated 3 weeks ago
- My CodeForces Solutions. Acts as my crash-course into Python programming. 500+ Solutions ☆15 Jul 13, 2018 Updated 7 years ago
- A complete end-to-end system that takes mathematical problems and automatically generates polished educational videos ☆30 Jan 3, 2026 Updated last month
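- 🎯 Enterprise-grade AI assistant rule system (Chinese edition) - built for Chinese developers, with one-click installation and configuration for mainstream AI tools such as Augment, Cursor, Claude Code, and Trae AI ☆22 Aug 1, 2025 Updated 6 months ago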
- Joint optic disc and optic cup segmentation based on boundary prior and adversarial learning ☆15 Jan 18, 2023 Updated 3 years ago
- Chinese edition of hf-alignment-handbook: a complete set of SFT, DPO, ORPO, and CPT training tutorials for large models. ☆14 Aug 25, 2024 Updated last year
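- My first front-end project: a static-page imitation of Bilibili. Uses CSS for a simple danmaku drop-down box, a scrolling carousel, and icon animations ☆10 Jul 17, 2018 Updated 7 years ago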
- This repository showcases a collection of creative and fun images generated by Google's Nano Banana image model. These handpicked example… ☆36 Dec 4, 2025 Updated 2 months ago
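- [LLM interview notes] A guide to large-model internship interviews: hand-written code, interview experiences, thought questions, and more. A beginner still learning... corrections welcome ☆27 Nov 11, 2025 Updated 3 months ago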
- Car Rental System using Django FrameWork ☆19 Jul 15, 2023 Updated 2 years ago
- An introductory guide written for Mr. Tan (谭总) on moving into the computing field (quantitative trading) ☆17 Aug 19, 2020 Updated 5 years ago
- gRPC login demo, written in modern C++ and built with Bazel ☆11 Oct 3, 2020 Updated 5 years ago
- published by Packt ☆19 Jan 19, 2026 Updated 3 weeks ago
- Notes on math knowledge points for the postgraduate entrance exam ☆14 Oct 9, 2022 Updated 3 years ago
- A library for parsing images in Mojo ☆20 Apr 14, 2025 Updated 10 months ago
- ☆36 Dec 18, 2025 Updated last month
- Swiss Ephemeris binding for react-native ☆15 Jan 19, 2026 Updated 3 weeks ago
- Implementing AlphaGo Zero from scratch ☆17 Nov 10, 2024 Updated last year
- A simple but useful chatbot based on gpt-3.5-turbo and whisper-1. ☆14 Mar 6, 2023 Updated 2 years ago
- python/Mojo audio coding environment ☆44 Updated this week