A demonstration of how to train a custom tokenizer similar to TikToken.
☆15Jan 6, 2025Updated last year
Alternatives and similar repositories for train_tokenizer
Users that are interested in train_tokenizer are comparing it to the libraries listed below.
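- “小谢记账本” (Xiao Xie's Ledger) is a basic personal bookkeeping system with account registration and login. It can record bills, delete a bill, query bills of a specific type, query bills by day, month, or year, and generate charts from the bill data.☆15Dec 19, 2024Updated last year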
- babyLM WhisBERT code☆19May 27, 2024Updated last year
- Train your own Chinese embedding model☆29Jan 6, 2025Updated last year
- This is the repository for the source code of the paper "Structure-Aware Single-Source Generalization with Pixel-Level Disentanglement fo…☆19Dec 22, 2024Updated last year
- A mind map of Java knowledge points, compiled with reference to "Thinking in Java"☆11Jun 28, 2020Updated 5 years ago
- Code implementation for the paper titled MusicLIME: Explainable Multimodal Music Understanding☆24Jan 27, 2025Updated last year
- ☆18Feb 20, 2024Updated 2 years ago
- Joint optic disc and optic cup segmentation based on boundary prior and adversarial learning☆15Jan 18, 2023Updated 3 years ago
- Shopery - an organic grocery ecommerce html template☆12Aug 24, 2023Updated 2 years ago
- Official repository of the paper: Masked Scene Modeling (CVPR 2025)☆17Dec 13, 2025Updated 3 months ago
- Real-time observability system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building cap…☆26Jan 12, 2026Updated 2 months ago
- ☆42Dec 18, 2025Updated 3 months ago
- published by Packt☆21Jan 19, 2026Updated 2 months ago
- Glaucoma Detection based on Optic Cup and Disc Segmentation using U-Net☆11May 1, 2023Updated 2 years ago
- AI assistant built with Streamlit, NVIDIA NIM (LLaMa 3.3:70B) / Ollama, and Model Context Protocol (MCP).☆43Feb 9, 2025Updated last year
- gRPC login demo, written in modern C++ and built with Bazel☆11Oct 3, 2020Updated 5 years ago
- Original source code for Alex Libby's Practical Next.js for E-Commerce☆17Sep 12, 2023Updated 2 years ago
- Code for the 2025 ACL publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs"☆32Jun 25, 2025Updated 9 months ago
- Java S2I Builder image☆15Aug 12, 2025Updated 7 months ago
- Train LoRA using Microsoft's official implementation with Stable Diffusion models.☆33May 9, 2023Updated 2 years ago
- Swiss Ephemeris binding for react-native☆15Jan 19, 2026Updated 2 months ago
- AgentForge is a powerful and flexible signal-driven workflow framework designed for building intelligent, dynamic, and adaptive systems.☆19Feb 5, 2026Updated last month
- 🎯 Enterprise-grade AI assistant rule system (Chinese edition), built for Chinese developers, with one-click installation and configuration for mainstream AI tools such as Augment, Cursor, Claude Code, and Trae AI☆27Aug 1, 2025Updated 7 months ago
- The RouteBoxer class generates a set of LatLngBounds objects that are guaranteed to cover every point within a specified distance of a pa…☆19Nov 11, 2015Updated 10 years ago
- coze api to openai☆15Sep 1, 2024Updated last year
- Solutions of the problems NER and RE in the domain of business documents with the BERT+CRF model.☆15Jan 16, 2023Updated 3 years ago
- Flash Sculptor: Modular 3D Worlds from Objects☆33Apr 13, 2025Updated 11 months ago
- Car Rental System using the Django framework☆19Jul 15, 2023Updated 2 years ago
- Batch processor to enable large content be digested by Ollama, focused around book processing and translations by default, fully, configu…☆36Oct 27, 2025Updated 5 months ago
- Python code & Cloudflare worker for Mistral-OCR☆12Mar 8, 2025Updated last year
- This is the HAUE-CS-WIKI (Henan University of Engineering Computer Science Study Guide) project, which collects and organizes some computer-related course guides that I have personally stud…☆38Mar 4, 2025Updated last year
- A practical utility library for LangChain and LangGraph development☆104Mar 4, 2026Updated 3 weeks ago
- ☆30Nov 18, 2025Updated 4 months ago
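- Second-place solution for the Global AI Offense and Defense Challenge, Track 1: Security Vaccine Injection for Large-Model Image Generation☆27Nov 7, 2024Updated last year
- Text2Neo4j is a tool that traverses documents, extracts relations from text, and saves them to a Neo4j database to form a knowledge graph. The project combines Dify and LLaMA3.1 (8B model) to process and extract complex relations efficiently.☆23Aug 31, 2024Updated last year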
- A simple e-commerce demo for WeChat mini programs☆24Dec 11, 2022Updated 3 years ago
- [AAAI 24] Official Codebase for BridgeQA: Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA☆27Jul 12, 2024Updated last year
- An abstraction library for building domain-specific intelligent agents based on Large Language Models (LLMs). LLMAgent provides a core ar…☆27Feb 5, 2026Updated last month
- A simple but useful chatbot based on gpt-3.5-turbo and whisper-1.☆14Mar 6, 2023Updated 3 years ago