secsilm / chinese-tokens-in-tiktoken
Chinese tokens in tiktoken tokenizers.
☆32Updated last year
Alternatives and similar repositories for chinese-tokens-in-tiktoken
Users who are interested in chinese-tokens-in-tiktoken are comparing it to the libraries listed below
- Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)☆51Updated 3 weeks ago
- A lightweight script for converting HTML pages to Markdown, with support for code blocks☆79Updated last year
- Evaluation for AI apps and agents☆42Updated last year
- kimi-chat test data☆7Updated last year
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆52Updated this week
- Token level visualization tools for large language models☆81Updated 5 months ago
- We are the first fully commercially usable role-play large language model.☆40Updated 10 months ago
- Supports BM25 + vector☆29Updated last month
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆39Updated 9 months ago
- Auto Thinking Mode switch for Qwen3 in Open webui☆65Updated last month
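- (Work in progress) This tutorial-oriented repository uses "adapting a model to Chinese" as a representative model-training problem to guide readers through hands-on LLM fine-tuning.☆34Updated 10 months ago
- The simplest reproduction of R1 results on a small model, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Experiments support the claim that, for strong reasoning ability, the "think" process content is the core of AGI/ASI.☆45Updated 4 months ago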
- Deep Reasoning Translation (DRT) Project☆224Updated 3 weeks ago
- Abacus (珠算) Code LLM☆55Updated 9 months ago
- Pretrain, decay, SFT a CodeLLM from scratch 🧙‍♂️☆36Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆39Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.☆69Updated 2 years ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆45Updated 2 months ago
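- A walkthrough of the official transformers source code. In the era of large AI models, PyTorch and transformer are the new operating system, and everything else is software running on top of them.☆17Updated last year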
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆28Updated 2 months ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆20Updated last month
- SUS-Chat: Instruction tuning done right☆48Updated last year
- Fine-Tune LLM Synthetic-Data application and "From Data to AGI: Unlocking the Secrets of Large Language Model"☆17Updated 11 months ago
- An Open Math Pre-training Dataset with 370B Tokens.☆89Updated 2 months ago
- Official Implementation of APB (ACL 2025 main)☆28Updated 4 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆133Updated last year
- Code execution sandbox (supports Open-R1) with support for multiple languages (Python/Java/C/Kotlin/Swift/OC/GO/...)☆20Updated 3 months ago