RQLuo / MixTeX-DataHub
LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotations. It allows users to upload, download, and contribute to a growing collection of high-quality LaTeX datasets.
☆11Updated 7 months ago
Alternatives and similar repositories for MixTeX-DataHub:
Users that are interested in MixTeX-DataHub are comparing it to the libraries listed below
- ☆56Updated last year
- Large-scale training of a LaTeX formula recognition model; currently being organized for open-source release☆53Updated 11 months ago
- Chinese tokens in tiktoken tokenizers.☆31Updated 10 months ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆49Updated this week
- ☆29Updated 7 months ago
- Exploration of World Languages☆20Updated last year
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models☆146Updated 6 months ago
- Another LaTeX formula OCR tool☆16Updated 2 years ago
- A GUI implementation of MixTeX in Rust☆29Updated last month
- ☆14Updated 6 months ago
- GUI for an offline LaTeX OCR tool supporting three models: Pix2Text, nougat, and texify☆13Updated last year
- Datasets and Evaluation Scripts for CompHRDoc☆36Updated last month
- This repo is used to release the ArxivFormula dataset.☆24Updated 4 months ago
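- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step☆121Updated last month
- Abacus, a large language model for code (Abacus Code LLM)☆55Updated 6 months ago
- Fast pdf translate is a PDF translation tool. It uses MinerU to convert the PDF to Markdown, splits the Markdown into chunks, sends them to a large language model for translation, then assembles the translated results and generates the output PDF with pypandoc.☆12Updated last week
- 🐦⬛Xiaoya course grabber: course-watching and course-registration software for Beijing Normal University (BNU), with a graphical interface, easy to use, multithreaded, and cross-platform😼! Please read the usage instructions below in full before use!☆36Updated 3 months ago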
- [AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents☆69Updated 2 weeks ago
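- The simplest reproduction of R1 results on a small model, explaining the most important essence of O1-like models and DeepSeek R1. Think is all your need. Backed by experiments: for strong reasoning ability, the content of the "think" process is the core of AGI/ASI.☆41Updated last month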
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆25Updated 3 weeks ago
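- BNU CERNET CLI is a command-line client designed for Beijing Normal University campus network users. After the campus network service was upgraded on July 1, 2023, the original command-line client stopped working. To solve this, we developed this new client, which lets users conveniently log in to the campus network and access internet resources from the command line.☆13Updated last year
- Research on accelerating GOT-OCR for real-world deployment, not limited to any language☆59Updated 5 months ago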
- A light proxy solution for HuggingFace hub.☆46Updated last year
- A repo for the Formula Recognition Model (im2latex) based on Vision Encoder Decoder Model☆16Updated 7 months ago
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition☆27Updated last year
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆36Updated 6 months ago
- PDF data of Chinese papers, securities documents, and financial reports☆25Updated 9 months ago
- [ECCV2024] PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer☆69Updated 6 months ago
- Using llama.cpp and onnxruntime to accelerate inference of GOT-OCR2.0☆14Updated 3 weeks ago
- 😜 A visual dataset of memes, annotated using the image-understanding capabilities of glm-4v and step-1v.☆118Updated 11 months ago