chonkie-ai / autotiktokenizer
🧰 The AutoTokenizer that TikToken always needed -- Load any tokenizer with TikToken now! ✨
☆33, updated 2 weeks ago
Alternatives and similar repositories for autotiktokenizer:
Users who are interested in autotiktokenizer are comparing it to the libraries listed below:
- ☆58, updated last week
- ☆24, updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ (☆63, updated 2 months ago)
- WangChanGLM - The Multilingual Instruction-Following Model (☆94, updated last year)
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free (☆224, updated 2 months ago)
- Set of scripts to finetune LLMs (☆36, updated 9 months ago)
- WangchanX Fine-tuning Pipeline (☆44, updated 3 months ago)
- End-to-End LLM Guide (☆100, updated 6 months ago)
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. (☆138, updated 5 months ago)
- ☆12, updated 3 months ago
- Google TPU optimizations for transformers models (☆88, updated this week)
- ☆117, updated 2 months ago
- Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow (☆57, updated last year)
- ☆96, updated 4 months ago
- Tools to make language models a bit easier to use (☆33, updated this week)
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya (☆100, updated 2 weeks ago)
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models (☆99, updated last month)
- A template to kick-start your Python project ✨ (☆51, updated 3 weeks ago)
- Code for training & evaluating Contextual Document Embedding models (☆166, updated last week)
- Repository containing awesome resources regarding Hugging Face tooling. (☆46, updated last year)
- ☆119, updated last month
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… (☆29, updated 4 months ago)
- Explore the use of DSPy for extracting features from PDFs (☆38, updated 10 months ago)
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. (☆34, updated last month)
- A Lightweight Library for AI Observability (☆230, updated this week)
- Manage scalable open LLM inference endpoints in Slurm clusters (☆248, updated 6 months ago)
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… (☆48, updated 6 months ago)
- Collection of autoregressive model implementations (☆76, updated 2 weeks ago)
- Experiments with inference on llama (☆104, updated 7 months ago)
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs (☆37, updated 2 months ago)