thomwolf / sesame-explorations
☆29Updated 5 months ago
Alternatives and similar repositories for sesame-explorations
Users who are interested in sesame-explorations are comparing it to the libraries listed below.
- ☆124Updated 10 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆105Updated 6 months ago
- ☆49Updated 7 months ago
- Open-source reproducible benchmarks from Argmax☆58Updated this week
- ☆135Updated last month
- ☆71Updated 2 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆116Updated last month
- Collection of autoregressive model implementations☆86Updated 5 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆149Updated 2 months ago
- An introduction to LLM Sampling☆79Updated 9 months ago
- smolLM with Entropix sampler on pytorch☆150Updated 10 months ago
- Code for ExploreTom☆86Updated 3 months ago
- Hugging Face Jobs☆19Updated 2 months ago
- ☆157Updated 9 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆146Updated 7 months ago
- Datamodels for hugging face tokenizers☆76Updated this week
- Using open source LLMs to build synthetic datasets for direct preference optimization☆66Updated last year
- Set of scripts to finetune LLMs☆38Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆87Updated last week
- A HuggingFace compatible Small Language Model trainer.☆76Updated 7 months ago
- DeMo: Decoupled Momentum Optimization☆191Updated 9 months ago
- A simple, hackable text-to-speech system in PyTorch and MLX☆174Updated last month
- Chunk your text more accurately using gpt4o-mini☆44Updated last year
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆67Updated 2 months ago
- Video+code lecture on building nanoGPT from scratch☆69Updated last year
- ☆135Updated last year
- ☆209Updated last week
- code for training & evaluating Contextual Document Embedding models☆197Updated 4 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆63Updated 3 weeks ago
- ☆150Updated last year