RichardKelley / hflm
A simple library for working with Hugging Face models.
☆15 Updated 2 months ago
Related projects
Alternatives and complementary repositories for hflm
- entropix style sampling + GUI ☆25 Updated 3 weeks ago
- RWKV-7: Surpassing GPT ☆45 Updated this week
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files. ☆45 Updated 3 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆12 Updated 2 months ago
- One Line To Build Zero-Data Classifiers in Minutes ☆33 Updated last month
- ☆31 Updated 10 months ago
- Modified Beam Search with periodical restart ☆12 Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆38 Updated 5 months ago
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 7 months ago
- run ollama & gguf easily with a single command ☆48 Updated 6 months ago
- implementation of https://arxiv.org/pdf/2312.09299 ☆19 Updated 4 months ago
- 5X faster 60% less memory QLoRA finetuning ☆21 Updated 5 months ago
- ☆53 Updated 5 months ago
- ☆49 Updated 8 months ago
- Experimental sampler to make LLMs more creative ☆30 Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆44 Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆49 Updated this week
- ☆33 Updated 6 months ago
- ☆21 Updated 5 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆22 Updated this week
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated 8 months ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆22 Updated last month
- utilities for loading and running text embeddings with onnx ☆39 Updated 3 months ago
- Karpathy's llama2.c transpiled to MLX for Apple Silicon ☆15 Updated 10 months ago
- Simple LLM inference server ☆18 Updated 5 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆113 Updated 3 weeks ago
- Fast approximate inference on a single GPU with sparsity-aware offloading ☆38 Updated 10 months ago
- Latent Large Language Models ☆16 Updated 2 months ago
- ☆28 Updated this week
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆20 Updated 9 months ago