chi2liu / mamba-gpt-3b
One of the strongest open-source 3B models currently available: it surpasses Dolly v2-3b and OpenLLaMA 3B, and even outperforms the much larger EleutherAI/pythia-12b model. See the open_llm_leaderboard for details.
☆13 Updated last year
Alternatives and similar repositories for mamba-gpt-3b:
Users who are interested in mamba-gpt-3b compare it to the repositories listed below
- ☆52 Updated 11 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆28 Updated 3 weeks ago
- Latent Large Language Models ☆17 Updated 7 months ago
- Simple Implementation of a Transformer in the new framework MLX by Apple ☆20 Updated 4 months ago
- LLM reads a paper and produces a working prototype ☆51 Updated 2 weeks ago
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 11 months ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆58 Updated 2 months ago
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆16 Updated this week
- ☆60 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆35 Updated 11 months ago
- ☆84 Updated last year
- GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more! ☆21 Updated 2 years ago
- A simple package for leveraging Falcon 180B and the HF ecosystem's tools, including training/inference scripts, safetensors, integrations… ☆13 Updated last year
- Fine-tuning Mistral 7B using Hugging Face, Weights and Biases, Choline, and Vast AI ☆37 Updated last year
- ☆38 Updated last year
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks ☆16 Updated 4 months ago
- A pipeline that uses API calls to convert unstructured data into structured training data in a provider-agnostic way ☆30 Updated 6 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 10 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆36 Updated last year
- LLMs as Collaboratively Edited Knowledge Bases ☆45 Updated last year
- Very minimal (and stateless) agent framework ☆41 Updated 2 months ago
- Generate high-quality textual or multimodal datasets with agents ☆18 Updated last year
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 Updated 5 months ago
- ☆33 Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- Simple GRPO scripts and configurations. ☆59 Updated last month
- ☆48 Updated 4 months ago
- ☆22 Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 10 months ago