chi2liu / mamba-gpt-3b
It is close to the best 3B model currently available as open source, surpassing Dolly v2-3b and OpenLLaMA-3B, and even outperforming the much larger EleutherAI/pythia-12b model. See the open_llm_leaderboard for details.
☆13 Updated 2 years ago
Alternatives and similar repositories for mamba-gpt-3b
Users that are interested in mamba-gpt-3b are comparing it to the libraries listed below
- inference code for mixtral-8x7b-32kseqlen ☆101 Updated last year
- A bagel, with everything. ☆324 Updated last year
- ☆416 Updated last year
- ☆199 Updated last year
- Develop, evaluate and monitor LLM applications at scale ☆100 Updated 9 months ago
- ☆161 Updated last month
- Continuously learning web-browsing AI agent that extends the Voyager architecture. ☆40 Updated 3 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆152 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 Updated last year
- 🎸 Integrating AI plugins to LLMs ☆230 Updated last year
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆277 Updated last year
- Democratizing access to LLMs for the open-source community. Let's advance AI, together. ☆28 Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆425 Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 Updated 10 months ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆100 Updated 2 years ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 Updated last year
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆711 Updated last year
- Long context evaluation for large language models ☆220 Updated 6 months ago
- An AI agent for interacting with a computer using the graphical user interface ☆78 Updated last year
- 💬 Chatbot web app + HTTP and Websocket endpoints for LLM inference with the Petals client ☆315 Updated last year
- batched loras ☆345 Updated 2 years ago
- Co-Coder is a Python package that streamlines error debugging from Open AI chat GPT and Google Bard by providing hints, example code, and… ☆46 Updated 2 years ago
- Generate textbook-quality synthetic LLM pretraining data ☆505 Updated last year
- Embed arbitrary modalities (images, audio, documents, etc) into large language models. ☆187 Updated last year
- Weekly visualization report of Open LLM model performance based on 4 metrics. ☆87 Updated last year
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆661 Updated last year
- Inference code for Mistral and Mixtral hacked up into original Llama implementation ☆371 Updated last year
- ☆77 Updated last year
- Merge Transformers language models by use of gradient parameters. ☆208 Updated last year