chi2liu / mamba-gpt-3b
It is among the best open-source 3B models currently available, surpassing Dolly v2-3b and open-llama-3b, and even outperforming the EleutherAI/pythia-12b model. See the open_llm_leaderboard for reference.
☆14Updated 2 years ago
Alternatives and similar repositories for mamba-gpt-3b
Users who are interested in mamba-gpt-3b are comparing it to the repositories listed below.
- inference code for mixtral-8x7b-32kseqlen☆104Updated 2 years ago
- Merge Transformers language models by use of gradient parameters.☆209Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub☆161Updated 2 years ago
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts☆109Updated 2 years ago
- ☆415Updated 2 years ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs☆110Updated last year
- A bagel, with everything.☆325Updated last year
- ☆31Updated last year
- An AI agent for interacting with a computer using the graphical user interface☆78Updated 2 years ago
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning☆662Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- ☆198Updated last year
- a tiny, exploitable chatbot that can use tools☆32Updated 2 years ago
- Weekly visualization report of Open LLM model performance based on 4 metrics.☆86Updated 2 years ago
- 1.58-bit LLaMa model☆83Updated last year
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast☆150Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs☆202Updated last year
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆279Updated 2 years ago
- This is our own implementation of 'Layer Selective Rank Reduction'☆240Updated last year
- Implementation of Adept's Fuyu all-new Multi-Modality model in PyTorch☆24Updated last year
- ☆74Updated 2 years ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.☆83Updated 2 years ago
- Embed arbitrary modalities (images, audio, documents, etc) into large language models.☆188Updated last year
- batched loras☆347Updated 2 years ago
- ☆78Updated last year
- The World's First AI-Enabled Multi-Modality Native Search Engine☆25Updated last year
- ☆95Updated 2 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs☆79Updated last year
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu…☆54Updated 2 years ago
- This repo is for handling Question Answering, especially for Multi-hop Question Answering☆67Updated last year