mustafaaljadery / gemma-2B-10M
Gemma 2B with 10M context length using Infini-attention.
☆959 · Updated 8 months ago
Alternatives and similar repositories for gemma-2B-10M:
Users who are interested in gemma-2B-10M are comparing it to the libraries listed below.
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist… ☆1,627 · Updated 8 months ago
- Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… ☆1,210 · Updated last month
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,388 · Updated last month
- Reaching LLaMA2 Performance with 0.1M Dollars ☆967 · Updated 6 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆1,565 · Updated last week
- YaFSDP: Yet another Fully Sharded Data Parallel ☆878 · Updated this week
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 … ☆829 · Updated 6 months ago
- A series of math-specific large language models of our Qwen2 series. ☆729 · Updated 2 weeks ago
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆687 · Updated 5 months ago
- ☆665 · Updated this week
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ☆1,277 · Updated 7 months ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,259 · Updated 9 months ago
- Automate the analysis of GitHub repositories for LLMs with RepoToTextForLLMs. Fetch READMEs, structure, and non-binary files efficiently.… ☆715 · Updated 8 months ago
- Mora: More like Sora for Generalist Video Generation ☆1,543 · Updated 3 months ago
- Agent S: an open agentic framework that uses computers like a human ☆771 · Updated last week
- ☆446 · Updated 9 months ago
- Training LLMs with QLoRA + FSDP ☆1,442 · Updated 2 months ago
- Library for industrial alignment. ☆370 · Updated this week
- LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. … ☆791 · Updated last month
- ☆920 · Updated this week
- A Gradio demo of MGIE ☆346 · Updated 11 months ago
- Port of OpenAI's Whisper model in C/C++ with xtts and wav2lip ☆800 · Updated 6 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆833 · Updated last week
- Code for Quiet-STaR ☆706 · Updated 5 months ago
- GRadient-INformed MoE ☆261 · Updated 4 months ago
- ☆783 · Updated 4 months ago
- Codebase for Aria - an Open Multimodal Native MoE ☆978 · Updated last week
- An innovative open-source Code Interpreter with (GPT,Gemini,Claude,LLaMa) models. ☆245 · Updated 2 weeks ago
- llama3.np is a pure NumPy implementation for Llama 3 model. ☆972 · Updated 7 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,913 · Updated 6 months ago