mustafaaljadery / gemma-2B-10M
Gemma 2B with 10M context length using Infini-attention.
☆933Updated 4 months ago
Related projects:
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p…☆1,116Updated last week
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist…☆1,555Updated 4 months ago
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments☆1,122Updated 3 weeks ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆957Updated last month
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 …☆792Updated 2 months ago
- A series of math-specific large language models of our Qwen2 series.☆439Updated last month
- Llama-3 agents that can browse the web by following instructions and talking to you☆1,318Updated 2 months ago
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.☆679Updated 3 weeks ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones☆1,235Updated 5 months ago
- YaFSDP: Yet another Fully Sharded Data Parallel☆821Updated 2 weeks ago
- Automate the analysis of GitHub repositories for LLMs with RepoToTextForLLMs. Fetch READMEs, structure, and non-binary files efficiently.…☆613Updated 3 months ago
- The first open source Large Action Model generalist Artificial Narrow Intelligence that controls completely human user interfaces by only…☆1,267Updated 3 months ago
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs☆1,025Updated 2 weeks ago
- Port of OpenAI's Whisper model in C/C++ with xtts and wav2lip☆744Updated last month
- Training LLMs with QLoRA + FSDP☆1,385Updated this week
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI☆1,309Updated 5 months ago
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence☆1,952Updated 2 months ago
- Code for Quiet-STaR☆478Updated 3 weeks ago
- proof of concept prototype for generating and querying against an ever-expanding knowledge graph with ai☆839Updated 5 months ago
- DeepSeek-VL: Towards Real-World Vision-Language Understanding☆2,013Updated 4 months ago
- Large-scale LLM inference engine☆934Updated this week
- A native PyTorch Library for large model training☆1,727Updated this week
- A realtime live transcription and translation app built with Huggingface Transformer.js and Supabase Realtime.☆335Updated this week
- The official implementation of Self-Play Fine-Tuning (SPIN)☆958Updated 4 months ago