poloclub / transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
☆5,810Updated 2 weeks ago
Alternatives and similar repositories for transformer-explainer
Users who are interested in transformer-explainer are comparing it to the libraries listed below.
- 3D Visualization of a GPT-style LLM☆5,114Updated last year
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆12,220Updated last month
- ☆5,481Updated 9 months ago
- llama3 implementation one matrix multiplication at a time☆15,182Updated last year
- Task-Aware Agent-driven Prompt Optimization Framework☆3,668Updated 3 weeks ago
- Simple, unified interface to multiple Generative AI providers☆12,710Updated this week
- PyTorch native post-training library☆5,576Updated this week
- s1: Simple test-time scaling☆6,590Updated 4 months ago
- A Next-Generation Training Engine Built for Ultra-Large MoE Models☆4,963Updated last week
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.☆20,578Updated 7 months ago
- The simplest, fastest repository for training/finetuning small-sized VLMs.☆4,229Updated last week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.☆25,257Updated 3 weeks ago
- Run PyTorch LLMs locally on servers, desktop and mobile☆3,617Updated last month
- Vision agent☆5,093Updated 2 months ago
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)☆6,665Updated 4 months ago
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.☆9,420Updated last month
- Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!☆8,584Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆29,014Updated this week
- Composable building blocks to build Llama Apps☆8,146Updated this week
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆7,988Updated 8 months ago
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆15,819Updated 2 weeks ago
- This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov☆1,968Updated 5 months ago
- Build custom inference engines for models, agents, multi-modal systems, RAG, pipelines and more.☆3,681Updated this week
- Utilities intended for use with Llama models.☆7,329Updated last month
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning☆3,603Updated 3 months ago
- The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices☆4,351Updated 8 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.☆48,036Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.☆27,579Updated last month
- An open-source RAG-based tool for chatting with your documents.☆24,597Updated 4 months ago
- ☆5,581Updated last year