A graph visualization of attention
☆56May 20, 2025Updated last year
Alternatives and similar repositories for attention-graph
Users that are interested in attention-graph are comparing it to the libraries listed below.
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- It's a baby compiler. (Lean btw.)☆16May 19, 2025Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆17Oct 9, 2024Updated last year
- ☆145Mar 31, 2026Updated 2 months ago
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 10 months ago
- An introduction to LLM Sampling☆80Dec 15, 2024Updated last year
- Simple Transformer in Jax☆144Jun 22, 2024Updated last year
- 🤖 Complete reproduction of 'AlphaGo Moment for Model Architecture Discovery' using MLX-LM instead of GPT-4. Autonomous neural architectu…☆29Jul 27, 2025Updated 10 months ago
- A text compressor based on the PAQ architecture.☆22Sep 12, 2025Updated 9 months ago
- ☆40Jul 26, 2024Updated last year
- look how they massacred my boy☆63Oct 16, 2024Updated last year
- ☆31Apr 24, 2026Updated last month
- An upscaler node for flow-matching models like Qwen, applying the DemoFusion approach☆60Jan 29, 2026Updated 4 months ago
- This is sample code for Paho MQTT server with Python 2.7☆10Mar 29, 2016Updated 10 years ago
- ☆54Apr 13, 2025Updated last year
- Code release for the paper "Style Vectors for Steering Generative Large Language Models", accepted to the Findings of the EACL 2024.☆36Sep 26, 2024Updated last year
- It's automation magic - a headless uma bot☆67May 30, 2026Updated 2 weeks ago
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- trio async MQTT client that wraps paho-mqtt☆12Feb 8, 2021Updated 5 years ago
- ☆20Mar 4, 2025Updated last year
- An AI character interaction system with emotional modeling and advanced memory management☆17Oct 26, 2024Updated last year
- A prompt management, versioning, testing, and evaluation inference server and UI toolkit. Provider agnostic and OpenAI API compatible.☆119Jun 26, 2025Updated 11 months ago
- Collection of resources for RL and Reasoning☆27Feb 3, 2025Updated last year
- ☆16Dec 29, 2024Updated last year
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting☆35Mar 19, 2024Updated 2 years ago
- template repo for my web library projects☆28Jan 6, 2025Updated last year
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆27Jul 1, 2025Updated 11 months ago
- working implementation of deepseek MLA☆44Jan 8, 2025Updated last year
- NeurIPS 2026 paper: The Geometry of Consolidation — follow-up to HIDE and No-Escape.☆110May 5, 2026Updated last month
- Git Repo for managing the ontological logger☆12Dec 27, 2020Updated 5 years ago
- MoE training for Me and You and maybe other people☆386Mar 15, 2026Updated 2 months ago
- ☆10Oct 24, 2024Updated last year
- DiscoDB is a NoSQL Document Database that promises to provide infinite storage at zero cost☆21Jun 5, 2023Updated 3 years ago
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat…☆68Sep 30, 2024Updated last year
- Deepseek-CoT☆10Oct 6, 2024Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆61Feb 21, 2022Updated 4 years ago
- 📰 Computing the information content of trained neural networks☆23Oct 8, 2021Updated 4 years ago
- ☆10Apr 10, 2023Updated 3 years ago
- Use miniGPT-4 batch to generate captions for a lot of images! You should be able to create the best captions you always wanted!☆18Jul 20, 2023Updated 2 years ago