Explore visualization tools for understanding Transformer-based large language models (LLMs)
☆22 · Dec 1, 2024 · Updated last year
Alternatives and similar repositories for Awesome-Transformer-Visualization
Users who are interested in Awesome-Transformer-Visualization are comparing it to the libraries listed below.
- Fine-tuning GPT-2 to generate research paper abstracts ☆12 · Apr 28, 2021 · Updated 4 years ago
- ☆14 · May 7, 2024 · Updated last year
- A C project template with support for CMake and Unity test framework ☆11 · Jun 12, 2018 · Updated 7 years ago
- This is the GPT2 baseline for ProtoQA ☆12 · Jan 3, 2022 · Updated 4 years ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e… ☆11 · Dec 27, 2024 · Updated last year
- ☆14 · Jul 25, 2024 · Updated last year
- ☆14 · Apr 16, 2024 · Updated 2 years ago
- 🎓Automatically Update CV Papers Daily using Github Actions (Update Every 12th hours) ☆12 · Updated this week
- A minimal example of a formally verified parser using ocamllex and Menhir's Coq backend. ☆21 · Mar 19, 2015 · Updated 11 years ago
- Data Augmentation on Graphs: A Technical Survey ☆15 · Feb 12, 2023 · Updated 3 years ago
- This repository provides a 3D implementation of DINOv2 for self-supervised pretraining on volumetric (3D) medical images using Lightly, M… ☆53 · Updated this week
- ☆15 · Apr 13, 2026 · Updated last week
- A set of kernel-based (Un)conditional independence tests including SDCIT (Lee and Honavar, UAI 2017) ☆16 · Feb 6, 2020 · Updated 6 years ago
- An automated data pipeline scaling RL to pretraining levels ☆75 · Oct 11, 2025 · Updated 6 months ago
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆88 · Feb 10, 2026 · Updated 2 months ago
- ☆53 · Feb 10, 2025 · Updated last year
- This is an updated version of the MolecularTransformer of Schwaller et. al. ☆13 · Jan 17, 2022 · Updated 4 years ago
- [NAACL 2024] CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions ☆13 · May 7, 2024 · Updated last year
- For Certified Robustness to Text Adversarial Attacks by Randomized [MASK] ☆17 · Oct 8, 2024 · Updated last year
- ☆17 · Jan 31, 2025 · Updated last year
- ☆16 · Apr 30, 2025 · Updated 11 months ago
- ☆16 · May 31, 2024 · Updated last year
- R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning ☆38 · Feb 9, 2026 · Updated 2 months ago
- CIKM 2021: Pooling Architecture Search for Graph Classification ☆21 · Jul 19, 2022 · Updated 3 years ago
- ☆13 · Apr 14, 2026 · Updated last week
- ☆14 · Jul 24, 2023 · Updated 2 years ago
- ☆96 · Jun 5, 2024 · Updated last year
- Yet Another Introduction to Quantum Computing ☆14 · Oct 27, 2025 · Updated 5 months ago
- Code for replicating experiments from the paper, Preference Exploration for Efficient Bayesian Optimization with Multiple Outcomes, publi… ☆13 · Jun 22, 2023 · Updated 2 years ago
- I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment ☆17 · Jan 16, 2025 · Updated last year
- A 22.9 million carbon atom dataset ☆16 · Mar 7, 2023 · Updated 3 years ago
- Efficiently creating diverse multi-turn Text-to-SQL training samples in just 3 steps! 🚀 ☆14 · Jan 31, 2026 · Updated 2 months ago
- Selected Online Courses ☆14 · Jan 15, 2020 · Updated 6 years ago
- This repository contains the dataset and code for our ACL'23 publication: "MatSci-NLP: Evaluating Scientific Language Models on Materials… ☆17 · Nov 21, 2023 · Updated 2 years ago
- A Unix shell written in Java ☆16 · Aug 30, 2016 · Updated 9 years ago
- [IROS2023] Learning to Solve Tasks with Exploring Prior Behaviours ☆12 · Mar 3, 2024 · Updated 2 years ago
- ☆22 · Jun 10, 2025 · Updated 10 months ago
- Contains implementation of the DoubIL and ResiduIL algorithms from the ICML '22 paper Causal Imitation Learning under Temporally Correlat… ☆11 · Dec 9, 2022 · Updated 3 years ago
- Python library to compress LitGPT models for resource efficient inference. ☆16 · Updated this week