This is a fork of SGLang for hip-attention integration. Please refer to hip-attention for details.
☆18 · Dec 23, 2025 · Updated 2 months ago
Alternatives and similar repositories for sglang
Users that are interested in sglang are comparing it to the libraries listed below
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆150 · Nov 3, 2025 · Updated 3 months ago
- ☆16 · Nov 24, 2025 · Updated 3 months ago
- Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach. This repository includes the implementation of… ☆17 · Jun 1, 2024 · Updated last year
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 6 months ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 · Sep 12, 2024 · Updated last year
- ☆13 · Oct 5, 2025 · Updated 4 months ago
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech ☆28 · Sep 14, 2025 · Updated 5 months ago
- ☆12 · Jun 19, 2024 · Updated last year
- ☆12 · Jul 8, 2024 · Updated last year
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch ☆10 · Aug 7, 2024 · Updated last year
- Reference implementation of Thin and Deep Gaussian Processes (NeurIPS 2023) ☆14 · Nov 25, 2024 · Updated last year
- my profile readme ☆14 · Updated this week
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆16 · Dec 10, 2024 · Updated last year
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆40 · Feb 29, 2024 · Updated 2 years ago
- Example code for the book 《GPT-4, ChatGPT, 라마인덱스, 랭체인을 활용한 인공지능 프로그래밍》 (AI Programming with GPT-4, ChatGPT, LlamaIndex, and LangChain) ☆10 · Jan 16, 2024 · Updated 2 years ago
- ☆14 · May 21, 2024 · Updated last year
- Unofficial implementation of "Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle" ☆13 · Jul 3, 2024 · Updated last year
- ☆16 · Nov 26, 2024 · Updated last year
- Python package for compressing floating-point PyTorch tensors ☆13 · Jul 22, 2024 · Updated last year
- Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024) ☆11 · Jun 20, 2025 · Updated 8 months ago
- A Benchmark for Multi-Stage Legal Case Documents Generation ☆15 · Feb 24, 2025 · Updated last year
- ☆11 · Dec 11, 2024 · Updated last year
- Predicting the Stock Market - Can we do it? ☆10 · Jul 24, 2021 · Updated 4 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆11 · Dec 13, 2023 · Updated 2 years ago
- ☆10 · Nov 21, 2023 · Updated 2 years ago
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f… ☆10 · May 30, 2023 · Updated 2 years ago
- The course work repo for UoSurrey EEEM071 (2023 Spring) ☆11 · May 9, 2023 · Updated 2 years ago
- A Java-based framework for combinatorial test input generation, fault characterization and automated test execution. ☆11 · Jan 22, 2024 · Updated 2 years ago
- implementing Weight Agnostic Neural Networks to Spiking Neural Networks ☆10 · Jan 26, 2021 · Updated 5 years ago
- Korean Abstract Meaning Representation (AMR) Corpus ☆10 · Feb 27, 2022 · Updated 4 years ago
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc... ☆13 · May 28, 2025 · Updated 9 months ago
- ☆15 · Aug 19, 2025 · Updated 6 months ago
- 1st Place Team Crane: @aswinkumar1999 @rathull @kyolebu ☆29 · Sep 8, 2025 · Updated 5 months ago
- 🤖 Implementation of Self Normalizing Networks (SNN) in PyTorch. ☆12 · Jun 19, 2017 · Updated 8 years ago
- ☆14 · Aug 7, 2024 · Updated last year
- Implementation of the ACL Findings paper "OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack" ☆10 · May 24, 2021 · Updated 4 years ago
- Learning to Skip the Middle Layers of Transformers ☆17 · Aug 7, 2025 · Updated 6 months ago
- Fine-tuning llama2 using the chain-of-thought approach ☆10 · Nov 18, 2023 · Updated 2 years ago
- ☆11 · Mar 3, 2024 · Updated last year