yeahjack / chatgpt_zulip_bot
A Zulip bot that responds to users via ChatGPT.
☆16 · Updated last year
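A bot like this is essentially a bridge between the Zulip messaging API and the OpenAI chat API. The sketch below shows the general shape, assuming the `zulip` and `openai` Python packages, a `~/.zuliprc` file with the bot's credentials, and an `OPENAI_API_KEY` environment variable; it is illustrative, not the repository's actual implementation.

```python
# Minimal Zulip + ChatGPT bot sketch (assumptions: `zulip` and `openai`
# packages installed, ~/.zuliprc present, OPENAI_API_KEY set in the env).
# Not the actual code of chatgpt_zulip_bot.
import zulip
from openai import OpenAI

zulip_client = zulip.Client(config_file="~/.zuliprc")
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment


def handle_message(message: dict) -> None:
    # Ignore the bot's own messages to avoid reply loops.
    if message["sender_email"] == zulip_client.email:
        return

    # Ask ChatGPT for a reply to the incoming message text.
    completion = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": message["content"]}],
    )
    reply = completion.choices[0].message.content

    # Answer in place: same stream and topic for stream messages,
    # directly back to the sender for private messages.
    if message["type"] == "stream":
        zulip_client.send_message({
            "type": "stream",
            "to": message["display_recipient"],
            "topic": message["subject"],
            "content": reply,
        })
    else:
        zulip_client.send_message({
            "type": "private",
            "to": [message["sender_email"]],
            "content": reply,
        })


# Blocks and invokes handle_message for every message the bot can see.
zulip_client.call_on_each_message(handle_message)
```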
Alternatives and similar repositories for chatgpt_zulip_bot:
Users who are interested in chatgpt_zulip_bot are comparing it to the libraries listed below.
- The Quartz Quantum Compiler ☆79 · Updated this week
- ☆27 · Updated last year
- Estimate MFU for DeepSeekV3 ☆14 · Updated 2 weeks ago
- PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference ☆29 · Updated 7 months ago
- ☆36 · Updated 4 months ago
- Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees** ☆27 · Updated last year
- ☆12 · Updated 5 months ago
- A happy way for research! ☆24 · Updated last year
- Code release for AdapMoE, accepted by ICCAD 2024 ☆10 · Updated 2 months ago
- ☆59 · Updated 2 months ago
- ☆13 · Updated 8 months ago
- ☆16 · Updated 3 weeks ago
- [NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models ☆33 · Updated last week
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆37 · Updated 5 months ago
- SQUEEZED ATTENTION: Accelerating Long Prompt LLM Inference ☆36 · Updated 2 months ago
- ☆48 · Updated last year
- Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding" ☆114 · Updated 10 months ago
- ☆23 · Updated 2 years ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆51 · Updated last month
- ☆13 · Updated last month
- Course notes for Cyber Security (THUCST 2023 Spring) ☆26 · Updated last year
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models ☆18 · Updated 7 months ago
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆27 · Updated 3 months ago
- ☆14 · Updated last year
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs ☆21 · Updated last month
- This repo contains the source code for "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs" ☆32 · Updated 5 months ago
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs) ☆52 · Updated 7 months ago
- An implementation of Flash Attention using CuTe ☆65 · Updated last month